From: Marc Lavine (mlavine@foundrynet.com)
Date: 09/18/02
> I've also placed links to them from the sFlow.org documents page so that the
> latest proposals will always be easy to find.
Good idea.
> I added the following note to SFLOW-DATAGRAM.txt:
>
> Note: sFlow implementors are permitted to extend structures at the end
> without changing structure numbers. Any changes that would
> alter or invalidate fields in published structure definitions
> must be implemented using a new structure number. This
> policy allows additional data to be added to structures while
> still maintaining backward compatibility. Applications receiving
> sFlow data must always use the opaque length information when
> decoding opaque<> structures.
Here's a suggested rewording to make the first sentence clearer about who is
allowed to extend a structure, and to make the last sentence a little more
explanatory:
Note: An enterprise which has defined sFlow structures is
permitted to extend those structure definitions at the end
without changing structure numbers. Any changes that would
alter or invalidate fields in published structure
definitions must be implemented using a new structure
number. This policy allows additional data to be added to
structures while still maintaining backward compatibility.
Applications receiving sFlow data must always use the
opaque length information when decoding opaque<> structures
so that encountering extended structures will not cause
decoding errors. Note that these rules apply to the
standard structures as well.
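To illustrate why the last sentence matters, here is a minimal sketch of a receiver that always advances by the opaque length rather than by the size of the fields it knows about. The two-field structure, the names read_opaque and decode_known_fields, and the trailing marker word are all made up for the example; none of them come from the sFlow drafts.

```python
import struct

def read_opaque(buf, offset):
    """Return (payload, next_offset) for an XDR opaque<> starting at offset."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("opaque length runs past end of buffer")
    padded = (length + 3) & ~3  # XDR pads opaque data to a 4-byte boundary
    return buf[start:end], start + padded

def decode_known_fields(payload):
    """Decode only the two fields this (hypothetical) decoder knows about."""
    field_a, field_b = struct.unpack_from(">II", payload, 0)
    return {"field_a": field_a, "field_b": field_b}

# Hypothetical structure as first published: two unsigned ints.
original = struct.pack(">II", 10, 20)
# Same structure number, extended at the end with one extra field.
extended = struct.pack(">III", 10, 20, 30)

# Both structures wrapped as opaque<>, followed by a marker word.
datagram = b"".join(struct.pack(">I", len(p)) + p for p in (original, extended))
datagram += struct.pack(">I", 0xDEADBEEF)

first, off = read_opaque(datagram, 0)
second, off = read_opaque(datagram, off)
(marker,) = struct.unpack_from(">I", datagram, off)
```

Because the offset is advanced by the declared length, the decoder stays aligned on the marker word even though it never looks at the extra trailing field.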
Also, on a related issue, do you think there should be a statement that some
enumerated values may be extended by a revision of the standard, without
changing the datagram version number? By this, I'm thinking, most notably,
that adding a new header_protocol value, should the need arise, should not
require creating datagram version 6.
> How about this clarification in SFLOW-STRUCTS.txt?
>
> The following values should be used for fields that are
> unknown (unless otherwise indicated in the structure
> definitions).
> - Unknown integer value. Use a value of 0 to indicate that
> a value is unknown.
> - Unknown counter. Use the maximum counter value to indicate
> that the counter is not available. Within any given sFlow
> session a particular counter must be always available, or
> always unavailable. An available counter may temporarily
> have the max value just before it rolls to zero. This is
> permitted.
> - Unknown string. Use the zero length empty string.
Having an overall set of guidelines like this seems like a good idea. The
approach for dealing with unavailable counters seems reasonable.
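One consequence for receivers is that a single maximum reading cannot be taken as "unavailable", since an available counter may briefly sit at the maximum before rolling over. A rough sketch of one way a collector might handle this follows; the 32-bit width, the name counter_available, and the "any non-max reading in the session" rule are my assumptions, not text from the drafts.

```python
COUNTER32_MAX = 0xFFFFFFFF  # assumed 32-bit counter; a 64-bit one would use 2**64 - 1

def counter_available(observed, max_value=COUNTER32_MAX):
    """Treat a counter as available once any reading in the session differs
    from the maximum value. With no readings yet, this reports False."""
    return any(value != max_value for value in observed)

# An unsupported counter reports the maximum on every sample.
unsupported = [COUNTER32_MAX, COUNTER32_MAX, COUNTER32_MAX]
# A supported counter that happens to pass through the maximum before wrapping.
wrapping = [0xFFFFFFFE, COUNTER32_MAX, 0]

unsupported_ok = counter_available(unsupported)
wrapping_ok = counter_available(wrapping)
```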
> Added the following text to SFLOW-DATAGRAM.txt:
>
> unsigned int drops; /* Number of times a packet marked to be
>                        sampled was dropped due to
>                        lack of resources. A high drop rate
>                        indicates that the management agent
>                        is unable to process samples as fast
>                        as they are being generated by
>                        hardware. Increasing sampling_rate
>                        will reduce the drop rate. */
It would probably be good to state explicitly whether this is the total
number of drops since the agent was initialized (I presume) or since the
previous sample. Also, does this field qualify for the special counter
behavior described above if it is not supported?
> > In flow_sample, the input and output fields have special
> > values to represent
> > the case where the interface is "unknown". If packets originating or
> > terminating at the switch itself are sampled, then one of the
> > two interface
> > fields will not apply. I'm wondering if it might be good to have an
> > additional special "none" value to indicate this, rather than
> > using the
> > "unknown" value, which might wind up getting used for other
> > cases as well.
>
> How about this change. It creates a third category that lets us capture the
> reason that a packet was discarded. Can you think of other reason codes?
>
> unsigned int output; /* SNMP ifIndex of output interface.
>                         0 if interface is not known.
>                         The most significant 2 bits are used
>                         to indicate the format of the
>                         30 bit value.
>                           format = 0 single destination
>                                      interface, value is ifIndex
>                                      of the interface.
>                           format = 1 packet discarded, value is
>                                      a reason code. Currently
>                                      the following codes are
>                                      defined.
>                                        0 = unknown
>                                        1 = ACL
>                                        2 = no buffer space
>                                        3 = RED
>                                        4 = no route to dest.
>                           format = 2 multiple destination
>                                      interfaces, value is the
>                                      number of interfaces. A
>                                      value of 0 indicates an
>                                      unknown number greater
>                                      than 1.
>
>                         Examples:
>                           0x00000002 indicates ifIndex = 2
>                           0x00000000 ifIndex unknown.
>                           0x40000001 packet discarded because
>                                      of ACL.
>                           0x80000007 indicates a packet sent
>                                      to 7 interfaces.
>                           0x80000000 indicates a packet sent
>                                      to an unknown number of
>                                      interfaces greater than
>                                      1. */
>
>
> This additional information could be very useful for identifying
> connectivity/performance problems.
This doesn't quite address the original case I was asking about, which is
that there could be an output port with no input port, or an input port with
no output port, and neither of these implies that the packet was dropped, but
rather that it was sent to or from the device itself (rather than through
the device). Some examples would be a routing protocol packet, such as a
RIP or BGP packet, or a spanning tree protocol BPDU, or a telnet packet,
etc. To describe this, I suggest that additional codes are needed in both
the input and output fields to indicate "no interface" (which I think should
be distinct from "unknown").
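For reference, the encoding as quoted (2 format bits plus a 30 bit value) can be sketched as below. This only covers the three formats and five reason codes in the quoted proposal; the function name decode_output and the returned strings are made up for illustration, and no "no interface" code is included since none has been defined yet.

```python
FORMAT_SINGLE = 0     # value is the ifIndex of the output interface
FORMAT_DISCARDED = 1  # value is a discard reason code
FORMAT_MULTIPLE = 2   # value is the number of output interfaces

DISCARD_REASONS = {
    0: "unknown",
    1: "ACL",
    2: "no buffer space",
    3: "RED",
    4: "no route to dest.",
}

def decode_output(word):
    """Split a 32-bit output field into its 2-bit format and 30-bit value."""
    fmt = (word >> 30) & 0x3
    value = word & 0x3FFFFFFF
    if fmt == FORMAT_SINGLE:
        return "ifIndex unknown" if value == 0 else f"ifIndex = {value}"
    if fmt == FORMAT_DISCARDED:
        reason = DISCARD_REASONS.get(value, f"undefined reason code {value}")
        return f"discarded: {reason}"
    if fmt == FORMAT_MULTIPLE:
        if value == 0:
            return "sent to an unknown number of interfaces greater than 1"
        return f"sent to {value} interfaces"
    return f"undefined format {fmt}"

# The five examples from the quoted comment.
examples = [0x00000002, 0x00000000, 0x40000001, 0x80000007, 0x80000000]
decoded = [decode_output(word) for word in examples]
```

Format value 3 is left undefined in the quoted text, so the sketch reports it as such rather than guessing a meaning.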
I do think that adding the type of information that you've sketched out
above is a good thing, in that it could allow for more completely
characterizing what's happening to the traffic. Placing the discard codes
into the output field precludes capturing some of the information, however.
I know that with our switches, some features, such as ACLs, can be applied to
traffic on input or output. For a packet discarded due to an output ACL, I
can imagine that one might want to be able to know which interface it would
have gone out through had it not been discarded. If you think that this is
useful and want to allow for the possibility of capturing that information
in the protocol, then it would seem to me that one could add an additional
field, perhaps called "disposition", that could describe whether the packet
was processed normally, or whether it was discarded at input or at output,
and the reason. Presumably, most packets aren't discarded on output, so if
one wished to not add extra overhead to the structure for every sample,
another option would be to define an extended data structure to hold the
additional data (e.g. which output port would have been used (assuming that
the agent implementation can support this, of course)).
Regarding the specific discard codes, I'd suggest adding something like
"rate limiting". Also, would "no route to dest." include not being able to
get an ARP response from a destination node on a local subnet? If so, then
perhaps we could just call it "destination unreachable".
> Good suggestion. I implemented it in SFLOW-STRUCTS.txt as follows:
>
> struct extended_user {
>    unsigned int charset; /* MIBEnum value of character set used to
>                             encode user information - See RFC 2978
>                             Where possible UTF-8 encoding
>                             (MIBEnum=106) should be used. */
>    string src_user<>;    /* User ID associated with packet source */
>    string dst_user<>;    /* User ID associated with packet
>                             destination */
> }
So as to not preclude any particular agent implementation, I think it would
be good to have charsets for the source and destination users specified
independently. This would allow for the possibility of the information
about the two users coming from different sources. Also, for this
particular case, it might be worthwhile to explicitly state that zero should
be specified for a charset if it is not known.
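To make the suggestion concrete, here is a rough sketch of how such a structure might be encoded. The field order (src_charset, src_user, dst_charset, dst_user) and the names in the code are only my assumption of one possible layout, not an agreed definition; the only values taken from the text above are MIBEnum 106 for UTF-8 and zero for an unknown charset.

```python
import struct

CHARSET_UNKNOWN = 0  # suggested value when the charset is not known
CHARSET_UTF8 = 106   # MIBEnum for UTF-8

def xdr_string(data):
    """Encode bytes as an XDR string<>: 4-byte length, data, zero padding to 4 bytes."""
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

def encode_extended_user(src_charset, src_user, dst_charset, dst_user):
    """Hypothetical layout with one charset per user, each next to its string."""
    return (struct.pack(">I", src_charset) + xdr_string(src_user)
            + struct.pack(">I", dst_charset) + xdr_string(dst_user))

# Source user known and encoded as UTF-8; destination user unknown, so its
# charset is zero and its string is empty.
encoded = encode_extended_user(
    CHARSET_UTF8, "jo\u00e9".encode("utf-8"),
    CHARSET_UNKNOWN, b"",
)
```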
Regards,
Marc