Re: Re: sFlow Datagram Extensibility

From: Marc Lavine (mlavine@foundrynet.com)
Date: 09/18/02


    > I've also placed links to them from the sFlow.org documents page so that the
    > latest proposals will always be easy to find.

    Good idea.

    > I added the following note to SFLOW-DATAGRAM.txt:
    >
    > Note: sFlow implementors are permitted to extend structures at the end
    > without changing structure numbers. Any changes that would
    > alter or invalidate fields in published structure definitions
    > must be implemented using a new structure number. This
    > policy allows additional data to be added to structures while
    > still maintaining backward compatibility. Applications receiving
    > sFlow data must always use the opaque length information when
    > decoding opaque<> structures.

    Here's a suggested rewording to make the first sentence clearer about who is
    allowed to extend a structure, and to make the last sentence a little more
    explanatory:

         Note: An enterprise which has defined sFlow structures is
               permitted to extend those structure definitions at the end
               without changing structure numbers. Any changes that would
               alter or invalidate fields in published structure
               definitions must be implemented using a new structure
               number. This policy allows additional data to be added to
               structures while still maintaining backward compatibility.
               Applications receiving sFlow data must always use the
               opaque length information when decoding opaque<> structures
               so that encountering extended structures will not cause
               decoding errors. Note that these rules apply to the
               standard structures as well.
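
    As a minimal sketch of the decoding rule in the note above (not taken from
    SFLOW-DATAGRAM.txt), the receiver below reads the XDR length word of an
    opaque<> structure, decodes only the fields it knows, and skips whatever
    an extended definition appended. The two-field layout and the names
    read_opaque and decode_known_fields are assumptions made for illustration.

```python
import struct


def read_opaque(buf, offset):
    """Return (payload, next_offset) for an XDR opaque<> starting at offset."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("opaque length runs past the end of the buffer")
    # XDR pads variable-length opaque data to a multiple of 4 bytes.
    padded_end = start + ((length + 3) & ~3)
    return buf[start:end], padded_end


def decode_known_fields(payload):
    """Decode the two fields this hypothetical receiver knows about.

    Trailing bytes belong to a newer, extended definition and are ignored.
    """
    if len(payload) < 8:
        raise ValueError("structure is shorter than the published definition")
    return struct.unpack_from(">II", payload, 0)


# A sender using an extended definition: two original fields plus a new one,
# followed by unrelated data that must still be found at the right offset.
extended = struct.pack(">III", 1, 2, 99)
datagram = struct.pack(">I", len(extended)) + extended + struct.pack(">I", 0xDEADBEEF)

payload, next_offset = read_opaque(datagram, 0)
fields = decode_known_fields(payload)
(following,) = struct.unpack_from(">I", datagram, next_offset)
```

    Because the next read position comes from the length word rather than from
    the size of the fields the receiver understands, the extra field does not
    shift the decoding of the data that follows.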

    Also, on a related issue, do you think there should be a statement that some
    enumerated values may be extended by a revision of the standard, without
    changing the datagram version number? By this, I'm thinking, most notably,
    that adding a new header_protocol value, should the need arise, should not
    require creating datagram version 6.

    > How about this clarification in SFLOW-STRUCTS.txt?
    >
    > The following values should be used for fields that are
    > unknown (unless otherwise indicated in the structure
    > definitions).
    > - Unknown integer value. Use a value of 0 to indicate that
    > a value is unknown.
    > - Unknown counter. Use the maximum counter value to indicate
    > that the counter is not available. Within any given sFlow
    > session a particular counter must be always available, or
    > always unavailable. An available counter may temporarily
    > have the max value just before it rolls to zero. This is
    > permitted.
    > - Unknown string. Use the zero length empty string.

    Having an overall set of guidelines like this seems like a good idea. The
    approach for dealing with unavailable counters seems reasonable.
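
    A minimal receiver-side sketch of the guidelines quoted above, assuming
    32-bit counters. Since an available counter may legitimately show the
    maximum value just before it rolls over, a single reading cannot settle
    the question; the hypothetical CounterAvailability class treats a counter
    as available once any other value has been seen in the session. The helper
    names are illustrative and not part of SFLOW-STRUCTS.txt.

```python
MAX_COUNTER32 = 0xFFFFFFFF  # maximum value of a 32-bit counter


def known_integer(value):
    """Map the 'unknown integer' marker (0) to None."""
    return None if value == 0 else value


def known_string(value):
    """Map the 'unknown string' marker (empty string) to None."""
    return None if value == "" else value


class CounterAvailability:
    """Track, for one counter in one session, whether it looks available."""

    def __init__(self, max_value=MAX_COUNTER32):
        self.max_value = max_value
        self.seen_real_value = False

    def observe(self, value):
        """Return the reading, or None while the counter looks unavailable."""
        if value != self.max_value:
            self.seen_real_value = True
        return value if self.seen_real_value else None


unsupported = CounterAvailability()
supported = CounterAvailability()
readings_unsupported = [unsupported.observe(v) for v in (MAX_COUNTER32, MAX_COUNTER32)]
readings_supported = [supported.observe(v) for v in (MAX_COUNTER32 - 1, MAX_COUNTER32, 0)]
```

    One limitation of this heuristic: a supported counter whose very first
    reading happens to be the maximum value is reported as unavailable until
    the next sample arrives.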

    > Added the following text to SFLOW-DATAGRAM.txt:
    >
    > unsigned int drops; /* Number of times a packet marked to be
    > sampled was dropped due to
    > lack of resources. A high drop rate
    > indicates that the management agent
    > is unable to process samples as fast
    > as they are being generated by
    > hardware. Increasing sampling_rate
    > will reduce the drop rate. */

    It would probably be good to state explicitly whether this is the total
    number of drops since the agent was initialized (I presume) or since the
    previous sample. Also, does this field qualify for the special counter
    behavior described above if it is not supported?
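
    To illustrate why the distinction matters: if drops were defined as a
    running total since the agent was initialized (one of the two readings
    raised above, not something the quoted text states), a receiver would
    derive the per-interval figure itself. A minimal sketch, assuming a 32-bit
    counter that wraps to zero; the function name is hypothetical.

```python
def drops_since_previous(previous, current, modulus=2**32):
    """Drops between two cumulative readings, allowing for one wraparound."""
    return (current - previous) % modulus


steady = drops_since_previous(100, 130)
wrapped = drops_since_previous(0xFFFFFFF0, 0x00000010)
```

    If the field instead reported drops since the previous sample, no such
    arithmetic would be needed, but a lost datagram would lose its drop count.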

    > > In flow_sample, the input and output fields have special
    > > values to represent
    > > the case where the interface is "unknown". If packets originating or
    > > terminating at the switch itself are sampled, then one of the
    > > two interface
    > > fields will not apply. I'm wondering if it might be good to have an
    > > additional special "none" value to indicate this, rather than
    > > using the
    > > "unknown" value, which might wind up getting used for other
    > > cases as well.
    >
    > How about this change. It creates a third category that lets us capture the
    > reason that a packet was discarded. Can you think of other reason codes?
    >
    > unsigned int output; /* SNMP ifIndex of output interface.
    >                         0 if interface is not known.
    >                         The most significant 2 bits are used
    >                         to indicate the format of the
    >                         30 bit value.
    >                           format = 0  single destination interface,
    >                                       value is ifIndex of the
    >                                       interface.
    >                           format = 1  packet discarded, value is a
    >                                       reason code. Currently the
    >                                       following codes are defined.
    >                                         0 = unknown
    >                                         1 = ACL
    >                                         2 = no buffer space
    >                                         3 = RED
    >                                         4 = no route to dest.
    >                           format = 2  multiple destination
    >                                       interfaces, value is the
    >                                       number of interfaces. A value
    >                                       of 0 indicates an unknown
    >                                       number greater than 1.
    >
    >                         Examples:
    >                           0x00000002  indicates ifIndex = 2
    >                           0x00000000  ifIndex unknown.
    >                           0x40000001  packet discarded because of ACL.
    >                           0x80000007  indicates a packet sent to 7
    >                                       interfaces.
    >                           0x80000000  indicates a packet sent to an
    >                                       unknown number of interfaces
    >                                       greater than 1. */
    >
    >
    > This additional information could be very useful for identifying
    > connectivity/performance problems.

    This doesn't quite address the original case I was asking about, which is
    that there could be an output port with no input port, or an input port with
    no output port, and neither of these implies that the packet was dropped, but
    rather that it was sent to or from the device itself (rather than through
    the device). Some examples would be a routing protocol packet, such as a
    RIP or BGP packet, or a spanning tree protocol BPDU, or a telnet packet,
    etc. To describe this, I suggest that additional codes are needed in both
    the input and output fields to indicate "no interface" (which I think should
    be distinct from "unknown").
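
    For reference, the output encoding quoted above can be unpacked as in the
    following minimal sketch. It covers only the formats and reason codes
    listed in the quoted proposal; no "no interface" code is shown because
    none has been assigned a value in this thread. The function name and the
    returned tuples are assumptions made for illustration.

```python
DISCARD_REASONS = {
    0: "unknown",
    1: "ACL",
    2: "no buffer space",
    3: "RED",
    4: "no route to dest.",
}


def decode_output(field):
    """Split a 32-bit output field into (kind, detail)."""
    fmt = (field >> 30) & 0x3    # most significant 2 bits select the format
    value = field & 0x3FFFFFFF   # remaining 30 bits carry the value
    if fmt == 0:
        # Single destination interface; 0 means the interface is not known.
        return ("interface", value) if value else ("unknown", None)
    if fmt == 1:
        # Packet discarded; value is a reason code.
        return ("discarded", DISCARD_REASONS.get(value, "unrecognized code %d" % value))
    if fmt == 2:
        # Multiple destinations; 0 means an unknown number greater than 1.
        return ("multiple", value if value else None)
    return ("reserved", value)   # format = 3 is not defined in the proposal


examples = [decode_output(f) for f in
            (0x00000002, 0x00000000, 0x40000001, 0x80000007, 0x80000000)]
```

    The five inputs are the examples from the quoted comment, so the decoded
    results can be checked against the descriptions given there.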

    I do think that adding the type of information that you've sketched out
    above is a good thing, in that it could allow for more completely
    characterizing what's happening to the traffic. Placing the discard codes
    into the output field precludes capturing some of the information, however.
    I know that with our switches, some features, such as ACLs, can be applied to
    traffic on input or output. For a packet discarded due to an output ACL, I
    can imagine that one might want to be able to know which interface it would
    have gone out through had it not been discarded. If you think that this is
    useful and want to allow for the possibility of capturing that information
    in the protocol, then it would seem to me that one could add an additional
    field, perhaps called "disposition", that could describe whether the packet
    was processed normally, or whether it was discarded at input or at output,
    and the reason. Presumably, most packets aren't discarded on output, so if
    one wished to not add extra overhead to the structure for every sample,
    another option would be to define an extended data structure to hold the
    additional data, e.g. which output port would have been used (assuming, of
    course, that the agent implementation can support this).

    Regarding the specific discard codes, I'd suggest adding something like
    "rate limiting". Also, would "no route to dest." include not being able to
    get an ARP response from a destination node on a local subnet? If so, then
    perhaps we could just call it "destination unreachable".

    > Good suggestion. I implemented it in SFLOW-STRUCTS.txt as follows:
    >
    > struct extended_user {
    >    unsigned int charset; /* MIBEnum value of character set used to
    >                             encode user information - See RFC 2978
    >                             Where possible UTF-8 encoding
    >                             (MIBEnum=106) should be used. */
    >    string src_user<>;    /* User ID associated with packet source */
    >    string dst_user<>;    /* User ID associated with packet
    >                             destination */
    > }

    So as to not preclude any particular agent implementation, I think it would
    be good to have charsets for the source and destination users specified
    independently. This would allow for the possibility of the information
    about the two users coming from different sources. Also, for this
    particular case, it might be worthwhile to explicitly state that zero should
    be specified for a charset if it is not known.
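
    A minimal sketch of how the suggestion above might look on the wire,
    assuming a hypothetical field order of src_charset, src_user, dst_charset,
    dst_user. This layout is not part of SFLOW-STRUCTS.txt; the function and
    constant names are illustrative only. Strings use the usual XDR encoding:
    a 4-byte length followed by the bytes, padded to a multiple of 4.

```python
import struct

UTF8 = 106           # MIBEnum for UTF-8, as given in the quoted comment
CHARSET_UNKNOWN = 0  # suggested value when the charset is not known


def xdr_string(data):
    """Encode bytes as an XDR string<>: length word, data, zero padding."""
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_extended_user(src_charset, src_user, dst_charset, dst_user):
    """Encode the hypothetical layout with one charset per user."""
    return (struct.pack(">I", src_charset) + xdr_string(src_user)
            + struct.pack(">I", dst_charset) + xdr_string(dst_user))


# Source user known and UTF-8 encoded; destination user unknown.
encoded = encode_extended_user(UTF8, "alice".encode("utf-8"), CHARSET_UNKNOWN, b"")
```

    Here the destination side carries a zero charset and a zero-length string,
    which matches the convention for unknown strings discussed earlier in the
    thread.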

    Regards,
    Marc


