From: Peter Phaal (peter_phaal@inmon.com)
Date: 09/13/02
Marc,
I've made changes to the documents based on your recommendations. The latest
versions are:
http://www.sflow.org/drafts/draft3/SFLOW-DATAGRAM.txt
http://www.sflow.org/drafts/draft3/SFLOW-STRUCTS.txt
I've also placed links to them from the sFlow.org documents page so that the
latest proposals will always be easy to find.
> To support this, I think it should be documented that any software
> decoding an sFlow packet should always use the encoded length
> information, and not assume that a structure is of a particular
> length, since structures may grow. This should apply to standard
> structures as well, since presumably they could also be extended as a
> result of an update to the standard.
I added the following note to SFLOW-DATAGRAM.txt:
Note: sFlow implementors are permitted to extend structures at the end
without changing structure numbers. Any changes that would
alter or invalidate fields in published structure definitions
must be implemented using a new structure number. This
policy allows additional data to be added to structures while
still maintaining backward compatibility. Applications receiving
sFlow data must always use the opaque length information when
decoding opaque<> structures.
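As a rough illustration of the note above, here is a minimal Python sketch of length-driven decoding. It assumes the usual XDR encoding of opaque<> (a 4-byte big-endian length, the data, then padding to a 4-byte boundary). The names read_opaque and decode_known_fields, and the two-integer structure layout, are hypothetical and not taken from the drafts.

```python
import struct

def read_opaque(buf, offset):
    """Read one XDR opaque<> value starting at offset.

    Returns (data, offset_of_next_item). The next offset is computed
    from the encoded length, never from the size the decoder expects.
    """
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("opaque length runs past end of buffer")
    padded = (length + 3) & ~3  # XDR pads opaque data to a multiple of 4
    return buf[start:end], start + padded

def decode_known_fields(data):
    """Hypothetical decoder that only knows the first two unsigned ints.

    Any bytes after them (a later extension of the structure) are ignored.
    """
    a, b = struct.unpack_from(">II", data, 0)
    return {"a": a, "b": b}

# A structure as an older decoder expects it, then extended at the end.
old_body = struct.pack(">II", 24, 16)
new_body = old_body + struct.pack(">I", 99)
buf = struct.pack(">I", len(new_body)) + new_body + struct.pack(">I", 0xDEADBEEF)

data, next_offset = read_opaque(buf, 0)
fields = decode_known_fields(data)
(following,) = struct.unpack_from(">I", buf, next_offset)
```

Because the decoder advances by the encoded length, the extra trailing field does not disturb the item that follows the structure.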
> You might want to explicitly specify whether data_format values need
> to be unique across different structure types. In other words, can a
> single value be used to identify a flow structure, a counter
> structure, and a sample structure, or should values not be reused in
> that manner?
I added this description to SFLOW-DATAGRAM.txt:
There are currently three opaque structures in which data_formats
are used:
1. sample_data
2. counter_data
3. flow_data
Structure format numbers may be re-used within each of these contexts.
For example, an (inmon,1) data_format could identify a particular
set of counters when used to describe counter_data, but refer to
a set of flow attributes when used to describe flow_data.
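One way a collector might honor this scoping is to key its decoder table on the context as well as the data_format. The sketch below is only an assumption about how that could look; the table, the placeholder decoders, and the use of the string "inmon" in place of a numeric enterprise code are all hypothetical.

```python
# Decoder table keyed by (context, enterprise, format number).
# The same (enterprise, format) pair may appear under several contexts.
DECODERS = {
    ("counter_data", "inmon", 1): lambda data: ("counters", data),
    ("flow_data", "inmon", 1): lambda data: ("flow attributes", data),
}

def lookup_decoder(context, enterprise, fmt):
    """Return the decoder for this data_format in this context, or None."""
    return DECODERS.get((context, enterprise, fmt))

counter_decoder = lookup_decoder("counter_data", "inmon", 1)
flow_decoder = lookup_decoder("flow_data", "inmon", 1)
unknown_decoder = lookup_decoder("sample_data", "inmon", 1)
```

A structure with no registered decoder can still be skipped safely by using its opaque length.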
>
> You might consider renaming counter_block to counter_record for
> consistency with the other structures.
Done.
> In the data structures document, I think it would be good to have some
> documentation about how the structures should be filled in,
> particularly when not all information is available. For example, for
> extended_switch, if either the source or destination VLAN information
> is not available, should the corresponding fields be set to zero?
> Likewise for extended_user, I presume it's acceptable to encode a
> zero-length string if one of the user ids is not available.
How about this clarification in SFLOW-STRUCTS.txt?
The following values should be used for fields that are
unknown (unless otherwise indicated in the structure
definitions).
- Unknown integer value. Use a value of 0 to indicate that
a value is unknown.
- Unknown counter. Use the maximum counter value to indicate
that the counter is not available. Within any given sFlow
session a particular counter must be always available, or
always unavailable. An available counter may temporarily
have the max value just before it rolls to zero. This is
permitted.
- Unknown string. Use the zero length empty string.
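A collector could map these conventions to "no value" roughly as follows. This is a sketch under assumptions: counters are taken to be 32 bits wide unless stated otherwise, and the names decode_integer, decode_string and CounterAvailability are hypothetical. The counter class relies on the rule that a counter is either always available or always unavailable within a session, so a maximum value seen after any other value is treated as a real reading about to roll over.

```python
def decode_integer(value):
    """0 means unknown (unless the structure definition says otherwise)."""
    return None if value == 0 else value

def decode_string(raw):
    """A zero-length string means unknown."""
    return None if len(raw) == 0 else raw

class CounterAvailability:
    """Tracks one counter over a session to interpret the maximum value."""

    def __init__(self, bits=32):
        self.max_value = (1 << bits) - 1
        self.seen_real_value = False

    def update(self, raw):
        if raw != self.max_value:
            self.seen_real_value = True
            return raw
        # Max value: a real reading only if the counter is known to be
        # available. A first reading at max cannot be told apart from
        # "unavailable" and is reported as None.
        return raw if self.seen_real_value else None

counter = CounterAvailability()
first = counter.update(0xFFFFFFFF)
second = counter.update(5)
third = counter.update(0xFFFFFFFF)
```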
> In the extended_router documentation, it is not clearly specified
> whether the mask fields' format is a bit mask or a count of bits.
Modified the structure definition as follows:
struct extended_router {
   address nexthop;        /* IP address of next hop router */
   unsigned int src_mask;  /* Source address prefix mask
                              (expressed as number of bits) */
   unsigned int dst_mask;  /* Destination address prefix mask
                              (expressed as number of bits) */
}
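For a collector that wants the bit-mask form, the conversion from a number of bits is small. The sketch below assumes IPv4 by default; prefix_len_to_netmask is a hypothetical helper name, not part of the drafts.

```python
import ipaddress

def prefix_len_to_netmask(bits, width=32):
    """Convert a prefix length (number of leading 1 bits) to an integer mask."""
    if not 0 <= bits <= width:
        raise ValueError("prefix length out of range")
    return ((1 << bits) - 1) << (width - bits)

# A src_mask of 24 corresponds to the dotted mask 255.255.255.0.
mask_24 = str(ipaddress.IPv4Address(prefix_len_to_netmask(24)))
```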
> For flow_sample.drops, I think it would be good to clarify the
> documentation with regard to what kind of packets are being counted
> (i.e. are they only sFlow packet drops that are being counted?).
Added the following text to SFLOW-DATAGRAM.txt:
unsigned int drops;  /* Number of times a packet marked to be
                        sampled was dropped due to lack of
                        resources. A high drop rate indicates
                        that the management agent is unable to
                        process samples as fast as they are
                        being generated by hardware. Increasing
                        sampling_rate will reduce the drop
                        rate. */
> Should the ETHERNET-ISO8023 enum be named ETHERNET-ISO88023 instead?
Good catch.
> In flow_sample, the input and output fields have special values to
> represent the case where the interface is "unknown". If packets
> originating or terminating at the switch itself are sampled, then one
> of the two interface fields will not apply. I'm wondering if it might
> be good to have an additional special "none" value to indicate this,
> rather than using the "unknown" value, which might wind up getting
> used for other cases as well.
How about this change? It creates a third category that lets us capture the
reason that a packet was discarded. Can you think of other reason codes?
unsigned int output;  /* SNMP ifIndex of output interface.
                         0 if interface is not known.
                         The most significant 2 bits are used to
                         indicate the format of the 30 bit value.
                         format = 0  single destination interface,
                                     value is ifIndex of the
                                     interface.
                         format = 1  packet discarded, value is a
                                     reason code. Currently the
                                     following codes are defined.
                                       0 = unknown
                                       1 = ACL
                                       2 = no buffer space
                                       3 = RED
                                       4 = no route to dest.
                         format = 2  multiple destination
                                     interfaces, value is the
                                     number of interfaces. A value
                                     of 0 indicates an unknown
                                     number greater than 1.
                         Examples:
                           0x00000002  indicates ifIndex = 2
                           0x00000000  ifIndex unknown.
                           0x40000001  packet discarded because
                                       of ACL.
                           0x80000007  indicates a packet sent to
                                       7 interfaces.
                           0x80000000  indicates a packet sent to
                                       an unknown number of
                                       interfaces greater than
                                       1. */
This additional information could be very useful for identifying
connectivity/performance problems.
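To make the proposed encoding concrete, here is a minimal Python sketch of how a collector might unpack the output field. The function name decode_output and the returned tuples are hypothetical; format value 3 is not defined above, so the sketch simply reports it as reserved, which is an assumption.

```python
DISCARD_REASONS = {
    0: "unknown",
    1: "ACL",
    2: "no buffer space",
    3: "RED",
    4: "no route to dest.",
}

def decode_output(raw):
    """Split the 32-bit output field into (kind, value)."""
    fmt = (raw >> 30) & 0x3      # most significant 2 bits
    value = raw & 0x3FFFFFFF     # remaining 30 bits
    if fmt == 0:
        # Single interface; 0 means the ifIndex is not known.
        return ("interface", value if value != 0 else None)
    if fmt == 1:
        return ("discarded",
                DISCARD_REASONS.get(value, "unrecognized code %d" % value))
    if fmt == 2:
        # Multiple interfaces; 0 means an unknown number greater than 1.
        return ("multiple", value if value != 0 else None)
    return ("reserved", value)   # format 3: not defined in the proposal
```

Applied to the examples in the comment, 0x40000001 decodes to a discard with reason "ACL" and 0x80000007 to a packet sent to 7 interfaces.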
> In the extended_user data, there is an issue of what character set and
> encoding the user ids are expressed in. I'm sure there will be
> contexts in which they will not be in ASCII. In an ideal world, I'd
> just say these should be encoded in UTF-8, but agents may receive the
> data in different encodings, and it seems better for the agents not to
> need to delve into character set translations. Therefore, I think it
> would be a good idea to be able to include information about the
> character set of each user id (for each field independently). This may
> assist a collector in being able to properly display the ids or map
> them into different character sets. For character set issues, see RFCs
> 2277 and 2978. RFC 2978 defines a scheme for registering character
> sets and encodings (collectively dubbed "charsets"). The registry
> contents can be found at
> http://www.iana.org/assignments/character-sets. Fortunately, the
> registry includes a "MIBenum" integer for each charset. I propose that
> these values be used to identify the charset for each user id string,
> with the reserved value zero being used to indicate that the charset
> is unknown. So, for example, if an agent knows that a user id is in
> UTF-8, the MIBenum value would be 106. UTF-8 could probably be
> considered the preferred charset, if the agent is able to obtain the
> data in different charsets.
Good suggestion. I implemented it in SFLOW-STRUCTS.txt as follows:
struct extended_user {
   unsigned int charset;  /* MIBEnum value of character set used to
                             encode user information - See RFC 2978
                             Where possible UTF-8 encoding
                             (MIBEnum=106) should be used. */
   string src_user<>;     /* User ID associated with packet source */
   string dst_user<>;     /* User ID associated with packet
                             destination */
}
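On the collector side, the charset field could be used along these lines. This is a sketch only: the table covers just three IANA MIBenum values (3 for US-ASCII, 4 for ISO-8859-1, 106 for UTF-8), the name decode_user is hypothetical, and falling back to ASCII with replacement characters for unknown or unlisted charsets is an assumption rather than anything the draft specifies.

```python
# Partial mapping from IANA MIBenum values to Python codec names.
MIBENUM_TO_CODEC = {
    3: "ascii",      # US-ASCII
    4: "latin-1",    # ISO-8859-1
    106: "utf-8",    # UTF-8
}

def decode_user(raw, charset):
    """Decode a user id using the charset MIBenum from extended_user.

    charset 0 (unknown) and any value missing from the table fall back
    to ASCII, with undecodable bytes shown as replacement characters.
    """
    codec = MIBENUM_TO_CODEC.get(charset, "ascii")
    return raw.decode(codec, errors="replace")

utf8_user = decode_user("\u00e9ric".encode("utf-8"), 106)
latin1_user = decode_user(b"\xe9ric", 4)
unknown_user = decode_user(b"bob", 0)
```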
Regards,
Peter
----------------------
Peter Phaal
InMon Corp.