From: Marc Lavine (mlavine@foundrynet.com)
Date: 09/12/02
Hello Peter,
Thanks for the response, and sorry for the long delay in getting back to you.
>Peter Phaal wrote:
>
>It made sense to split the datagram document into two parts: the first
>document is a description of the overall datagram format, and the second
>contains detailed structure definitions for standard flow and counter data.
Yes, I think separating the document into two parts was a good idea.
> I changed the flow data so that it is now represented as a list of
>flow_record structures. Each flow_record provides type, length and value
>information so that fields can be skipped easily.
>
> I also changed the counter sample to represent counters as a list of
>counter blocks, each with a type, length and value. This would allow a
>collector to ignore counter blocks that it didn't understand. It also makes
>it easy for agent developers to add new counters.
The approach that you took is a bit different from the way I had thought of doing things, but these changes seem fine to me.
I think there's an additional change that is worth making as well, which would be to add length information for each overall sample. This would provide for the possibility of extending each sample structure (e.g. if it's determined that a new field is needed within a flow_sample -- see my explanation below regarding this type of extensibility), and it would also provide the ability to add new sample types while maintaining backward compatibility. So, what I propose might look something like this:
struct sample_record {
   sample_types sample_type;   /* Specifies the type of sample data */
   opaque sample_data<>;       /* A sample structure, such as flow_sample or counters_sample */
}

struct sample_datagram_v5 {
   ... /* same as before */
   sample_record samples<>;    /* An array of sample_records */
}
Note that I haven't used the "enterprise" extension mechanism here, although you could if you want to keep it consistent with the other structures.
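To illustrate how a collector might take advantage of the per-sample length, here is a minimal Python sketch. It assumes standard XDR encoding of the proposed sample_record (a 4-byte big-endian type, then a variable-length opaque with a 4-byte length and zero padding to a 4-byte boundary). The numeric sample type values and the helper names (encode_samples, iter_samples, known_samples) are hypothetical and are not taken from the draft.

```python
import struct

# Hypothetical numeric values for sample_types; the draft may assign others.
KNOWN_TYPES = {1: "flow_sample", 2: "counters_sample"}


def xdr_opaque(data):
    """Encode an XDR variable-length opaque: length, data, zero padding."""
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_samples(records):
    """Encode a list of (sample_type, payload) pairs as an XDR samples<> array."""
    out = struct.pack(">I", len(records))
    for sample_type, payload in records:
        out += struct.pack(">I", sample_type) + xdr_opaque(payload)
    return out


def iter_samples(buf, offset=0):
    """Yield (sample_type, payload) for every record, known or not."""
    (count,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    for _ in range(count):
        sample_type, length = struct.unpack_from(">II", buf, offset)
        offset += 8
        payload = buf[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated sample_record")
        # Advance by the supplied length plus XDR padding, never by the
        # size of a structure the collector happens to know about.
        offset += length + ((-length) % 4)
        yield sample_type, payload


def known_samples(buf):
    """Return only the samples whose type this collector understands."""
    return [(KNOWN_TYPES[t], p) for t, p in iter_samples(buf) if t in KNOWN_TYPES]
```

A record with an unrecognized sample_type is stepped over using its length alone, which is what would let new sample types be added without breaking older collectors.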
> > Adding length fields to structures as I've described actually provides
> > for two different types of extensibility.
> > ...
> > The second type of extensibility involves being able to extend the
> > length of an already-existing structure in a way that need not break
> > compatibility with collectors which understand an older version of
> > that structure. As an example, let's say that a new field is
> > identified that should belong in the extended_gateway structure.
> > Rather than define a new extended data type just for this new field,
> > it could simply be added to the end of the existing structure. Since
> > the structure would include a length field, a collector can process
> > the portion of the structure that it knows about, and skip over the
> > rest.
>
> I don't think it is a good idea to allow existing structures to be
>extended. Different agent implementors might extend the structure in
>different, incompatible ways and a receiver of the data would not have
>enough information to determine how to decode the added fields.
Sorry if I wasn't clear enough about my intent here. I didn't mean to imply that an sFlow agent could extend an existing structure in any way it pleased. Rather, my intent was that the organization that had defined a particular structure could later define an extended version of that same structure with some additional fields added to the end. Existing fields in the structure could not be modified; new fields could only be added to the end. The presence of the length information would allow a collector to determine which version of a structure was in use, so that it could take advantage of the additional fields if it had been updated to understand them. Collectors which did not understand the new fields would continue to use the earlier set of fields, and would skip over the new fields, because a properly implemented collector would always use the supplied length information to advance from one structure to the next (whether the structures were known to it or not). This should allow the incremental evolution of structures, where appropriate, without the overhead of introducing a new structure with its associated type information.
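The kind of append-only evolution described above can be sketched in Python as follows. The layout is purely hypothetical: version 1 of a structure carries two unsigned ints (field_a, field_b) and version 2 appends a third (field_c). Each record is assumed to be framed as a 4-byte type, a 4-byte length, and a body whose length is a multiple of 4, so no padding is needed. None of these names come from the draft.

```python
import struct

V1_LEN = 8   # hypothetical version 1 body: field_a, field_b
V2_LEN = 12  # hypothetical version 2 body: field_a, field_b, field_c


def encode(record_type, body):
    """Frame a body as type, length, value."""
    return struct.pack(">II", record_type, len(body)) + body


def read_record(buf, offset, understands_v2):
    """Decode one record; return (fields, offset of the next record)."""
    record_type, length = struct.unpack_from(">II", buf, offset)
    body_start = offset + 8
    body = buf[body_start:body_start + length]
    if len(body) != length or length < V1_LEN:
        raise ValueError("truncated or malformed record")

    # Fields defined in version 1 never move, so every collector reads them.
    field_a, field_b = struct.unpack_from(">II", body, 0)
    fields = {"field_a": field_a, "field_b": field_b}

    # An updated collector infers the version from the length.
    if understands_v2 and length >= V2_LEN:
        (fields["field_c"],) = struct.unpack_from(">I", body, V1_LEN)

    # Always advance by the supplied length, so trailing fields that this
    # collector does not understand are skipped.
    return fields, body_start + length
```

An older collector (understands_v2=False) reading a version 2 record gets field_a and field_b and still lands on the start of the next record.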
> > For a vendor to be able to assign their own identifiers, these need to
> > be constructed based on a globally-unique identifier. The two that
> > come to mind are an IEEE OUI (Organizationally Unique Identifier),
> > which appears as the first three bytes of a MAC address, and an SNMP
> > enterprise OID. ... As far as possible encodings, an OUI plus a single
> > additional byte would fit in a single XDR unsigned int, while an SNMP
> > OID would probably be reasonably compactly encoded as an XDR
> > variable-length opaque field containing the ASN.1-encoded OID.
>
> I think that the SNMP enterprise OID is the better candidate since any
>network management agent writer is likely to have one already.
I hadn't thought about just encoding the last part of the enterprise OID as a simple integer. I was thinking that, in theory, the enterprise id could be multiple OID components, but that's not likely to happen anytime soon. The way you've proposed to encode it is more compact, which is good. The OUI-based approach I suggested would be even more compact, using only a single integer for the combined OUI+format value. If one wished to gamble that they're not going to use up more than 24 bits' worth of enterprise ids any time soon (they've used a little less than 1% of that range so far), then one could use 24 bits of enterprise id plus an 8-bit format id. Of course, either of these approaches would restrict one to 256 formats per global id, which feels sort of small, but I'm not sure we'd be likely to hit that limit in practice.
I'm thinking that it might be good to make a distinction between standard and vendor extension structures. To do this, the standard structures could use enterprise id zero (which is a reserved id), rather than using InMon's id.
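As a rough illustration of the 24-bit enterprise id plus 8-bit format id idea, here is a small Python sketch. Placing the enterprise id in the high-order 24 bits is an assumption made for this example (the bit order is not specified above), and the function names and the enterprise number 1234 are hypothetical.

```python
ENTERPRISE_BITS = 24
FORMAT_BITS = 8
STANDARD_ENTERPRISE = 0  # reserved id, proposed above for standard structures


def pack_format(enterprise, fmt):
    """Combine an enterprise id and a format id into one 32-bit value."""
    if not 0 <= enterprise < (1 << ENTERPRISE_BITS):
        raise ValueError("enterprise id does not fit in 24 bits")
    if not 0 <= fmt < (1 << FORMAT_BITS):
        raise ValueError("format id does not fit in 8 bits")
    return (enterprise << FORMAT_BITS) | fmt


def unpack_format(value):
    """Split a 32-bit value back into (enterprise id, format id)."""
    return value >> FORMAT_BITS, value & ((1 << FORMAT_BITS) - 1)


def is_standard(value):
    """True if the structure is a standard one rather than a vendor extension."""
    return unpack_format(value)[0] == STANDARD_ENTERPRISE
```

With this layout, each enterprise id gets 256 format ids, and a standard structure under enterprise id zero encodes to just its format id.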
>PS I have also been looking at adding additional flow_record types. Included
>in the second document are additional flow records to report on MPLS and NAT
>data. Any suggestions on improving reporting in these areas are welcome.
I will consult with some of our experts in those areas and see if they have any comments.
I noticed the addition of the sampled_ethernet format. Since the data structures no longer use a union to restrict a flow sample to being represented by a single structure, I think there need to be some guidelines on how the sampled_* structures should be used. Should sampled_ethernet be provided along with sampled_ipv4? Should the other sampled_* structures not be included if sampled_header is provided? When should the sampled_ethernet and sampled_ip* structures be used?
Regards,
Marc