Using sFlow to analyze TCP connection statistics

From: Peter Phaal (Peter_Phaal@inmon.com)
Date: 05/03/02


    Here is a script that might be of interest to people. The script profiles TCP
    services, producing the following results at five minute intervals:

    - time: Time at the start of the 5 minute interval.
    - agent: sFlow agent generating the data.
    - id: sFlow sampling point within the agent (e.g. id = 0:1
      means ifIndex=1).
    - port: Well known port used to identify the TCP service (e.g.
      www = web).
    - connections/s: Average number of connections per second.
    - packets/s: Average number of packets per second.
    - bytes/s: Average number of bytes per second.
    - packets/connection: Average number of packets per TCP connection.
    - bytes/connection: Average number of bytes per TCP connection.
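
    As a rough illustration of how the "time" column is derived, the sketch
    below floors a Unix timestamp to the start of its 5 minute interval, which
    is what the script does with unixSecondsUTC. It is written in Python for
    brevity; interval_start and the sample timestamp are made up for this
    example, and it formats the time in UTC whereas the script uses local time.

```python
import time

INTERVAL_SECONDS = 300  # 5 minute intervals, as in the script


def interval_start(unix_seconds, interval=INTERVAL_SECONDS):
    """Floor a Unix timestamp to the start of its interval."""
    return unix_seconds - (unix_seconds % interval)


# Hypothetical timestamp, chosen only for illustration.
ts = 1_000_000_123
start = interval_start(ts)

# Format the interval start as MM/DD/YY HH:MM (UTC here by assumption).
label = time.strftime("%m/%d/%y %H:%M", time.gmtime(start))
print(start, label)
```

    Every sample whose timestamp falls in the same 300 second window maps to
    the same start value, so the counts for that window end up on one row.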

    The script is written in AWK and uses sflowtool (see
    http://www.sflow.org/software.htm) to decode the sFlow records.

    The following is an example of the script output:

    [pp@traffic]$ sflowtool -p 8888 | ./tcpAnalysis
    time,agent,id,port,connections/s,packets/s,bytes/s,packets/connection,bytes/connection
    05/03/02 15:05,10.1.1.100,0:0,www,10.00,90.00,50803.33,9.00,5080.33
    05/03/02 15:05,10.1.1.100,0:0,smtp,26.58,3880.92,2421460.13,146.00,91095.12
    05/03/02 15:05,10.1.1.100,0:0,pop3,3.51,108.73,141045.70,31.00,40212.00

    The script demonstrates the aggregation and scaling of sFlow data. Each
    count is scaled by the total population of packets (the change in
    samplePool over the interval) divided by the number of samples received.
    This compensates for any loss of sFlow records in transit from the router
    to the analysis application. The alternative, scaling by the average skip
    count (meanSkipCount), underestimates traffic if sFlow records are lost.
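
    A small worked example may make the difference between the two scaling
    methods clearer. The sketch below is in Python rather than AWK, and all of
    the numbers (one million packets, 1-in-1000 sampling, 20% of the sFlow
    records lost) are invented for illustration; the function names are
    hypothetical and do not come from the script.

```python
def scale_by_pool(pool_delta, samples_received):
    """Packets seen by the agent per sample that actually arrived."""
    return pool_delta / samples_received


def estimate_packets(sample_count, scale):
    """Scale a count of received samples up to an estimated packet count."""
    return sample_count * scale


# Invented numbers for one 5 minute interval.
pool_delta = 1_000_000    # growth of samplePool over the interval
mean_skip_count = 1000    # nominal 1-in-N sampling rate
samples_received = 800    # 1000 samples were sent, 200 were lost in transit

# Scaling by samplePool / samples received recovers the full population.
pool_estimate = estimate_packets(
    samples_received, scale_by_pool(pool_delta, samples_received))

# Scaling by meanSkipCount only accounts for the samples that arrived.
skip_estimate = estimate_packets(samples_received, mean_skip_count)

print(pool_estimate)  # 1000000.0
print(skip_estimate)  # 800000
```

    With 20% of the records lost, the meanSkipCount estimate comes out 20%
    low, while the samplePool based estimate is unaffected.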

    The connection-oriented statistics (packets/connection and bytes/connection)
    are obtained by examining the TCP flags. A TCP connection is initiated by
    sending a packet with the SYN flag set. The script identifies these
    connection initiation packets (a TCPFlags value of exactly 2, i.e. SYN
    alone, so SYN+ACK replies are not counted), allowing it to compute
    connection rates. It also divides the total number of TCP packets by the
    number of connection initiations to compute an average number of packets
    per connection; bytes per connection is computed the same way.
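
    The per-connection ratios can be sketched as follows. This is a minimal
    Python illustration rather than the AWK script itself; connection_stats
    and the sample list are hypothetical, and the flag test mirrors the
    script's check for a flags value of exactly 2.

```python
TCP_SYN = 0x02  # SYN bit; a flags value of exactly 2 means SYN alone


def connection_stats(samples):
    """Summarize (tcp_flags, packet_size) samples for one service.

    Returns None when no SYN was sampled, since the per-connection
    ratios are undefined in that case.
    """
    syns = packets = total_bytes = 0
    for flags, size in samples:
        if flags == TCP_SYN:
            syns += 1
        packets += 1
        total_bytes += size
    if syns == 0:
        return None
    return {
        "syns": syns,
        "packets_per_connection": packets / syns,
        "bytes_per_connection": total_bytes / syns,
    }


# Invented samples: flags 2 = SYN, 18 = SYN+ACK, 16 = ACK,
# 24 = PSH+ACK, 17 = FIN+ACK.
samples = [(2, 64), (18, 64), (16, 64), (24, 1500), (24, 1500),
           (17, 64), (2, 64), (24, 1500), (16, 64)]
stats = connection_stats(samples)
print(stats)
```

    The ratios need no scaling, because the same scale factor would apply to
    both the packet (or byte) count and the SYN count and so cancels out.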

    Peter

    ---------tcpAnalysis-------------
    #!/bin/awk -f
    #
    # Copyright (c) 2002 InMon Corp. ALL RIGHTS RESERVED
    #
    # This script is used in conjunction with sflowtool
    # to analyze TCP connection statistics. It calls strftime(),
    # so it needs an awk that provides it, such as gawk.
    #
    # Instructions:
    # 1. First configure the sFlow agent to send sFlow packets
    # to the test host on a specified port (8888 in this example).
    # 2. Start the test using the following command:
    # sflowtool -p 8888 | ./tcpAnalysis
    #

    BEGIN{
        # Enter the ports that you wish to profile
        ports[20] = "ftp-data";
        ports[21] = "ftp";
        ports[22] = "ssh";
        ports[23] = "telnet";
        ports[25] = "smtp";
        ports[80] = "www";
        ports[109] = "pop2";
        ports[110] = "pop3";
        ports[119] = "nntp";
        ports[220] = "imap3";
        ports[443] = "https";
        ports[513] = "login";

        # Log 5 minute averages
        intervalSeconds = 300;

        lastInt = 0;

        # Column headings
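        # Column headings (two adjacent strings are concatenated, so the
        # indentation of the second line does not end up in the output)
        print "time,agent,id,port,connections/s,packets/s,bytes/s," \
              "packets/connection,bytes/connection";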
    }
    /startDatagram/{}
    /unixSecondsUTC/{
      currentInt = $2 - ($2 % intervalSeconds);
      if(currentInt != lastInt) {
        printIntervalResults();
        lastInt = currentInt;
      }
    }
    # Match on the exact field name in $1 so that similarly named
    # fields in the sflowtool output are not picked up by accident.
    $1 == "datagramVersion"{datagramVersion = $2;}
    $1 == "agent"{agent = $2;}
    $1 == "sampleSequenceNo"{sampleSequenceNo = $2;}
    $1 == "sourceId"{sourceId = $2;}
    $1 == "sampleType"{sampleType = $2;}
    $1 == "meanSkipCount"{meanSkipCount[agent,sourceId] = $2;}
    $1 == "samplePool"{
       samplePool[agent,sourceId] = $2;
       samples[agent,sourceId] += 1;
    }
    $1 == "dropEvents"{dropEvents[agent,sourceId] = $2;}
    $1 == "headerProtocol"{headerProtocol = $2;}
    $1 == "sampledPacketSize"{sampledPacketSize = $2;}
    $1 == "headerLen"{headerLen = $2;}
    $1 == "headerBytes"{headerBytes = $2;}
    $1 == "IPProtocol"{IPProtocol = $2;}
    $1 == "IPTOS"{IPTOS = $2;}
    $1 == "TCPSrcPort"{TCPSrcPort = $2;}
    $1 == "TCPDstPort"{TCPDstPort = $2;}
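    /TCPFlags/{
        TCPFlags = $2;
        # Identify the service by its well known port, checking the
        # destination port first and then the source port.
        port = ports[TCPDstPort];
        if(!port) {port = ports[TCPSrcPort];}

        if(port) {
            # A flags value of exactly 2 is a SYN with no other flag set,
            # i.e. a connection initiation. SYN+ACK replies are not counted.
            if(TCPFlags == "2") {
                countSyns[agent,sourceId,port] += 1;
            }
            countPackets[agent,sourceId,port] += 1;
            countBytes[agent,sourceId,port] += sampledPacketSize;
        }
    }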
    END{}

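    # Difference between two readings of a counter (c1 = old, c2 = new),
    # allowing for a 32 bit or 64 bit counter wrapping around. Returns 0
    # when there is no previous reading. Uses ^ for exponentiation, since
    # ** is not accepted by every awk.
    function cdiff(c1,c2) {
      if(c1 == 0) return 0;
      if(c1 <= c2) return c2 - c1;
      if(c1 < 2^32) return c2 + 2^32 - c1;
      return c2 + 2^64 - c1;
    }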

    function printIntervalResults() {

     # Time
     time = strftime("%D %R", lastInt);

     # TCP Stats Table
     for (key in countSyns) {
         n = split(key,a,SUBSEP);
         agent = a[1];
         id = a[2];
         port = a[3];
         syns = countSyns[key];
         packets = countPackets[key];
         bytes = countBytes[key];
         poolDelta = cdiff(oldSamplePool[agent,id],samplePool[agent,id]);
         samplesDelta = cdiff(oldSamples[agent,id],samples[agent,id]);
         if(samplesDelta > 0) {
             scale = poolDelta / samplesDelta;
             connectionsPerSec = syns * scale / intervalSeconds;
             packetsPerSec = packets * scale / intervalSeconds;
             bytesPerSec = bytes * scale / intervalSeconds;
             packetsPerConnection = packets / syns;
             bytesPerConnection = bytes / syns;
             printf "%s,%s,%s,%s,%.2f,%.2f,%.2f,%.2f,%.2f\n",\
    time,agent,id,port,connectionsPerSec,packetsPerSec,bytesPerSec,\
    packetsPerConnection,bytesPerConnection;
         }
     }
     delete countSyns;
     delete countPackets;
     delete countBytes;
     for(i in samplePool) oldSamplePool[i] = samplePool[i];
     for(i in samples) oldSamples[i] = samples[i];
    }


