System and method to sample a large data set of network traffic records

  • US 10,313,209 B2
  • Filed: 12/30/2016
  • Issued: 06/04/2019
  • Est. Priority Date: 12/30/2016
  • Status: Active Grant
First Claim

1. A computer-implemented method to sample a large data set of traffic records, the traffic records corresponding to network traffic flows associated with at least one particular address, the method comprising:

  • processing multiple iterations associated with respective traffic records of the large data set that satisfy particular criteria, processing an iteration of the multiple iterations comprising:

    receiving a traffic record from a source of a large data set of traffic records, the traffic record corresponding to a traffic flow and identifying a pair of addresses exchanging communications included in the traffic flow and including a traffic size value that indicates the size of communications included in the traffic flow;

    receiving a flow counter and a total traffic size, the flow counter representing the number of traffic flows received for one of the addresses of the pair identified, the number of traffic flows representing previously received traffic records associated with the address, the total traffic size representing a sum of traffic sizes associated with all previously received traffic records, the previously received traffic records having been received during previous iterations of the multiple iterations;

    incrementing the flow counter;

    adding the traffic size associated with the received traffic record to the total traffic size;

    if the flow counter is less than a predetermined sampling threshold, then storing a traffic record sample associated with the traffic record;

    if the flow counter is more than the predetermined sampling threshold, then determining whether or not to sample the received traffic record by applying an exponentially decreasing probability function; and

    storing the traffic record sample as sampled data associated with the traffic record only if the determination is to sample the received traffic record.
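
The per-iteration steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify the form of the exponentially decreasing probability function, so `exp(-decay * (flow_counter - threshold))` and the `decay` parameter are assumptions. Keeping one running total per address, the names `AddressState`, `sample_probability`, and `process_record`, the dictionary record layout, and the handling of a counter exactly equal to the threshold (a case the claim leaves open) are likewise hypothetical.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class AddressState:
    """Running state kept for one address (hypothetical layout)."""
    flow_counter: int = 0
    total_traffic_size: int = 0
    samples: list = field(default_factory=list)


def sample_probability(flow_counter, threshold, decay=0.5):
    """Assumed decay function; the claim only requires that the
    probability decrease exponentially."""
    return math.exp(-decay * (flow_counter - threshold))


def process_record(state, record, threshold, decay=0.5, rng=random.random):
    """Process one traffic record; return True if a sample was stored."""
    # Increment the flow counter and add this record's size to the total.
    state.flow_counter += 1
    state.total_traffic_size += record["size"]

    if state.flow_counter < threshold:
        # Below the sampling threshold: always store a sample.
        keep = True
    elif state.flow_counter > threshold:
        # Above the threshold: sample with exponentially decreasing probability.
        keep = rng() < sample_probability(state.flow_counter, threshold, decay)
    else:
        # Counter equals the threshold: not covered by the claim.
        # Assumption for this sketch: store the sample.
        keep = True

    if keep:
        state.samples.append(record)
    return keep


if __name__ == "__main__":
    state = AddressState()
    for i in range(1000):
        record = {"src": "10.0.0.1", "dst": "192.0.2.7", "size": 100 + i}
        process_record(state, record, threshold=50)
    print(state.flow_counter, state.total_traffic_size, len(state.samples))
```

Under these assumptions the counter and total traffic size stay exact for every record received, while the number of stored samples levels off soon after the threshold is passed.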
