Efficient access to sparse packets in large repositories of stored network traffic
Abstract
A secondary indexing technique cooperates with primary indices of an indexing arrangement to enable efficient storage and access of metadata used to retrieve packets persistently stored in data files of a data repository. Efficient storage and access of the metadata used to retrieve the persistently stored packets may be based on a target value of the packets over a search time window. The metadata is illustratively organized as a metadata repository of primary index files that store the primary indices containing hash values of network flows of the packets, as well as offsets and paths to those packets stored in the data files. The technique includes one or more secondary indices having a plurality of present bits arranged in a binary format (i.e., a bit array) to indicate the presence of the target value in one or more packets stored in the data files over the search time window. Notably, the present bits may be used to reduce (i.e., “prune”) a relatively large search space of the stored packets (e.g., defined by the hash values) to a pruned search space of only those data files in which packets having the target value are stored.
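The pruning step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `PrimaryEntry`, `SecondaryIndex`, and `prune_and_lookup`, the choice of one present bit per interval data file, the in-memory dictionary standing in for the primary index files, and the sample paths and hash values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PrimaryEntry:
    """Hypothetical primary index record for one stored packet."""
    flow_hash: int  # hash value of the packet's network flow
    path: str       # path to the data file holding the packet
    offset: int     # byte offset of the packet within that data file


@dataclass
class SecondaryIndex:
    """Hypothetical bit array for one target value over a search time window.

    Assumption: bit i corresponds to the i-th interval data file in the window.
    """
    num_intervals: int
    bits: int = 0  # a Python int serves as the bit array

    def assert_bit(self, interval: int) -> None:
        if not 0 <= interval < self.num_intervals:
            raise IndexError("interval outside the search time window")
        self.bits |= 1 << interval

    def asserted_intervals(self) -> List[int]:
        return [i for i in range(self.num_intervals) if (self.bits >> i) & 1]


def prune_and_lookup(
    secondary: SecondaryIndex,
    primary_files: Dict[int, List[PrimaryEntry]],
    flow_hash: int,
) -> List[PrimaryEntry]:
    """Scan only the primary index files whose present bit is asserted."""
    hits = []
    for interval in secondary.asserted_intervals():
        for entry in primary_files.get(interval, []):
            if entry.flow_hash == flow_hash:
                hits.append(entry)
    return hits


# Illustrative data: three intervals, target value present in intervals 0 and 2.
primary_files = {
    0: [PrimaryEntry(0xA1, "data/interval0.pkt", 0)],
    1: [PrimaryEntry(0xC3, "data/interval1.pkt", 0)],
    2: [PrimaryEntry(0xB2, "data/interval2.pkt", 0),
        PrimaryEntry(0xA1, "data/interval2.pkt", 96)],
}
secondary = SecondaryIndex(num_intervals=3)
secondary.assert_bit(0)
secondary.assert_bit(2)

for hit in prune_and_lookup(secondary, primary_files, 0xA1):
    print(hit.path, hit.offset)
```

In this sketch the primary index file for interval 1 is never opened, because its present bit is clear. The paths and offsets of the remaining matches are what a retrieval step would use to read the packets from the data files.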
20 Claims
1. A method comprising:
capturing a first packet from a network;
annotating the first packet with a time stamp specifying an arrival time of the first packet;
storing the first packet in a first data file of a set of data files organized at predetermined intervals, the first data file dedicated to a first predetermined interval based on the time stamp;
creating a first primary index for the first packet, the first primary index containing a path and an offset to the first packet stored in the first data file;
storing the first primary index for the first packet in a first primary index file associated with the first data file dedicated to the first predetermined interval; and
creating a secondary index for the first packet, the secondary index having an ordered sequence of present bits, wherein a first present bit corresponds to the first primary index and the first data file dedicated to the first predetermined interval, and wherein an asserted value of the first present bit indicates presence of a target value in the first packet stored in the first data file over a search time window.
2. The method of claim 1 wherein the target value is an element of a network flow.
3. The method of claim 2 wherein the element is one of an internet protocol address and a port number.
16. A system comprising:
one or more processors coupled to a network;
a plurality of storage repositories coupled to the one or more processors, the storage repositories including a data repository having data files configured to store packets captured from the network and a metadata repository having primary and secondary index files configured to store primary and secondary indices, the primary indices having hash values along with paths and offsets to the captured packets stored in the data files, the hash values calculated from a hash function applied to network flows of the captured packets, the secondary indices having a plurality of present bits arranged to indicate presence of a target value in one or more of the captured packets stored in the data files over a search time window; and
a memory coupled to the one or more processors and configured to store one or more processes of an operating system, the one or more processes executable by the one or more processors to use the present bits of the secondary indices to prune a search space of the captured packets as defined by the hash values to a pruned search space of only the data files of the data repository storing captured packets having the target value, the one or more processes further executable to use the paths and the offsets of the primary indices having the hash values defined by the pruned search space to retrieve the captured packets having the target value from the data repository.
20. A non-transitory computer readable medium including program instructions for execution on one or more processors, the program instructions when executed operable to:
capture a packet from a network;
annotate the packet with a time stamp specifying an arrival time of the packet;
store the packet in a data file of a set of data files organized at predetermined intervals, the data file dedicated to a predetermined interval based on the time stamp;
create a primary index for the packet, the primary index containing a path and an offset to the packet stored in the data file;
store the primary index for the packet in a primary index file associated with the data file dedicated to the predetermined interval; and
create a secondary index for the packet, the secondary index having an ordered sequence of present bits, wherein a present bit corresponds to the primary index and the data file dedicated to the predetermined interval, and wherein an asserted value of the present bit indicates presence of a target value in the packet stored in the data file over a search time window.
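The storage steps recited in the method and medium claims can be sketched in Python as follows. This is a hedged illustration only, not the claimed implementation. The 60-second interval, the `<interval>.dat` file naming, the record header layout, the dictionary standing in for primary index files, the substring test standing in for matching a flow element such as an IP address, and the function names `ingest` and `read_packet` are all assumptions.

```python
import os
import struct
import tempfile

INTERVAL_SECONDS = 60          # assumed length of each predetermined interval
HEADER = struct.Struct("!dI")  # assumed record header: time stamp, packet length


def ingest(root, packet, ts, primary_index, window_start, present_bits, target):
    """Store one time-stamped packet and update both indices.

    Assumes ts falls inside the search time window, whose first interval
    number is window_start. Returns the updated present-bit array.
    """
    interval = int(ts // INTERVAL_SECONDS)
    path = os.path.join(root, f"{interval}.dat")   # data file for this interval
    with open(path, "ab") as f:
        offset = f.seek(0, os.SEEK_END)            # packet record starts here
        f.write(HEADER.pack(ts, len(packet)) + packet)
    # Primary index: path and offset, grouped per interval data file.
    primary_index.setdefault(interval, []).append((path, offset))
    # Secondary index: assert the present bit for this interval on a match.
    if target in packet:
        present_bits |= 1 << (interval - window_start)
    return present_bits


def read_packet(path, offset):
    """Retrieve a stored packet using a primary index entry."""
    with open(path, "rb") as f:
        f.seek(offset)
        ts, length = HEADER.unpack(f.read(HEADER.size))
        return ts, f.read(length)


# Illustrative run with fixed time stamps; a real capture would use arrival times.
with tempfile.TemporaryDirectory() as root:
    index, bits = {}, 0
    samples = [(5.0, b"src=10.0.0.7"), (6.0, b"src=10.0.0.8"), (125.0, b"src=10.0.0.7")]
    for ts, pkt in samples:
        bits = ingest(root, pkt, ts, index, 0, bits, b"10.0.0.7")
    print(bin(bits))
    for interval, entries in sorted(index.items()):
        for path, offset in entries:
            print(interval, offset, read_packet(path, offset))
```

Keeping the path and offset in the primary index lets a later query seek directly to a packet, while the present bits record which interval files are worth consulting for the target value at all.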
Specification