STORING LOG DATA EFFICIENTLY WHILE SUPPORTING QUERYING
First Claim
1. A method for processing log data, comprising:
receiving log data that comprises a plurality of events, wherein an event includes a set of fields, and wherein a field stores a value; and
for each event in the plurality of events:
storing the event in a set of buffers, wherein each field of the event is associated with a different buffer; and
updating a metadata structure that comprises information about contents of the buffers, wherein information about contents of the buffers includes a first minimum value that reflects a minimum value of a first field of all of the events stored in the buffers.
11 Assignments
0 Petitions
Abstract
A logging system includes an event receiver and a storage manager. The receiver receives log data, processes it, and outputs a column-based data “chunk.” The manager receives and stores chunks. The receiver includes buffers that store events and a metadata structure that stores metadata about the contents of the buffers. Each buffer is associated with a particular event field and includes values from that field from one or more events. The metadata includes, for each “field of interest,” a minimum value and a maximum value that reflect the range of values of that field over all of the events in the buffers. A chunk is generated for each buffer and includes the metadata structure and a compressed version of the buffer contents. The metadata structure acts as a search index when querying event data. The logging system can be used in conjunction with a security information/event management (SIEM) system.
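The buffer-and-metadata arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `EventReceiver`, `store`, `make_chunks`, `may_contain`, and `read_column` are hypothetical, zlib and JSON are stand-ins for whatever compression and encoding a real system would use, and the sketch assumes every event carries the same set of fields.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class EventReceiver:
    """Hypothetical receiver: one buffer per event field plus min/max metadata."""
    fields_of_interest: tuple
    buffers: dict = field(default_factory=dict)   # field name -> list of values
    metadata: dict = field(default_factory=dict)  # field name -> {"min": ..., "max": ...}

    def store(self, event: dict) -> None:
        """Append each field value to its own buffer and update the metadata."""
        for name, value in event.items():
            self.buffers.setdefault(name, []).append(value)
            if name in self.fields_of_interest:
                entry = self.metadata.setdefault(name, {"min": value, "max": value})
                entry["min"] = min(entry["min"], value)
                entry["max"] = max(entry["max"], value)

    def make_chunks(self) -> dict:
        """Build one chunk per buffer: a metadata copy plus the compressed column."""
        chunks = {}
        for name, values in self.buffers.items():
            payload = zlib.compress(json.dumps(values).encode("utf-8"))
            meta_copy = {k: dict(v) for k, v in self.metadata.items()}
            chunks[name] = {"metadata": meta_copy, "data": payload}
        return chunks


def may_contain(chunk: dict, name: str, value) -> bool:
    """Use the min/max range as a coarse index: False means the chunk can be skipped."""
    value_range = chunk["metadata"].get(name)
    if value_range is None:
        return True  # no metadata for this field, so the chunk cannot be ruled out
    return value_range["min"] <= value <= value_range["max"]


def read_column(chunk: dict) -> list:
    """Decompress a chunk back into its list of field values."""
    return json.loads(zlib.decompress(chunk["data"]))


receiver = EventReceiver(fields_of_interest=("timestamp",))
for ev in [
    {"timestamp": 105, "host": "a"},
    {"timestamp": 100, "host": "b"},
    {"timestamp": 110, "host": "c"},
]:
    receiver.store(ev)
chunks = receiver.make_chunks()
```

In this sketch a query for `timestamp == 200` would skip every chunk without decompressing anything, because 200 falls outside the recorded range of 100 to 110.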
204 Citations
31 Claims
1. A method for processing log data, comprising:
receiving log data that comprises a plurality of events, wherein an event includes a set of fields, and wherein a field stores a value; and
for each event in the plurality of events:
storing the event in a set of buffers, wherein each field of the event is associated with a different buffer; and
updating a metadata structure that comprises information about contents of the buffers, wherein information about contents of the buffers includes a first minimum value that reflects a minimum value of a first field of all of the events stored in the buffers.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
21. A computer program product for processing log data, the computer program product comprising a machine-readable storage medium containing computer program code for performing a method, the method comprising:
receiving log data that comprises a plurality of events, wherein an event includes a set of fields, and wherein a field stores a value; and
for each event in the plurality of events:
storing the event in a set of buffers, wherein each field of the event is associated with a different buffer; and
updating a metadata structure that comprises information about contents of the buffers, wherein information about contents of the buffers includes a first minimum value that reflects a minimum value of a first field of all of the events stored in the buffers.
22. A system for processing log data, comprising:
a machine-readable storage medium containing computer program code for performing a method, the method comprising:
receiving log data that comprises a plurality of events, wherein an event includes a set of fields, and wherein a field stores a value; and
for each event in the plurality of events:
storing the event in a set of buffers, wherein each field of the event is associated with a different buffer; and
updating a metadata structure that comprises information about contents of the buffers, wherein information about contents of the buffers includes a first minimum value that reflects a minimum value of a first field of all of the events stored in the buffers; and
a processor configured to execute the computer program code stored by the machine-readable medium.
23. A method for processing events, wherein an event includes multiple fields, and wherein a field stores a value, comprising:
receiving a set of events;
generating a row-based chunk that includes the set of events and metadata about the set of events; and
generating a column-based chunk that includes metadata about the set of events and, for each event in the set of events, a value of a particular field, wherein the metadata about the set of events includes a first minimum value that reflects a minimum value of a first field over all of the events in the set of events.
View Dependent Claims (24, 25)
26. A method for processing events, wherein an event includes multiple fields, and wherein a field stores a value, comprising:
receiving a first set of events;
generating a row-based chunk that includes the first set of events and metadata about the first set of events, wherein the metadata about the first set of events includes a first minimum value that reflects a minimum value of a first field over all of the events in the first set of events;
receiving a second set of events; and
generating a column-based chunk that includes metadata about the second set of events and, for each event in the second set of events, a value of a particular field, wherein the metadata about the second set of events includes a second minimum value that reflects a minimum value of a second field over all of the events in the second set of events.
View Dependent Claims (27, 28, 29)
30. A method for searching a set of events according to a search query, wherein an event includes multiple fields, and wherein a field stores a value, and wherein the search query indicates a desired value and one field of the multiple fields, the method comprising:
accessing a first column-based chunk that includes, for each event in the set of events, a value of the indicated field and an associated index location identifier;
identifying a value in the first column-based chunk that matches the desired value;
identifying the index location identifier associated with the identified value;
accessing a second column-based chunk that includes, for each event in the set of events, a table location identifier and an associated index location identifier;
identifying the table location identifier in the second column-based chunk that is associated with the identified index location identifier;
accessing a row-based chunk that includes each event in the set of events; and
identifying the event in the row-based chunk that is associated with the identified table location identifier.
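The lookup chain recited in claim 30 can be traced with a short Python sketch. The data layout is an assumption made for illustration only: the first column-based chunk is modeled as a list of (value, index location identifier) pairs, the second as a list of (table location identifier, index location identifier) pairs, and the row-based chunk as a dictionary keyed by table location identifier. The function name `find_events` is hypothetical, and it returns every matching event, whereas the claim describes identifying a single one.

```python
def find_events(field_chunk, location_chunk, row_chunk, desired_value):
    """Follow value -> index location id -> table location id -> event."""
    # Steps 1-3: find values that match and collect their index location identifiers.
    index_ids = [idx for value, idx in field_chunk if value == desired_value]

    # Steps 4-5: map each index location identifier to its table location identifier.
    table_by_index = {idx: table_loc for table_loc, idx in location_chunk}

    # Steps 6-7: fetch the full events from the row-based chunk.
    return [row_chunk[table_by_index[idx]] for idx in index_ids]


# Illustrative data (made up for this sketch).
src_chunk = [("10.0.0.1", 0), ("10.0.0.2", 1), ("10.0.0.1", 2)]
location_chunk = [(500, 0), (501, 1), (502, 2)]
row_chunk = {
    500: {"src": "10.0.0.1", "action": "allow"},
    501: {"src": "10.0.0.2", "action": "deny"},
    502: {"src": "10.0.0.1", "action": "deny"},
}

matches = find_events(src_chunk, location_chunk, row_chunk, "10.0.0.2")
```

Only the column holding the queried field is scanned; the row-based chunk is touched just for the events that matched.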
31. A method for selecting an execution strategy for a query, comprising:
estimating a selectivity of the query's predicates;
determining a number of fields involved in the query;
responsive to the selectivity being low and the number of columns being low, selecting a column-only strategy; and
responsive to the selectivity being high and the number of columns being high, selecting a row-and-column strategy.
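The selection rule in claim 31 might be sketched in Python as follows. Several points are assumptions rather than anything the claim states: selectivity is treated as an estimated fraction of events matched (0.0 to 1.0), "fields" and "columns" are treated as the same count, the threshold values are arbitrary, and the `default` used for the mixed cases (low/high and high/low) is a placeholder because the claim does not address them. The names `Strategy` and `select_strategy` are hypothetical.

```python
from enum import Enum


class Strategy(Enum):
    COLUMN_ONLY = "column-only"
    ROW_AND_COLUMN = "row-and-column"


def select_strategy(
    selectivity: float,
    num_fields: int,
    selectivity_threshold: float = 0.1,  # assumed cutoff between "low" and "high"
    field_threshold: int = 3,            # assumed cutoff between "low" and "high"
    default: Strategy = Strategy.ROW_AND_COLUMN,  # assumed fallback for mixed cases
) -> Strategy:
    """Pick an execution strategy from estimated selectivity and field count."""
    low_selectivity = selectivity < selectivity_threshold
    few_fields = num_fields <= field_threshold

    if low_selectivity and few_fields:
        return Strategy.COLUMN_ONLY
    if not low_selectivity and not few_fields:
        return Strategy.ROW_AND_COLUMN
    # The claim is silent on low/high and high/low combinations.
    return default
```

For instance, under these assumed thresholds, `select_strategy(0.01, 2)` picks the column-only strategy and `select_strategy(0.5, 8)` picks the row-and-column strategy.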
Specification