CLUSTERING EVENT DATA BY MULTIPLE TIME DIMENSIONS
First Claim
1. A method for processing log data, the method comprising:
determining, by a computing device, a set of data chunks, each data chunk includes a set of events clustered according to a primary time dimension field of each event of the set of events;
for each data chunk of the set of data chunks, determining a metadata structure that comprises a range of the primary time dimension field of all of the events in the data chunk and a range of a secondary time dimension field of all of the events in the data chunk;
selecting a subset of the data chunks;
disassembling the subset of data chunks into a plurality of events; and
generating a data chunk including at least one event of the plurality of events, the event is clustered in the data chunk according to the secondary time dimension field of the at least one event.
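The steps of claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The names `Event`, `ChunkMeta`, `Chunk`, `make_chunks`, `select_overlapping`, and `recluster` are hypothetical. The sketch assumes the primary time dimension is a receipt time and the secondary time dimension is the time the event occurred. It also assumes chunks are selected when their secondary range overlaps a window of interest, and that chunks have a fixed size. The claim itself specifies neither the selection criterion nor the chunk size.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    receipt_time: int  # assumed primary time dimension field
    event_time: int    # assumed secondary time dimension field
    payload: str


@dataclass
class ChunkMeta:
    primary_range: Tuple[int, int]    # (min, max) of the primary field in the chunk
    secondary_range: Tuple[int, int]  # (min, max) of the secondary field in the chunk


@dataclass
class Chunk:
    events: List[Event]
    meta: ChunkMeta


def build_meta(events: List[Event]) -> ChunkMeta:
    """Compute the range of both time fields over all events in a chunk."""
    primary = [e.receipt_time for e in events]
    secondary = [e.event_time for e in events]
    return ChunkMeta((min(primary), max(primary)), (min(secondary), max(secondary)))


def make_chunks(events: List[Event], key: Callable[[Event], int], size: int) -> List[Chunk]:
    """Cluster events by `key` and cut them into chunks of at most `size` events."""
    ordered = sorted(events, key=key)
    batches = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    return [Chunk(batch, build_meta(batch)) for batch in batches]


def select_overlapping(chunks: List[Chunk], lo: int, hi: int) -> List[Chunk]:
    """Assumed criterion: keep chunks whose secondary range overlaps [lo, hi]."""
    return [c for c in chunks
            if c.meta.secondary_range[0] <= hi and c.meta.secondary_range[1] >= lo]


def recluster(chunks: List[Chunk], size: int) -> List[Chunk]:
    """Disassemble the selected chunks into events, then regroup by the secondary field."""
    events = [e for c in chunks for e in c.events]
    return make_chunks(events, key=lambda e: e.event_time, size=size)


events = [Event(1, 50, "a"), Event(2, 10, "b"), Event(3, 40, "c"), Event(4, 20, "d")]
primary_chunks = make_chunks(events, key=lambda e: e.receipt_time, size=2)
selected = select_overlapping(primary_chunks, lo=0, hi=100)
reclustered = recluster(selected, size=2)
```

In this example the chunks clustered by receipt time have overlapping secondary ranges, (10, 50) and (20, 40). After reclustering, the secondary ranges are (10, 20) and (40, 50), which no longer overlap.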
Abstract
Systems and methods for processing log data are provided. A set of data chunks is determined. Each data chunk is associated with a set of events, which are grouped according to a primary time dimension field of each event of the set of events. A metadata structure is determined for each of the data chunks. The metadata structure comprises a range of the primary time dimension field of all of the events in the data chunk and a range of a secondary time dimension field of all of the events in the data chunk. A subset of the data chunks is selected. A data chunk associated with at least one event of the plurality of events is generated according to the secondary time dimension field of the at least one event.
15 Claims
14. A system for processing log data, comprising:
a receiving module to generate a set of data chunks, each data chunk includes a set of events clustered according to a primary time dimension field of each event of the set of events;
a chunks table to maintain, for each data chunk of the set of data chunks, a metadata structure that comprises a range of the primary time dimension field of all of the events in the data chunk and a range of a secondary time dimension field of all of the events in the data chunk;
a read-optimized store to store the set of data chunks;
a write-optimized store; and
a clustering module to select a subset of the data chunks, and generate a data chunk using events of the subset, wherein the events of the subset are grouped according to the secondary time dimension field.
15. A non-transitory computer-readable medium storing a plurality of instructions to control a data processor to process log data, the plurality of instructions comprising instructions that cause the data processor to:
determine a set of data chunks, each data chunk includes a set of events clustered according to a primary time dimension field of each event of the set of events;
for each data chunk of the set of data chunks, determine a metadata structure that comprises a range of the primary time dimension field of all of the events in the data chunk and a range of a secondary time dimension field of all of the events in the data chunk;
select a subset of the data chunks;
disassemble the subset of data chunks into a plurality of events; and
generate a data chunk including at least one event of the plurality of events clustered in the data chunk according to the secondary time dimension field of the at least one event.
Specification