Systems and methods for efficient data ingestion and query processing
Abstract
A query may be provided to aggregators at hierarchical levels in an in-memory data storage module. The query may be provided to leaf nodes of the in-memory data storage module. The leaf nodes may execute the query, returning results of the query to the aggregators. One or more aggregations may be performed based on the results. In an embodiment, log entries associated with a logged event may be serialized and divided into distributed chunks for storage in the leaf nodes. A leaf node, from the leaf nodes, having storage capacity for a distributed chunk may be identified. The distributed chunk may be stored in the leaf node.
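The query path the abstract describes (fan a query out through hierarchical aggregators to in-memory leaf nodes, execute it at the leaves, and combine results on the way back up) can be illustrated with a minimal sketch. All names here (`LeafNode`, `Aggregator`, `query`, `execute`) are illustrative assumptions, not terms from the patent, and the sketch omits pre-aggregation and caching:

```python
class LeafNode:
    """Holds in-memory records and answers queries locally."""
    def __init__(self):
        self.records = []

    def execute(self, predicate):
        # Run the query predicate against locally stored records.
        return [r for r in self.records if predicate(r)]

class Aggregator:
    """One level of the aggregation hierarchy; combines child results."""
    def __init__(self, children):
        self.children = children  # leaf nodes or lower-level aggregators

    def query(self, predicate):
        # Fan the query out to children, then aggregate what comes back.
        results = []
        for child in self.children:
            if isinstance(child, LeafNode):
                results.extend(child.execute(predicate))
            else:
                results.extend(child.query(predicate))
        return results

# Fan a query out through two aggregator levels over four leaves.
leaves = [LeafNode() for _ in range(4)]
for i, leaf in enumerate(leaves):
    leaf.records = [{"leaf": i, "value": v} for v in range(3)]

root = Aggregator([Aggregator(leaves[:2]), Aggregator(leaves[2:])])
matches = root.query(lambda r: r["value"] >= 2)  # one match per leaf
```

A real implementation would distribute the aggregators across machines and have them pre-aggregate anticipated queries, per the claims below; this sketch only shows the fan-out/fan-in shape.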
18 Claims
1. A system comprising:
at least one processor; and
a memory storing instructions configured to instruct the at least one processor to perform:
serializing log entries associated with at least one logged event;
dividing the serialized log entries into one or more distributed chunks for storage in one or more leaf nodes of an in-memory data storage module, wherein storage of at least one of the distributed chunks is striped across at least two randomly selected leaf nodes, and wherein a corresponding space limit of each of the one or more leaf nodes is adjusted based at least in part on a type of data being stored at the leaf node;
providing a query to aggregators at hierarchical levels in the in-memory data storage module, wherein the aggregators are configured to pre-aggregate at least some data stored in the one or more leaf nodes of the in-memory data storage module in anticipation of the query;
providing the query to leaf nodes of the in-memory data storage module;
executing the query on the leaf nodes;
returning results of the query to the aggregators;
performing one or more aggregations on the results of the query; and
updating a query cache that corresponds to the query to include data describing the results.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
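The ingestion limitations of claim 1 (serialize log entries, divide them into chunks, stripe each chunk across at least two randomly selected leaf nodes, with a per-leaf space limit tied to the data type) can be sketched as follows. The names (`Leaf`, `stripe`, `SPACE_LIMITS`) and the concrete limits and chunk size are illustrative assumptions, not values from the patent:

```python
import json
import random

# Illustrative per-type space limits (bytes); the claim recites that each
# leaf's space limit is adjusted based on the type of data stored there.
SPACE_LIMITS = {"access_log": 1024, "error_log": 4096}

class Leaf:
    """One in-memory leaf node holding (data_type, chunk) pairs."""
    def __init__(self):
        self.chunks = []

    def used(self):
        return sum(len(chunk) for _, chunk in self.chunks)

def serialize(log_entries):
    # Serialize log entries associated with a logged event.
    return json.dumps(log_entries).encode()

def stripe(blob, leaves, data_type, chunk_size=64):
    """Divide the serialized blob into chunks and store each chunk on
    at least two randomly selected leaves with remaining capacity."""
    limit = SPACE_LIMITS[data_type]
    for i in range(0, len(blob), chunk_size):
        chunk = blob[i:i + chunk_size]
        # Only leaves whose type-specific space limit can absorb the chunk.
        eligible = [leaf for leaf in leaves if leaf.used() + len(chunk) <= limit]
        for leaf in random.sample(eligible, 2):
            leaf.chunks.append((data_type, chunk))

leaves = [Leaf() for _ in range(5)]
blob = serialize([{"event": "login", "user": n} for n in range(10)])
stripe(blob, leaves, "access_log")
```

Because every chunk is written to exactly two randomly chosen leaves here, the total bytes stored across leaves is twice the serialized size; the claim only requires "at least two", so a real system could use a higher replication factor.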
17. A computer-implemented method comprising:
serializing log entries associated with at least one logged event;
dividing the serialized log entries into one or more distributed chunks for storage in one or more leaf nodes of an in-memory data storage module, wherein storage of at least one of the distributed chunks is striped across at least two randomly selected leaf nodes, and wherein a corresponding space limit of each of the one or more leaf nodes is adjusted based at least in part on a type of data being stored at the leaf node;
providing, by a computer system, a query to aggregators at hierarchical levels in the in-memory data storage module, wherein the aggregators are configured to pre-aggregate at least some data stored in the one or more leaf nodes of the in-memory data storage module in anticipation of the query;
providing, by the computer system, the query to leaf nodes of the in-memory data storage module;
executing, by the computer system, the query on the leaf nodes;
returning, by the computer system, results of the query to the aggregators;
performing, by the computer system, one or more aggregations on the results of the query; and
updating a query cache that corresponds to the query to include data describing the results.
18. A non-transitory computer storage medium storing computer-executable instructions that, when executed, cause a computer system to perform a computer-implemented method comprising:
serializing log entries associated with at least one logged event;
dividing the serialized log entries into one or more distributed chunks for storage in one or more leaf nodes of an in-memory data storage module, wherein storage of at least one of the distributed chunks is striped across at least two randomly selected leaf nodes, and wherein a corresponding space limit of each of the one or more leaf nodes is adjusted based at least in part on a type of data being stored at the leaf node;
providing a query to aggregators at hierarchical levels in the in-memory data storage module, wherein the aggregators are configured to pre-aggregate at least some data stored in the one or more leaf nodes of the in-memory data storage module in anticipation of the query;
providing the query to leaf nodes of the in-memory data storage module;
executing the query on the leaf nodes;
returning results of the query to the aggregators;
performing one or more aggregations on the results of the query; and
updating a query cache that corresponds to the query to include data describing the results.
Specification