High performance big data computing system and platform
First Claim
1. A computing system for accelerating large data transfer, the system including:
a first processor;
a dimension store coupled to the first processor and storing input data elements, the dimension store comprising a key-value store and a column store, wherein the key-value store stores the input data elements in a key-value data format, and wherein the column store stores the input data elements in a columnar data format;
a second processor;
a model data store coupled to the second processor and storing model data related to the input data elements;
a third processor coupled to the dimension store and the model data store, wherein the third processor is configured to, in response to receiving an analytics query;
determine that the key-value store can serve a first portion of the analytics query with better performance than the column store, and that the column store can serve a second portion of the analytics query with better performance than the key-value store;
in response to the determination, selectively retrieve, based on the first portion of the analytics query and second portion of the analytics query, a first portion of the input data elements from the key-value store and a second portion of the input data elements from the column store;
join the first portion of the input data elements and the second portion of the input data elements to generate a set of input data elements;
retrieve, based on the analytics query, portions of the model data; and
generate a set of analytics results based on the set of input data elements and the portions of the model data and store the set of analytics results in an analytics results store.
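The flow recited in this claim (route each portion of a query to the store that serves it better, join the partial results, then combine them with model data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the in-memory dictionaries standing in for the stores, the `choose_store` heuristic, the query-part format, and the per-region weights are hypothetical and are not taken from the patent.

```python
# Hypothetical in-memory stand-ins for the two halves of the dimension store.
kv_store = {  # key-value format: one record per key
    "u1": {"region": "EU", "age": 34},
    "u2": {"region": "US", "age": 51},
    "u3": {"region": "EU", "age": 27},
}
column_store = {  # columnar format: one array per attribute, aligned by row
    "id": ["u1", "u2", "u3"],
    "spend": [120.0, 80.0, 45.5],
}
model_data = {"EU": 1.1, "US": 0.9}  # hypothetical per-region model weights


def choose_store(part):
    """Assumed heuristic: keyed lookups go to the key-value store,
    whole-attribute scans go to the column store."""
    return "kv" if part["op"] == "lookup" else "column"


def run_query(parts):
    kv_rows, col_rows = {}, {}
    for part in parts:
        if choose_store(part) == "kv":
            # First portion of the query: fetch whole records by key.
            for key in part["keys"]:
                kv_rows[key] = dict(kv_store[key])
        else:
            # Second portion of the query: scan a single column with a filter.
            values = column_store[part["column"]]
            for row_id, value in zip(column_store["id"], values):
                if part["predicate"](value):
                    col_rows[row_id] = {part["column"]: value}
    # Join the two partial results on the shared identifier.
    shared = kv_rows.keys() & col_rows.keys()
    joined = {k: {**kv_rows[k], **col_rows[k]} for k in shared}
    # Combine the joined input data with model data to produce results.
    return {
        k: round(row["spend"] * model_data[row["region"]], 2)
        for k, row in joined.items()
    }


# The returned dict plays the role of the analytics results store.
results_store = run_query([
    {"op": "lookup", "keys": ["u1", "u2"]},
    {"op": "scan", "column": "spend", "predicate": lambda v: v > 50},
])
```

A real system would presumably base the routing decision on cost estimates rather than on the operation type alone; the fixed rule here only keeps the example short.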
Abstract
A computing system and platform uses various types of data stores to allow efficient querying of, and accelerated access to, extremely large data sets. One such data store is a dimension store that combines key-value and columnar stores, access to which is provided by several selectable mechanisms chosen based on the nature of the data of interest. These include bitmap-based access, use of an optimized columnar data format, and access via namespace identifiers. A compressed, optimized page data format is provided for storing and analyzing large fact-based data. The complex dimension store is used to provide complex relationships and interpretation of the fact-based data, enabling high-performance advanced queries, with bitmap indexes passed between the two stores. Dimension data is stored in an encrypted manner throughout the system, and can be exchanged among parties in a secure manner.
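As an illustration of bitmap indexes being passed between a dimension store and a fact store, the sketch below evaluates a predicate on a dimension column, encodes the matching rows as a bitmap, and uses that bitmap to filter fact rows. It is a minimal sketch under stated assumptions: the Python integer used as a bitmap, the `build_bitmap` helper, and the sample data are all hypothetical and do not reflect the page format or index layout of the described system.

```python
def build_bitmap(values, predicate):
    """Return an int whose bit i is set when predicate(values[i]) holds."""
    bitmap = 0
    for i, value in enumerate(values):
        if predicate(value):
            bitmap |= 1 << i
    return bitmap


# Hypothetical dimension column: entry i describes customer i.
dim_region = ["EU", "US", "EU", "APAC"]

# Hypothetical fact rows: (customer index, amount).
facts = [(0, 10.0), (1, 5.0), (2, 7.5), (0, 2.5), (3, 4.0)]

# The dimension side resolves the predicate once and hands over only the bitmap.
eu_bitmap = build_bitmap(dim_region, lambda region: region == "EU")

# The fact side tests one bit per row instead of re-reading dimension data.
total = sum(amount for cust, amount in facts if (eu_bitmap >> cust) & 1)
```

Passing a compact bitmap rather than the matching dimension records keeps the data exchanged between the two stores small, which is one plausible reading of why the abstract highlights this mechanism.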
18 Citations
20 Claims
1. A computing system for accelerating large data transfer, the system including:
a first processor;
a dimension store coupled to the first processor and storing input data elements, the dimension store comprising a key-value store and a column store, wherein the key-value store stores the input data elements in a key-value data format, and wherein the column store stores the input data elements in a columnar data format;
a second processor;
a model data store coupled to the second processor and storing model data related to the input data elements;
a third processor coupled to the dimension store and the model data store, wherein the third processor is configured to, in response to receiving an analytics query;
determine that the key-value store can serve a first portion of the analytics query with better performance than the column store, and that the column store can serve a second portion of the analytics query with better performance than the key-value store;
in response to the determination, selectively retrieve, based on the first portion of the analytics query and second portion of the analytics query, a first portion of the input data elements from the key-value store and a second portion of the input data elements from the column store;
join the first portion of the input data elements and the second portion of the input data elements to generate a set of input data elements;
retrieve, based on the analytics query, portions of the model data; and
generate a set of analytics results based on the set of input data elements and the portions of the model data and store the set of analytics results in an analytics results store.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A method of moving data from first and second portions of a computing system to a third portion of a computing system, comprising:
receiving an analytics query and, based on the analytics query, generating a characteristic relating to the first and second portions of the computing system;
determining that the first portion of the computing system can serve a first portion of the analytics query with better performance than the second portion of the computing system, and that the second portion of the computing system can serve a second portion of the analytics query with better performance than the first portion of the computing system;
in response to the determining, retrieving a first subset of data from the first portion of the computing system and a second subset of data from the second portion of the computing system, wherein the first portion comprises a key-value data store, wherein the second portion comprises a column data store, wherein the key-value data store and the column data store each store input data elements, wherein the key-value data store stores the input data elements in a key-value data format, and wherein the column data store stores input data elements in a columnar data format;
transferring the first subset of data and the second subset of data from the first and second portions to the third portion based on the query and on the characteristic.
Dependent claims: 19, 20
Specification