Systems and methods for data conversion and comparison
First Claim
1. A database system comprising:
- at least one processor configured to execute a plurality of system components, wherein the system components comprise:
a translation component configured to:
translate input data in a first format into a canonical format;
analyze original data elements in the first format to determine a data type associated with respective data elements;
map each individual data element of the input data to a canonical data type associated with the determined data type;
encode each individual data element into a byte stream comprising at least:
a canonical type byte based on the mapping; and
at least one data value for data of the data element where present; and
a database manager configured to:
receive requests for database operations from client systems and respond to the requests; and
execute data comparison operations against the canonical format byte stream to respond to at least some of the requests for database operations.
Abstract
According to one embodiment, a translation component is configured to operate on document encoded data to translate the document encoded data into a canonical format comprising a plurality of canonical types that fold together into a byte stream. The translation component is configured to accept data in any storage format (e.g., column store, row store, LSM tree, etc.) and/or data from any storage engine (e.g., WIREDTIGER, MMAP, AR tree, Radix tree, etc.) and translate that data into a byte stream to enable efficient comparison. When executing searches and using the translated data to provide comparisons, there is necessarily a trade-off between the cost of translating the data and how much the translated data can be leveraged to increase comparison efficiency.
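As a rough illustration of how typed values might fold into a comparable byte stream, the Python sketch below encodes each element as a canonical type byte followed by its value. The type byte values, the `encode_element` and `encode_record` names, and the particular number and string encodings are assumptions made for this example; the source does not specify them.

```python
import math
import struct

# Hypothetical canonical type bytes (assumed values, not from the source).
# Their numeric order fixes the cross-type order: null < number < string.
TYPE_NULL = 0x0A
TYPE_NUMBER = 0x1E
TYPE_STRING = 0x3C


def encode_element(value):
    """Encode one data element as a canonical type byte plus its value, if any."""
    if value is None:
        return bytes([TYPE_NULL])  # no data value is present for null
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, (int, float)):
        # Ints and floats map to one canonical numeric type (lossy above 2**53).
        number = float(value) + 0.0  # adding 0.0 turns -0.0 into 0.0
        if math.isnan(number):
            raise ValueError("NaN is not covered by this sketch")
        bits = struct.unpack(">Q", struct.pack(">d", number))[0]
        # Flip bits so that unsigned byte order matches numeric order.
        if bits >> 63:
            bits ^= 0xFFFFFFFFFFFFFFFF  # negative: invert every bit
        else:
            bits |= 1 << 63             # non-negative: set the top bit
        return bytes([TYPE_NUMBER]) + struct.pack(">Q", bits)
    if isinstance(value, str):
        data = value.encode("utf-8")
        if b"\x00" in data:
            raise ValueError("embedded NUL bytes are not covered by this sketch")
        return bytes([TYPE_STRING]) + data + b"\x00"  # NUL terminates the string
    raise TypeError(f"unsupported type: {type(value).__name__}")


def encode_record(values):
    """Concatenate the encoded elements into one comparable byte stream."""
    return b"".join(encode_element(v) for v in values)
```

Under these assumptions, `encode_record([1, "b"]) < encode_record([1, "c"])` holds with a plain byte comparison and no decoding step. The translation cost is paid once per value, which is the trade-off the abstract describes.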
20 Claims
10. A computer implemented method for managing a distributed database, the method comprising:
translating, by at least one processor, input data in a first format into a canonical format;
analyzing, by the at least one processor, original data elements in the first format to determine a data type associated with respective data elements;
mapping, by the at least one processor, each individual data element of the input data to a canonical data type associated with the determined data type;
encoding, by the at least one processor, each individual data element into a byte stream comprising at least:
a canonical type byte based on the mapping, and
at least one data value for data of the data element where present; and
receiving, by the at least one processor, requests for database operations from client systems and responding to the requests; and
executing, by the at least one processor, data comparison operations against the canonical format byte stream to respond to at least some of the requests for database operations.
19. A computer-readable medium having instructions thereon for causing a processor to execute the instructions, the instructions adapted to be executed to implement a method for managing a distributed database, the method comprising:
translating input data in a first format into a canonical format;
analyzing original data elements in the first format to determine a data type associated with respective data elements;
mapping each individual data element of the input data to a canonical data type associated with the determined data type;
encoding each individual data element into a byte stream comprising at least:
a canonical type byte based on the mapping, and
at least one data value for data of the data element where present; and
receiving requests for database operations from client systems and responding to the requests; and
executing data comparison operations against the canonical format byte stream to respond to at least some of the requests for database operations.
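The receiving and executing steps can be pictured with a toy manager that answers range requests purely by comparing canonical byte strings. This is a minimal sketch and not the claimed implementation: `ToyManager`, `encode_int`, and the type byte value are hypothetical, and only 64-bit signed integer keys are handled.

```python
import bisect
import struct

TYPE_INT = 0x1E  # hypothetical canonical type byte for 64-bit signed integers


def encode_int(n):
    """Type byte plus a big-endian value, biased so byte order equals numeric order."""
    # Adding 2**63 shifts the signed range onto 0 .. 2**64 - 1.
    return bytes([TYPE_INT]) + struct.pack(">Q", n + (1 << 63))


class ToyManager:
    """Keeps encoded keys sorted and serves requests by byte comparison only."""

    def __init__(self):
        self._keys = []  # sorted canonical byte strings
        self._rows = []  # rows, kept parallel to self._keys

    def insert(self, key, row):
        encoded = encode_int(key)
        pos = bisect.bisect_right(self._keys, encoded)
        self._keys.insert(pos, encoded)
        self._rows.insert(pos, row)

    def find_range(self, low, high):
        """Return rows whose key k satisfies low <= k <= high."""
        start = bisect.bisect_left(self._keys, encode_int(low))
        stop = bisect.bisect_right(self._keys, encode_int(high))
        return self._rows[start:stop]
```

A request such as `find_range(-3, 10)` never decodes a stored key; both bounds are translated once and every later comparison is a byte-wise one.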
Specification