Method, apparatus, and computer-readable medium for efficiently performing operations on distinct data values
First Claim
1. A method for efficiently performing operations on distinct data values by one or more computing devices, the method comprising:
storing, by at least one of the one or more computing devices, a tokenized column of data in a table, the tokenized column of data created by mapping each unique data value in a domain of a database to an entity ID, and replacing each of a plurality of data values in a column of data corresponding to the domain with the corresponding entity ID to generate the column of tokenized data containing a plurality of entity IDs;
receiving, by at least one of the one or more computing devices, a query directed to the column of data, the query defining one or more group sets for grouping data retrieved in response to the query, wherein each group set in the one or more group sets corresponds to a unique group of one or more values associated with one or more other domains of the database; and
generating, by at least one of the one or more computing devices, an entity map vector for each group set in the one or more group sets by identifying any entity IDs in the tokenized column of data which are present in any rows of the table which include the unique group of one or more values corresponding to the group set, wherein the length of each entity map vector is equal to the total number of entity IDs in the domain and the value of each bit in each entity map vector indicates the presence or absence of a different entity ID in the corresponding group set.
Abstract
An apparatus, computer-readable medium, and computer-implemented method for efficiently performing operations on distinct data values, including storing a tokenized column of data in a table by mapping each unique data value in a corresponding domain to a unique entity ID, and replacing each of the data values in the column with the corresponding entity ID to generate a column of tokenized data containing one or more entity IDs, receiving a query directed to the column of data, the query defining one or more group sets for grouping the data retrieved in response to the query, and generating an entity map vector for each group set, the length of each entity map vector equal to the number of unique entity IDs for the domain, and the value of each bit in the entity map vector indicating the presence or absence of a different unique entity ID in the group set.
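The method summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`tokenize`, `entity_map_vectors`, `as_bits`), the sample columns, and the choice to assign entity IDs in first-seen order starting at 0 are all assumptions made for illustration. The sketch also treats the values present in the column as the whole domain, and stores each entity map vector as a Python integer used as a bitset.

```python
from collections import defaultdict


def tokenize(values):
    """Replace each value with an entity ID (assumed: 0-based, first-seen order).

    Returns the tokenized column and the value-to-ID mapping.
    """
    value_to_id = {}
    tokens = []
    for value in values:
        if value not in value_to_id:
            value_to_id[value] = len(value_to_id)
        tokens.append(value_to_id[value])
    return tokens, value_to_id


def entity_map_vectors(tokens, group_keys):
    """Build one bit vector per group set.

    Bit i of a group's vector is 1 if entity ID i occurs in any row whose
    grouping key equals that group, and 0 otherwise.
    """
    vectors = defaultdict(int)
    for token, key in zip(tokens, group_keys):
        vectors[key] |= 1 << token
    return dict(vectors)


def as_bits(vector, length):
    """Expand a bitset integer into a list of 0/1 flags, entity ID 0 first."""
    return [(vector >> i) & 1 for i in range(length)]


# Hypothetical table: a customer column (the tokenized domain) and a week
# column from another domain that defines the group sets.
customers = ["ann", "bob", "ann", "cy", "bob", "ann"]
weeks = ["w1", "w1", "w1", "w2", "w2", "w2"]

tokens, value_to_id = tokenize(customers)
vectors = entity_map_vectors(tokens, weeks)
num_entities = len(value_to_id)

for week, vector in sorted(vectors.items()):
    print(week, as_bits(vector, num_entities))
```

Every vector has one bit per entity ID in the domain, so vectors from different group sets line up position by position. One plausible use, offered here as an assumption rather than something stated in the abstract, is that counting the set bits of a vector (for example `bin(vectors["w1"]).count("1")`) gives the number of distinct values in that group without rescanning the rows. A grouping over several columns could use a tuple of values as the key in the same way.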
63 Claims
1. A method for efficiently performing operations on distinct data values by one or more computing devices, the method comprising:
storing, by at least one of the one or more computing devices, a tokenized column of data in a table, the tokenized column of data created by mapping each unique data value in a domain of a database to an entity ID, and replacing each of a plurality of data values in a column of data corresponding to the domain with the corresponding entity ID to generate the column of tokenized data containing a plurality of entity IDs;
receiving, by at least one of the one or more computing devices, a query directed to the column of data, the query defining one or more group sets for grouping data retrieved in response to the query, wherein each group set in the one or more group sets corresponds to a unique group of one or more values associated with one or more other domains of the database; and
generating, by at least one of the one or more computing devices, an entity map vector for each group set in the one or more group sets by identifying any entity IDs in the tokenized column of data which are present in any rows of the table which include the unique group of one or more values corresponding to the group set, wherein the length of each entity map vector is equal to the total number of entity IDs in the domain and the value of each bit in each entity map vector indicates the presence or absence of a different entity ID in the corresponding group set.
Dependent claims: 2-21.
22. An apparatus for efficiently performing operations on distinct data values, the apparatus comprising:
one or more processors; and
one or more memories operatively coupled to at least one of the one or more processors and having instructions stored thereon that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to:
store a tokenized column of data in a table, the tokenized column of data created by mapping each unique data value in a domain of a database to an entity ID, and replacing each of a plurality of data values in a column of data corresponding to the domain with the corresponding entity ID to generate the column of tokenized data containing a plurality of entity IDs;
receive a query directed to the column of data, the query defining one or more group sets for grouping data retrieved in response to the query, wherein each group set in the one or more group sets corresponds to a unique group of one or more values associated with one or more other domains of the database; and
generate an entity map vector for each group set in the one or more group sets by identifying any entity IDs in the tokenized column of data which are present in any rows of the table which include the unique group of one or more values corresponding to the group set, wherein the length of each entity map vector is equal to the total number of entity IDs in the domain and the value of each bit in each entity map vector indicates the presence or absence of a different entity ID in the corresponding group set.
Dependent claims: 23-42.
43. At least one non-transitory computer-readable medium storing computer-readable instructions that, when executed by one or more computing devices, cause at least one of the one or more computing devices to:
store a tokenized column of data in a table, the tokenized column of data created by mapping each unique data value in a domain of a database to an entity ID, and replacing each of a plurality of data values in a column of data corresponding to the domain with the corresponding entity ID to generate the column of tokenized data containing a plurality of entity IDs;
receive a query directed to the column of data, the query defining one or more group sets for grouping data retrieved in response to the query, wherein each group set in the one or more group sets corresponds to a unique group of one or more values associated with one or more other domains of the database; and
generate an entity map vector for each group set in the one or more group sets by identifying any entity IDs in the tokenized column of data which are present in any rows of the table which include the unique group of one or more values corresponding to the group set, wherein the length of each entity map vector is equal to the total number of entity IDs in the domain and the value of each bit in each entity map vector indicates the presence or absence of a different entity ID in the corresponding group set.
Dependent claims: 44-63.
Specification