Virtual split dictionary for search optimization
Abstract
An attribute vector including value identifiers and corresponding to a dictionary structure is identified. A dictionary type encoding structure is generated by virtually partitioning the dictionary structure. The dictionary type encoding structure may include multiple dictionary types. Based on the dictionary encoding structure, the attribute vector may be split to generate multiple attribute vector blocks that may be identified by block transition indices. Based on the dictionary types in the dictionary encoding structure, the value identifiers in the attribute vector blocks are rearranged. Such a rearrangement optimizes the attribute vector for searching the value identifiers.
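One way to read the abstract is sketched below in Python. This is an illustrative interpretation, not the patented implementation: the function names `encode` and `rearrange`, the `type_starts` parameter, and the choice to model each dictionary type as a contiguous range of value identifiers are all assumptions introduced for the example.

```python
from bisect import bisect_right


def encode(values):
    """Build a sorted dictionary and the attribute vector of value identifiers."""
    dictionary = sorted(set(values))
    value_id = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [value_id[v] for v in values]


def rearrange(attribute_vector, type_starts):
    """Group value identifiers by dictionary type.

    type_starts is a hypothetical parameter: the ascending first value
    identifier of each virtual dictionary partition, e.g. [0, 2] means
    type 0 covers identifiers 0-1 and type 1 covers identifiers 2 and up.
    Returns the rearranged vector, the original row position of every
    entry, and the index at which each block starts (assumed here to play
    the role of the block transition indices).
    """
    buckets = [[] for _ in type_starts]
    for row, vid in enumerate(attribute_vector):
        dict_type = bisect_right(type_starts, vid) - 1
        buckets[dict_type].append((row, vid))

    vector, rows, transitions = [], [], []
    for bucket in buckets:
        transitions.append(len(vector))  # start index of this block
        for row, vid in bucket:
            rows.append(row)
            vector.append(vid)
    return vector, rows, transitions


dictionary, attribute_vector = encode(
    ["pear", "apple", "fig", "apple", "kiwi", "fig"]
)
vector, rows, transitions = rearrange(attribute_vector, type_starts=[0, 2])
```

With the blocks laid out this way, a search for one value identifier only needs to scan the block between two consecutive transition indices rather than the whole attribute vector.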
17 Claims
1. A computer implemented method comprising:
generating a dictionary including a plurality of value identifiers mapped to a plurality of attribute values, respectively, wherein the plurality of attribute values are identified from data records in a data structure;
modifying the data records in the data structure by replacing an attribute value of each data record with a corresponding value identifier included in the dictionary, wherein the modification comprises mapping at least two data records that have a same attribute value to a same value identifier;
partitioning the modified data structure into a plurality of split data structures and rearranging data records among the split data structures based on the value identifiers such that each rearranged split data structure stores data records having a mutually exclusive subset of value identifiers;
in response to a query being received for an attribute, identifying a value identifier mapped to the attribute in the dictionary, identifying a rearranged split data structure storing the identified value identifier from among the plurality of rearranged split data structures, and executing the query on data records in the identified rearranged split data structure to generate search results; and
outputting information for display on a display device based on the search results.
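The steps recited in claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: records are modeled as plain dicts, the helper names (`build_dictionary`, `encode_records`, `split_by_value_id`, `query`) are hypothetical, and the mutually exclusive subsets are assumed to be equal-width ranges of value identifiers.

```python
def build_dictionary(records, attribute):
    """Map each distinct attribute value to a value identifier."""
    values = sorted({record[attribute] for record in records})
    return {value: vid for vid, value in enumerate(values)}


def encode_records(records, attribute, dictionary):
    """Replace the attribute value of each record with its value identifier."""
    return [{**r, attribute: dictionary[r[attribute]]} for r in records]


def split_by_value_id(encoded, attribute, n_splits, n_ids):
    """Place records into splits so each split owns a disjoint identifier range.

    Assumption: split k holds identifiers in [k * width, (k + 1) * width).
    """
    width = -(-n_ids // n_splits)  # ceiling division
    splits = [[] for _ in range(n_splits)]
    for record in encoded:
        splits[record[attribute] // width].append(record)
    return splits, width


def query(splits, width, dictionary, attribute, value):
    """Look up the identifier, pick the one split that can hold it, scan it."""
    vid = dictionary.get(value)
    if vid is None:
        return []  # value never occurs in the data
    return [r for r in splits[vid // width] if r[attribute] == vid]


records = [
    {"id": 1, "city": "Berlin"},
    {"id": 2, "city": "Paris"},
    {"id": 3, "city": "Berlin"},
    {"id": 4, "city": "Oslo"},
]
dictionary = build_dictionary(records, "city")
encoded = encode_records(records, "city", dictionary)
splits, width = split_by_value_id(encoded, "city", n_splits=2, n_ids=len(dictionary))
hits = query(splits, width, dictionary, "city", "Berlin")
print([r["id"] for r in hits])
```

Because every value identifier lives in exactly one split, the query touches only that split; the other splits are never scanned.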
7. A computer system comprising:
a processor; and
one or more memory devices communicatively coupled with the processor and the one or more memory devices storing instructions to:
generate a dictionary including a plurality of value identifiers mapped to a plurality of attribute values, respectively, wherein the plurality of attribute values are identified from data records in a data structure;
modify the data records in the data structure by replacing an attribute value of each data record with a corresponding value identifier included in the dictionary, wherein the modification comprises mapping at least two data records that have a same attribute value to a same value identifier;
partition the modified data structure into a plurality of split data structures and rearrange data records among the split data structures based on the value identifiers such that each rearranged split data structure stores data records having a mutually exclusive subset of value identifiers;
in response to a query being received for an attribute, identify a value identifier mapped to the attribute in the dictionary, identify a rearranged split data structure storing the identified value identifier from among the plurality of rearranged split data structures, and execute the query on data records in the identified rearranged split data structure to generate search results; and
output information for display on a display device based on the search results.
13. A non-transitory computer readable storage medium tangibly storing instructions, which when executed by a computer, cause the computer to execute operations comprising:
generate a dictionary including a plurality of value identifiers mapped to a plurality of attribute values, respectively, wherein the plurality of attribute values are identified from data records in a data structure;
modify the data records in the data structure by replacing an attribute value of each data record with a corresponding value identifier included in the dictionary, wherein the modification comprises mapping at least two data records that have a same attribute value to a same value identifier;
partitioning the modified data structure into a plurality of split data structures and rearranging data records among the split data structures based on the value identifiers such that each rearranged split data structure stores data records having a mutually exclusive subset of value identifiers;
in response to a query being received for an attribute, identifying a value identifier mapped to the attribute in the dictionary, identifying a rearranged split data structure storing the identified value identifier from among the plurality of rearranged split data structures, and executing the query on data records in the identified rearranged split data structure to generate search results; and
outputting information for display on a display device based on the search results.
Specification