Enabling Faster Full-Text Searching Using a Structured Data Store
Abstract
A traditional structured data store is leveraged to provide the benefits of an unstructured full-text search system. A fixed number of “extended” columns is added to the traditional structured data store to form an “enhanced structured data store” (ESDS). The extended columns are independent of any regular columnar interpretation of the data and enable the data that they store to be searched using standard full-text query syntax/techniques that can be executed faster (as opposed to SQL syntax). In other words, the added columns act as a search index. A token is stored in an appropriate extended column based on that token's hash value. The hash value is determined using a hashing scheme, which operates based on the value of the token, rather than the meaning of the token. This enables subsequent searches to be expressed as full-text queries without degrading the ensuing search to a brute force scan.
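The role of the extended columns as a search index can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the table name `entries`, the column names `ext_0` to `ext_3`, the use of CRC32 as the hashing scheme, lowercasing of tokens, and the choice to store colliding tokens space-delimited in one column. The point it shows is that a query for a token only needs to examine the one extended column selected by that token's hash.

```python
import sqlite3
import zlib

NUM_EXT = 4  # assumed fixed number of extended columns


def ext_column(token: str) -> str:
    """Pick a column from the token's value alone (CRC32 is an assumed stand-in)."""
    return f"ext_{zlib.crc32(token.lower().encode('utf-8')) % NUM_EXT}"


def make_store() -> sqlite3.Connection:
    """Create an in-memory table with one raw column plus the extended columns."""
    conn = sqlite3.connect(":memory:")
    ext_cols = ", ".join(f"ext_{i} TEXT DEFAULT ''" for i in range(NUM_EXT))
    conn.execute(
        f"CREATE TABLE entries (id INTEGER PRIMARY KEY, raw TEXT, {ext_cols})"
    )
    return conn


def insert(conn: sqlite3.Connection, raw: str) -> int:
    """Split raw on whitespace and file each token under its hashed column."""
    buckets = {f"ext_{i}": [] for i in range(NUM_EXT)}
    for token in raw.split():
        buckets[ext_column(token)].append(token.lower())
    cols = list(buckets)
    values = [" ".join(buckets[c]) for c in cols]
    placeholders = ", ".join("?" for _ in range(len(cols) + 1))
    cur = conn.execute(
        f"INSERT INTO entries (raw, {', '.join(cols)}) VALUES ({placeholders})",
        [raw, *values],
    )
    return cur.lastrowid


def search(conn: sqlite3.Connection, token: str) -> list:
    """Return ids of entries containing token, checking only one extended column."""
    col = ext_column(token)
    # Pad with spaces on both sides so only whole tokens match.
    rows = conn.execute(
        f"SELECT id FROM entries WHERE instr(' ' || {col} || ' ', ?) > 0 ORDER BY id",
        (f" {token.lower()} ",),
    )
    return [row[0] for row in rows]
```

For example, after `insert(conn, "login failed for user alice")`, calling `search(conn, "alice")` looks only at the column that `ext_column("alice")` names. The sketch shows column selection only; how a production store would index or scan that column is outside what this section describes.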
13 Claims
1. A computer-implemented method for storing information in an entry within a structured data store, wherein the entry includes one or more base fields and one or more extended fields, comprising:
receiving a string;
extracting information from the string;
storing the extracted information in the one or more base fields of the entry based on the meaning of the extracted information;
identifying a portion of the string that is to be enabled for faster searching;
parsing the identified portion of the string into a plurality of tokens; and
for each token in the plurality of tokens:
determining a hash value of the token based on a hashing scheme; and
storing the token in an extended field that corresponds to the determined hash value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
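The steps recited in claim 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the input format (`<timestamp> <host> <message>`), the base field names, the choice of the message as the portion enabled for faster searching, the `\w+` tokenizer, the CRC32-modulo hashing scheme, and the figure of eight extended fields are all hypothetical.

```python
import re
import zlib

NUM_EXTENDED = 8  # assumed number of extended fields per entry

# Hypothetical input format: "<timestamp> <host> <free-text message>"
LINE_RE = re.compile(r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<message>.*)$")


def hash_token(token: str) -> int:
    """Map a token to an extended-field index using only the token's value."""
    return zlib.crc32(token.encode("utf-8")) % NUM_EXTENDED


def store_entry(line: str) -> dict:
    """Build one entry with base fields and extended fields from a received string."""
    # Receive the string and extract information from it.
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("line does not match the assumed format")

    entry = {
        # Base fields are chosen by the meaning of the extracted information.
        "base": {"timestamp": match["timestamp"], "host": match["host"]},
        # Extended fields are chosen by hash value, independent of meaning.
        "extended": [[] for _ in range(NUM_EXTENDED)],
    }

    # Identify the portion to be enabled for faster searching (assumed: the message).
    portion = match["message"]

    # Parse the portion into tokens, then hash and store each one.
    for token in re.findall(r"\w+", portion.lower()):
        entry["extended"][hash_token(token)].append(token)
    return entry
```

Calling `store_entry("2009-01-05T10:00:00Z fw01 Denied TCP connection")` would place `timestamp` and `host` in the base fields and distribute `denied`, `tcp`, and `connection` across the extended fields according to `hash_token`.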
12. A computer program product for storing information in an entry within a structured data store, wherein the entry includes one or more base fields and one or more extended fields, and wherein the computer program product is stored on a computer-readable medium that includes instructions that, when loaded into memory, cause a processor to perform a method, the method comprising:
receiving a string;
extracting information from the string;
storing the extracted information in the one or more base fields of the entry based on the meaning of the extracted information;
identifying a portion of the string that is to be enabled for faster searching;
parsing the identified portion of the string into a plurality of tokens; and
for each token in the plurality of tokens:
determining a hash value of the token based on a hashing scheme; and
storing the token in an extended field that corresponds to the determined hash value.
13. A system for storing information in an entry within a structured data store, wherein the entry includes one or more base fields and one or more extended fields, the system comprising:
a computer-readable medium that includes instructions that, when loaded into memory, cause a processor to perform a method, the method comprising:
receiving a string;
extracting information from the string;
storing the extracted information in the one or more base fields of the entry based on the meaning of the extracted information;
identifying a portion of the string that is to be enabled for faster searching;
parsing the identified portion of the string into a plurality of tokens; and
for each token in the plurality of tokens:
determining a hash value of the token based on a hashing scheme; and
storing the token in an extended field that corresponds to the determined hash value; and
a processor for performing the method.
Specification