System and method for rapidly identifying the existence and location of an item in a file
First Claim
1. A system for rapidly locating an item known to exist in a file, comprising:
- a master file for storing a plurality of items;
a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a pointer to an item in the master file, for identifying the location of the item; and
a secondary hash key obtained by applying a secondary hash function;
a search term input device for obtaining a search term known to exist among the plurality of items;
a primary hash function application module coupled to the search term input device, for applying a primary hash function to obtain a primary hash key for the search term;
a secondary hash function application module coupled to the search term input device, for applying the secondary hash function to obtain a secondary hash key for the search term;
a hash key comparison module coupled to the hash table, for comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket, wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term; and
an output device coupled to the hash key comparison module, for, responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generating output identifying an item in the master file corresponding to the matching secondary hash key.
0 Assignments
0 Petitions
Accused Products
Abstract
A system and method for rapidly identifying the existence and location of an item in a file using an improved hash table architecture. A hash table is constructed having a plurality of hash buckets, each identified by a primary hash key. Each hash entry in each hash bucket contains a pointer to a record in a master file, as well as a secondary hash key independent of the primary hash key. A search for a particular item is performed by identifying the appropriate hash bucket by obtaining a primary hash key for the search term. Individual hash entries within the hash bucket are checked for matches by comparing the stored secondary keys with the secondary key for the search term. Potentially matching records can be identified or ruled out without necessitating repeated reads of the master file. The improved hash table system and method is employed in a contextual text searching application for determining the intersection of a text search with a hierarchical categorization scheme.
116 Citations
76 Claims
-
1. A system for rapidly locating an item known to exist in a file, comprising:
-
a master file for storing a plurality of items;
a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a pointer to an item in the master file, for identifying the location of the item; and
a secondary hash key obtained by applying a secondary hash function;
a search term input device for obtaining a search term known to exist among the plurality of items;
a primary hash function application module coupled to the search term input device, for applying a primary hash function to obtain a primary hash key for the search term;
a secondary hash function application module coupled to the search term input device, for applying the secondary hash function to obtain a secondary hash key for the search term;
a hash key comparison module coupled to the hash table, for comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket, wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term; and
an output device coupled to the hash key comparison module, for, responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generating output identifying an item in the master file corresponding to the matching secondary hash key. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A method of locating an item known to exist in a file, the file having a plurality of items, comprising:
-
a) applying a primary hash function to obtain a primary hash key for a search term known to exist among the plurality of items;
b) identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a value and a secondary hash key;
c) applying a secondary hash function to obtain a secondary hash key for the search term;
d) comparing the secondary hash key for the search term with the secondary hash key for at least one hash entry in the identified hash bucket; and
e) responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generating output identifying an item in the file corresponding to the matching secondary hash key. - View Dependent Claims (8)
-
-
9. A system for rapidly locating an item known to exist in a file, comprising:
-
master file storage means for storing a plurality of items; and
hash table means, coupled to the master file storage means, for storing a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
pointing means for pointing to an item in the master file, for identifying the location of the item; and
storing means for storing a secondary hash key obtained by applying a secondary hash function;
search term input means for obtaining a search term known to exist among the plurality of items;
primary hash function application means coupled to the search term input means, for applying a primary hash function to obtain a primary hash key for the search term;
secondary hash function application means coupled to the search term input means, for applying the secondary hash function to obtain a secondary hash key for the search term;
hash key comparison means coupled to the hash table means, for comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket, wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term; and
output means coupled to the hash key comparison means, for, responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generating output identifying an item in the master file corresponding to the matching secondary hash key. - View Dependent Claims (10, 11, 12, 13, 14)
-
-
15. A system for locating an item known to exist in a file, the file having a plurality of items, comprising:
-
primary hash function means, for applying a primary hash function to obtain a primary hash key for a search term known to exist among the plurality of items;
hash bucket identification means, coupled to the primary hash function means, for identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a value and a secondary hash key;
secondary hash function means, for applying a secondary hash function to obtain a secondary hash key for the search term;
comparing means, coupled to the secondary hash function means and to the hash bucket identification means, for comparing the secondary hash key for the search term with the secondary hash key for at least one hash entry in the identified hash bucket; and
output means, coupled to the comparing means, for, responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generating output identifying an item in the file corresponding to the matching secondary hash key.
-
-
16. A computer-implemented method for locating an item known to exist in a file, comprising:
-
a) accessing a master file for storing a plurality of items;
b) obtaining a search term known to exist among the plurality of items;
c) applying a primary hash function to obtain a primary hash key for the search term;
d) applying a secondary hash function to obtain a secondary hash key for the search term;
e) accessing a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a pointer to an item in the master file, for identifying the location of the item; and
a secondary hash key obtained by applying the secondary hash function;
f) comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket, wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term; and
g) responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generating output identifying an item in the master file corresponding to the matching secondary hash key. - View Dependent Claims (17, 18, 19, 20, 21)
-
-
22. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for locating an item known to exist in a file, comprising:
-
computer-readable program code devices configured to cause a computer to access a master file for storing a plurality of items;
computer-readable program code devices configured to cause a computer to obtain a search term known to exist among the plurality of items;
computer-readable program code devices configured to cause a computer to apply a primary hash function to obtain a primary hash key for the search term;
computer-readable program code devices configured to cause a computer to apply a secondary hash function to obtain a secondary hash key for the search term;
computer-readable program code devices configured to cause a computer to access a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a pointer to an item in the master file, for identifying the location of the item; and
a secondary hash key obtained by applying the secondary hash function;
computer-readable program code devices configured to cause a computer to compare the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket, wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term; and
computer-readable program code devices configured to cause a computer to, responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generate output identifying an item in the master file corresponding to the matching secondary hash key. - View Dependent Claims (23, 24, 25, 26, 27)
-
-
28. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for locating an item known to exist in a file, the file having a plurality of items, comprising:
-
computer-readable program code devices configured to cause a computer to apply a primary hash function to obtain a primary hash key for a search term known to exist among the plurality of items;
computer-readable program code devices configured to cause a computer to identify a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a value and a secondary hash key;
computer-readable program code devices configured to cause a computer to apply a secondary hash function to obtain a secondary hash key for the search term;
computer-readable program code devices configured to cause a computer to compare the secondary hash key for the search term with the secondary hash key for at least one hash entry in the identified hash bucket; and
computer-readable program code devices configured to cause a computer to, responsive to the comparison of secondary hash keys resulting in a single match in the identified hash bucket, generate output identifying an item in the file corresponding to the matching secondary hash key.
-
-
29. A system for rapidly locating a data item, comprising:
-
a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a data item; and
a secondary hash key obtained by applying a secondary hash function;
a search term input device for obtaining a search term;
a primary hash function application module coupled to the search term input device, for applying a primary hash function to obtain a primary hash key for the search term;
a secondary hash function application module coupled to the search term input device, for applying the secondary hash function to obtain a secondary hash key for the search term; and
a hash key comparison module coupled to the hash table, for comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket;
wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term. - View Dependent Claims (30)
-
-
31. A method of locating a data item, comprising:
-
a) applying a primary hash function to obtain a primary hash key for a search term;
b) identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a data item and a secondary hash key;
c) applying a secondary hash function to obtain a secondary hash key for the search term; and
d) comparing the secondary hash key for the search term with the secondary hash key for at least one hash entry in the identified hash bucket. - View Dependent Claims (32)
-
-
33. A system for locating a data item, comprising:
-
hash table means for storing a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
first storing means for storing a data item; and
second storing means for storing a secondary hash key obtained by applying a secondary hash function;
search term input means for obtaining a search term;
primary hash function application means coupled to the search term input means, for applying a primary hash function to obtain a primary hash key for the search term;
secondary hash function application means coupled to the search term input means, for applying the secondary hash function to obtain a secondary hash key for the search term; and
hash key comparison means coupled to the hash table means, for comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket;
wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term. - View Dependent Claims (34)
-
-
35. A computer-implemented method for rapidly identifying the existence of an item in a file, comprising:
-
a) accessing a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a data item; and
a secondary hash key obtained by applying a secondary hash function;
b) obtaining a search term;
c) applying a primary hash function to obtain a primary hash key for the search term;
d) applying the secondary hash function to obtain a secondary hash key for the search term; and
e) comparing the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket;
wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term. - View Dependent Claims (36)
-
-
37. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for locating a data item, comprising:
-
computer-readable program code devices configured to cause a computer to access a hash table, comprising a plurality of hash buckets, each bash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising;
a data item; and
a secondary hash key obtained by applying a secondary hash function;
computer-readable program code devices configured to cause a computer to obtain a search term;
computer-readable program code devices configured to cause a computer to apply a primary hash function to obtain a primary hash key for the search term;
computer-readable program code devices configured to cause a computer to apply the secondary hash function to obtain a secondary hash key for the search term; and
computer-readable program code devices configured to cause a computer to compare the secondary hash key for the search term with at least one secondary hash key for a hash entry in an identified hash bucket;
wherein the identified hash bucket is identified by a primary hash key matching the obtained primary hash key for the search term. - View Dependent Claims (38)
-
-
39. A method of adding an item to a hash table, comprising:
-
a) applying a primary hash function to the item to obtain a primary hash key;
b) identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key;
c) creating a hash entry in the identified bucket;
d) applying a secondary hash function to the item to obtain a secondary hash key; and
e) forming a hash entry record from the secondary hash key; and
f) writing the hash entry record in the identified bucket. - View Dependent Claims (40, 41, 42, 43, 44, 45)
-
-
46. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for adding an item to a hash table, comprising:
-
computer-readable program code devices configured to cause a computer to apply a primary hash function to the item to obtain a primary hash key;
computer-readable program code devices configured to cause a computer to identify a hash bucket having a primary hash key corresponding to the obtained primary hash key;
computer-readable program code devices configured to cause a computer to create a hash entry in the identified bucket;
computer-readable program code devices configured to cause a computer to apply a secondary hash function to the item to obtain a secondary hash key; and
computer-readable program code devices configured to cause a computer to form a hash entry record from the secondary hash key; and
computer-readable program code devices configured to cause a computer to write the hash entry record in the identified bucket. - View Dependent Claims (47, 48, 49, 50, 51, 52)
-
-
53. A system for rapidly identifying the existence of an item in a file, comprising:
-
a master file for storing a plurality of items; and
a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising a pointer to an item in the master file, for identifying the location of the item; and
for each hash entry, a secondary hash key obtained by applying a secondary hash function. - View Dependent Claims (54, 55, 56, 57)
-
-
58. A method of identifying the existence of an item in a file, comprising:
-
a) applying a primary hash function to obtain a primary hash key for a search term;
b) identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a value and being associated with a secondary hash key;
c) applying a secondary hash function to obtain a secondary hash key for the search term; and
d) comparing the secondary hash key for the search term with the secondary hash key for at least one hash entry in the identified hash bucket. - View Dependent Claims (59, 60, 61)
-
-
62. A system for rapidly identifying the existence of an item in a file, comprising:
-
master file storage means for storing a plurality of items; and
hash table means, coupled to the master file storage means, for storing a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising pointing means for pointing to an item in the master file, for identifying the location of the item; and
storing means for storing, for each hash bucket, a secondary hash key obtained by applying a secondary hash function. - View Dependent Claims (63, 64, 65)
-
-
66. A computer-implemented method for rapidly identifying the existence of an item in a file, comprising:
-
a) accessing a master file for storing a plurality of items; and
b) accessing a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising a pointer to an item in the master file, for identifying the location of the item and each hash entry being associated with a secondary hash key obtained by applying a secondary hash function. - View Dependent Claims (67, 68, 69)
-
-
70. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for rapidly identifying the existence of an item in a file, comprising:
-
computer-readable program code devices configured to cause a computer to access a master file for storing a plurality of items; and
computer-readable program code devices configured to cause a computer to access a hash table, comprising a plurality of hash buckets, each hash bucket identified by a primary hash key, each hash bucket comprising at least one hash entry, each hash entry comprising a pointer to an item in the master file, for identifying the location of the item and each hash entry being associated with a secondary hash key obtained by applying a secondary hash function. - View Dependent Claims (71, 72, 73)
-
-
74. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for identifying the existence of an item in a file, comprising:
-
computer-readable program code devices configured to cause a computer to apply a primary hash function to obtain a primary hash key for a search term;
computer-readable program code devices configured to cause a computer to identify a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a value and being associated with a secondary hash key;
computer-readable program code devices configured to cause a computer to apply a secondary hash function to obtain a secondary hash key for the search term; and
computer-readable program code devices configured to cause a computer to compare the secondary hash key for the search term with the secondary hash key for at least one hash entry in the identified hash bucket. - View Dependent Claims (75, 76)
-
Specification