Contextual searching by determining intersections of search results
Abstract
A system and method for rapidly identifying the existence and location of an item in a file using an improved hash table architecture. A hash table is constructed having a plurality of hash buckets, each identified by a primary hash key. Each hash entry in each hash bucket contains a pointer to a record in a master file, as well as a secondary hash key independent of the primary hash key. A search for a particular item begins by obtaining a primary hash key for the search term, which identifies the appropriate hash bucket. Individual hash entries within the hash bucket are then checked for matches by comparing the stored secondary keys with the secondary key for the search term. Potentially matching records can thus be identified or ruled out without necessitating repeated reads of the master file. The improved hash table system and method are employed in a contextual text searching application for determining the intersection of a text search with a hierarchical categorization scheme.
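The hash table described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the bucket count, the use of CRC-32 as the primary hash and 16 bits of Adler-32 as the secondary hash, and the `MasterFile`, `build_table`, and `find` names. The master file is modeled as an in-memory list with a read counter, so the effect of the secondary key comparison on the number of reads can be observed.

```python
import zlib

NUM_BUCKETS = 8  # assumed; kept small so that buckets hold several entries


def primary_hash(identifier: str) -> int:
    """Primary hash key: selects the bucket (CRC-32 is an assumed choice)."""
    return zlib.crc32(identifier.encode("utf-8")) % NUM_BUCKETS


def secondary_hash(identifier: str) -> int:
    """Secondary hash key: 16 bits of Adler-32, a different function from CRC-32."""
    return zlib.adler32(identifier.encode("utf-8")) & 0xFFFF


class MasterFile:
    """Hypothetical stand-in for the master file; counts how often it is read."""

    def __init__(self, records):
        self.records = list(records)
        self.reads = 0

    def read(self, location: int) -> str:
        self.reads += 1
        return self.records[location]


def build_table(master: MasterFile):
    """Each bucket holds (secondary hash key, pointer to record) entries."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for location, identifier in enumerate(master.records):
        entry = (secondary_hash(identifier), location)
        buckets[primary_hash(identifier)].append(entry)
    return buckets


def find(buckets, master: MasterFile, identifier: str):
    """Return the record location of identifier, or None if it is absent."""
    wanted = secondary_hash(identifier)
    for stored_key, location in buckets[primary_hash(identifier)]:
        if stored_key != wanted:
            continue  # ruled out without reading the master file
        if master.read(location) == identifier:  # confirm against the record
            return location
    return None


master = MasterFile(f"doc-{n}" for n in range(100))
table = build_table(master)
print(find(table, master, "doc-42"))   # 42
print(find(table, master, "missing"))  # None
```

With 100 records spread over 8 buckets, each bucket holds about a dozen entries, yet `master.reads` grows only when a stored secondary key equals the secondary key of the search term. The final comparison against the retrieved record is still needed, because two different identifiers can share both hash keys.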
13 Claims
1. A method of contextual searching, comprising:
a) searching a set of documents according to a first criterion to obtain a first set of results;
b) defining a subset of the set of documents according to a parameter to obtain a second set of results; and
c) defining a third set of results comprising an intersection of the first set of results and the second set of results, by, for each document in the first set of results, identifying whether the document exists in the second set of results;
wherein identifying whether the document exists comprises:
c.1.1) applying a primary hash function to an identifier for the document to obtain a primary hash key for the identifier;
c.1.2) identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a secondary hash key and a pointer to a record location in the second set of results;
c.1.3) applying a secondary hash function to the identifier to obtain a secondary hash key for the identifier;
c.1.4) comparing the secondary hash key for the identifier with the secondary hash key for at least one hash entry in the identified hash bucket;
c.1.5) responsive to c.1.4) indicating at least one match, retrieving a record in the second set of results having a location corresponding to the value in the matching hash entry; and
c.1.6) comparing the identifier with the retrieved record.
10. The method of claim 1, wherein:
a) comprises performing a text search on the set of documents; and
b) comprises defining the subset of the set of documents according to a specified category.
11. The method of claim 1, wherein c) comprises, for each document in the first set of results:
c.1) identifying whether the document exists in the second set of results.
2. A system of contextual searching, comprising:
a search engine for searching a set of documents according to a first criterion, to obtain a first set of results;
a category lookup engine for defining a subset of the set of documents according to a parameter, to obtain a second set of results; and
an intersection engine for defining a third set of results comprising an intersection of the first set of results and the second set of results, wherein the intersection engine comprises a document identifier for, for each document in the first set of results, identifying whether the document exists in the second set of results.
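The three components recited in claim 2 can be sketched as three small Python functions. This is an illustration under stated assumptions, not the patented implementation: the function names, the sample documents, and the sample categories are hypothetical; the first criterion is taken to be a text string and the parameter a category; and Python's built-in `set` is swapped in for the membership test, since claim 2 does not recite how the document identifier decides whether a document exists in the second set.

```python
from __future__ import annotations


def search_engine(documents: dict[str, str], text: str) -> list[str]:
    """First set of results: ids of documents whose body contains the text."""
    needle = text.lower()
    return [doc_id for doc_id, body in documents.items() if needle in body.lower()]


def category_lookup_engine(categories: dict[str, str], category: str) -> set[str]:
    """Second set of results: ids of documents filed under the category."""
    return {doc_id for doc_id, cat in categories.items() if cat == category}


def intersection_engine(first: list[str], second: set[str]) -> list[str]:
    """Third set of results: each first-set document that exists in the second set."""
    return [doc_id for doc_id in first if doc_id in second]


# Hypothetical sample data: document bodies and one category per document.
documents = {
    "d1": "Jaguar top speed and engine specs",
    "d2": "Jaguar habitat in the rainforest",
    "d3": "Tiger habitat and diet",
}
categories = {"d1": "Automobiles", "d2": "Animals", "d3": "Animals"}

first = search_engine(documents, "jaguar")               # ['d1', 'd2']
second = category_lookup_engine(categories, "Animals")   # {'d2', 'd3'}
print(intersection_engine(first, second))                # ['d2']
```

The intersection loop visits each document in the first set exactly once, so its cost is governed by how cheaply membership in the second set can be tested.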
3. A system of contextual searching, comprising:
a search engine for searching a set of documents according to a first criterion, to obtain a first set of results;
a category lookup engine for defining a subset of the set of documents according to a parameter, to obtain a second set of results; and
an intersection engine for defining a third set of results comprising an intersection of the first set of results and the second set of results, the intersection engine comprising a document identifier for, for each document in the first set of results, identifying whether the document exists in the second set of results, the document identifier comprising:
a primary hash function application module, for applying a primary hash function to an identifier for the document to obtain a primary hash key for the identifier;
a hash bucket identification module, coupled to the primary hash function application module, for identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a secondary hash key and a pointer to a record location in the second set of results;
a secondary hash function application module, for applying a secondary hash function to the identifier to obtain a secondary hash key for the identifier;
a comparator, coupled to the secondary hash function application module and to the hash bucket identification module, for comparing the secondary hash key for the identifier with the secondary hash key for at least one hash entry in the identified hash bucket;
a retrieval module, coupled to the comparator, for, responsive to the comparator indicating at least one match, retrieving a record in the second set of results having a location corresponding to the value in the matching hash entry; and
a second comparator, coupled to the retrieval module, for comparing the identifier with the retrieved record.
9. The system of claim 3, wherein:
the first criterion is a text string; and
the parameter is a subject category.
4. A system of contextual searching, comprising:
search means, for searching a set of documents according to a first criterion to obtain a first set of results;
subset definition means, coupled to the search means, for defining a subset of the set of documents according to a parameter to obtain a second set of results; and
result definition means, coupled to the search means and to the subset definition means, for defining a third set of results comprising an intersection of the first set of results and the second set of results.
5. The system of claim 4, wherein:
the search means comprises means for performing a text search on the set of documents; and
the subset definition means comprises means for defining the subset of the set of documents according to a specified category.
6. The system of claim 4, wherein the result definition means comprises:
identifying means for, for each document in the first set of results, identifying whether the document exists in the second set of results.
7. A system of contextual searching, comprising:
search means, for searching a set of documents according to a first criterion to obtain a first set of results;
subset definition means, coupled to the search means, for defining a subset of the set of documents according to a parameter to obtain a second set of results; and
result definition means, coupled to the search means and to the subset definition means, for defining a third set of results comprising an intersection of the first set of results and the second set of results, the result definition means comprising identifying means for, for each document in the first set of results, identifying whether the document exists in the second set of results, the identifying means comprising:
primary hash function application means, for applying a primary hash function to an identifier for the document to obtain a primary hash key for the identifier;
hash bucket identification means, coupled to the primary hash function application means, for identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a secondary hash key and a pointer to a record location in the second set of results;
secondary hash function application means, for applying a secondary hash function to the identifier to obtain a secondary hash key for the identifier;
comparison means, coupled to the secondary hash function application means and to the hash bucket identification means, for comparing the secondary hash key for the identifier with the secondary hash key for at least one hash entry in the identified hash bucket;
retrieval means, coupled to the comparison means, for, responsive to the comparison means indicating at least one match, retrieving a record in the second set of results having a location corresponding to the value in the matching hash entry; and
second comparison means, coupled to the retrieval means, for comparing the identifier with the retrieved record.
8. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for contextual searching, comprising:
computer-readable program code configured to cause a computer to search a set of documents according to a first criterion to obtain a first set of results;
computer-readable program code configured to cause a computer to define a subset of the set of documents according to a parameter to obtain a second set of results; and
computer-readable program code configured to cause a computer to define a third set of results comprising an intersection of the first set of results and the second set of results, the computer-readable program code configured to cause a computer to define a third set of results comprising computer-readable program code configured to cause a computer to, for each document in the first set of results, identify whether the document exists in the second set of results, the computer-readable program code configured to cause a computer to identify whether the document exists in the second set of results comprising:
computer-readable program code configured to cause a computer to apply a primary hash function to an identifier for the document to obtain a primary hash key for the identifier;
computer-readable program code configured to cause a computer to identify a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a secondary hash key and a pointer to a record location in the second set of results;
computer-readable program code configured to cause a computer to apply a secondary hash function to the identifier to obtain a secondary hash key for the identifier;
computer-readable program code configured to cause a computer to compare the secondary hash key for the identifier with the secondary hash key for at least one hash entry in the identified hash bucket;
computer-readable program code configured to cause a computer, responsive to the computer-readable program code configured to cause a computer to compare the secondary hash key for the identifier with the secondary hash key for at least one hash entry in the identified hash bucket indicating at least one match, to retrieve a record in the second set of results having a location corresponding to the value in the matching hash entry; and
computer-readable program code configured to cause a computer to compare the identifier with the retrieved record.
12. The computer program product of claim 8, wherein:
the computer-readable program code configured to cause a computer to search a set of documents comprises computer-readable program code configured to cause a computer to perform a text search on the set of documents; and
the computer-readable program code configured to cause a computer to define a subset of the set of documents comprises computer-readable program code configured to cause a computer to define the subset of the set of documents according to a specified category.
13. The computer program product of claim 8, wherein the computer-readable program code configured to cause a computer to define a third set of results comprises computer-readable program code configured to cause a computer to, for each document in the first set of results:
identify whether the document exists in the second set of results.
Specification