Word indexing in a capture system
Abstract
Searching of objects captured by a capture system can be improved by eliminating irrelevant objects from a query. In one embodiment, the present invention includes receiving a query for objects captured by a capture system, the query including at least one search term. The search term is then hashed to a term bit position using a hash function. An object can then be eliminated if the bit at the term bit position is not set in the word index associated with that object.
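The elimination step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the 1024-bit index width, the use of SHA-256 as the hash function, the integer representation of the word index, and the names `term_bit_position` and `candidates_for` are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List

INDEX_BITS = 1024  # assumed word index width; the source does not state one


def term_bit_position(term: str, num_bits: int = INDEX_BITS) -> int:
    """Hash a search term to a bit position (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bits


def candidates_for(objects: Dict[str, int], term: str) -> List[str]:
    """Return ids of objects that survive elimination for `term`.

    `objects` maps a hypothetical object id to its word index, held here as
    a Python int used as a bit vector. Objects whose bit at the term bit
    position is clear are eliminated from the query.
    """
    mask = 1 << term_bit_position(term)
    return [obj_id for obj_id, word_index in objects.items() if word_index & mask]
```

Because distinct terms can hash to the same position, a set bit only means the object may contain the term. A clear bit is what allows the object to be dropped without inspecting its contents.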
36 Claims
1. A method comprising:
receiving a query for objects captured by a capture system, the query including at least one search term;
hashing the at least one search term to a term bit position using a hash function; and
eliminating at least one object from the query using a word index associated with the object, the word index comprising a plurality of bits, wherein a bit in the term bit position is not set.
Dependent claims: 2, 3, 4, 5
6. A method performed by a capture system, the method comprising:
capturing an object being transmitted over a network;
extracting text contained in the captured object;
generating a plurality of tokens from the extracted text;
hashing the plurality of tokens to a plurality of bit position values; and
creating a word index to be associated with the captured object by setting bits of a bit vector corresponding to the plurality of bit position values.
Dependent claims: 7, 8, 9, 10, 11, 12
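The indexing steps recited in claim 6 (tokenize, hash, set bits) might look like the sketch below. It is illustrative only and rests on several assumptions not found in the source: a lowercase alphanumeric tokenizer, SHA-256 as the hash function, a 1024-bit vector stored in a `bytearray`, and the hypothetical function names shown.

```python
import hashlib
import re

INDEX_BITS = 1024  # assumed bit vector width; not specified in the source


def tokenize(text):
    """Split extracted text into lowercase alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bit_position(token, num_bits=INDEX_BITS):
    """Hash one token to a bit position value (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bits


def build_word_index(text, num_bits=INDEX_BITS):
    """Create a word index by setting one bit per token of the extracted text."""
    bits = bytearray(num_bits // 8)
    for token in tokenize(text):
        pos = bit_position(token, num_bits)
        bits[pos // 8] |= 1 << (pos % 8)
    return bits


def bit_is_set(bits, pos):
    """Report whether the bit at position `pos` is set in the bit vector."""
    return bool(bits[pos // 8] & (1 << (pos % 8)))
```

The index has a fixed size regardless of how much text the object contains, so a query-time check costs one hash and one bit test per search term. Any normalization applied here, such as lowercasing, would have to be applied to search terms as well.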
13. A capture system comprising:
a user interface configured to:
receive a query for objects captured by the capture system, the query including at least one search term;
hash the at least one search term to a term bit position using a hash function; and
eliminate at least one object from the query using a word index associated with the object, the word index comprising a plurality of bits, wherein a bit in the term bit position is not set.
Dependent claims: 14, 15, 16, 17
18. A capture system comprising:
one or more capture modules to capture an object being transmitted over a network;
a text extractor to extract text contained in the captured object;
a tokenizer to generate a plurality of tokens from the extracted text; and
an index generator to create a word index by hashing the plurality of tokens to a plurality of bit position values, and setting bits of a bit vector corresponding to the plurality of bit position values.
Dependent claims: 19, 20, 21, 22, 23, 24
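One way the components named in claim 18 could be wired together is sketched below. Every class and field name here is hypothetical, and the concrete choices are assumptions for illustration: UTF-8 decoding stands in for text extraction, whitespace splitting for the tokenizer, CRC-32 for the hash function, and 1024 bits for the index width.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CapturedObject:
    payload: bytes       # raw bytes handed over by a capture module
    word_index: int = 0  # bit vector held as a Python int


@dataclass
class IndexGenerator:
    num_bits: int = 1024  # assumed width; not specified in the source

    def generate(self, tokens: List[str]) -> int:
        index = 0
        for token in tokens:
            # CRC-32 is only a stand-in for the unspecified hash function.
            index |= 1 << (zlib.crc32(token.encode("utf-8")) % self.num_bits)
        return index


@dataclass
class CaptureSystem:
    extract_text: Callable[[bytes], str]   # plays the text extractor role
    tokenize: Callable[[str], List[str]]   # plays the tokenizer role
    index_generator: IndexGenerator

    def on_capture(self, payload: bytes) -> CapturedObject:
        """Run extraction, tokenization, and indexing for one captured object."""
        tokens = self.tokenize(self.extract_text(payload))
        return CapturedObject(payload, self.index_generator.generate(tokens))


# Usage with deliberately simple stand-in components.
system = CaptureSystem(
    extract_text=lambda raw: raw.decode("utf-8", errors="ignore"),
    tokenize=str.split,
    index_generator=IndexGenerator(),
)
obj = system.on_capture(b"wire transfer approved")
```

Keeping the extractor and tokenizer as replaceable callables reflects that the claim names them as separate components; a real system would presumably substitute format-aware implementations.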
25. A machine-readable medium having stored thereon data representing instructions that, when executed by a processor, cause the processor to perform operations comprising:
receiving a query for objects captured by a capture system, the query including at least one search term;
hashing the at least one search term to a term bit position using a hash function; and
eliminating at least one object from the query using a word index associated with the object, the word index comprising a plurality of bits, wherein a bit in the term bit position is not set.
Dependent claims: 26, 27, 28, 29
30. A machine-readable medium having stored thereon data representing instructions that, when executed by a processor of a capture system, cause the processor to perform operations comprising:
extracting text contained in the captured object;
generating a plurality of tokens from the extracted text;
hashing the plurality of tokens to a plurality of bit position values; and
creating a word index to be associated with the captured object by setting bits of a bit vector corresponding to the plurality of bit position values.
Dependent claims: 31, 32, 33, 34, 35, 36
Specification