Systems and methods for document searching
First Claim
1. A system for creating an index for a document collection, the system comprising:
at least one processor; and
a computer readable storage medium storing instructions that, when executed by the processor, cause the processor to:
compare a keyword in one or more documents to a plurality of noisy keywords associated with a document collection;
determine whether the keyword is a noisy keyword;
determine, for one of the documents, a keyword position and a number of noisy keywords preceding the keyword;
create, for the one of the documents, a token including a document identifier, an indication whether the keyword is a noisy keyword, the keyword position, and the number of noisy keywords preceding the keyword; and
store the token in an index.
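The indexing steps recited in claim 1 can be sketched as follows. This is a minimal illustration and not the patented implementation: the `Token` type, the `build_index` function, whitespace tokenization, lowercasing, and the 0-based position convention are all assumptions introduced here for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    """Hypothetical token layout mirroring the fields listed in claim 1."""
    doc_id: str        # document identifier
    is_noisy: bool     # indication whether the keyword is a noisy keyword
    position: int      # keyword position (0-based, an assumption)
    noisy_before: int  # number of noisy keywords preceding the keyword


def build_index(documents, noisy_keywords):
    """Build a keyword -> list of Token mapping for a document collection.

    documents: dict mapping document identifier -> text.
    noisy_keywords: iterable of noisy keywords for the collection.
    """
    noisy = {w.lower() for w in noisy_keywords}
    index = defaultdict(list)
    for doc_id, text in documents.items():
        noisy_seen = 0  # noisy keywords encountered so far in this document
        # Whitespace splitting is a simplification; real tokenization differs.
        for position, keyword in enumerate(text.lower().split()):
            is_noisy = keyword in noisy  # compare against the noisy list
            index[keyword].append(Token(doc_id, is_noisy, position, noisy_seen))
            if is_noisy:
                noisy_seen += 1
    return dict(index)
```

For the single document `"the cat sat on the mat"` with noisy keywords `the` and `on`, the keyword `mat` sits at position 5 with three noisy keywords before it, so its token records both values.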
Abstract
Systems and methods are provided for document searching. In one implementation, a computer-implemented method provides keyword searching. The method may receive a plurality of noisy keywords for a document collection. A server may generate tokens for a plurality of keywords in the document collection and merge the tokens to create an index. A search query may be received. The search query may include at least one search phrase. For the at least one search phrase, an indication may be received from a user specifying to perform one of a noisy phrase search or a noiseless phrase search. The method may search the index for the at least one search phrase based on the indication received from the user.
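One plausible way the two search modes in the abstract could use such an index is sketched below. The abstract does not say how the search is carried out, so everything here is an assumption: the `phrase_search` function, the tuple layout `(doc_id, is_noisy, position, noisy_before)` for index entries, and the idea that a noiseless search compares positions after subtracting the count of preceding noisy keywords.

```python
def phrase_search(index, phrase, noisy_keywords, noiseless):
    """Return the set of document ids containing the phrase.

    index: dict mapping keyword -> list of
           (doc_id, is_noisy, position, noisy_before) tuples (assumed layout).
    noiseless: the user's indication; True ignores noisy keywords when
               matching, False requires exact adjacency including them.
    """
    noisy = {w.lower() for w in noisy_keywords}
    words = phrase.lower().split()
    if noiseless:
        # Drop noisy keywords from the query itself.
        words = [w for w in words if w not in noisy]
    if not words:
        return set()

    def slot(entry):
        _doc_id, _is_noisy, position, noisy_before = entry
        # In noiseless mode, renumber positions as if noisy keywords
        # had been deleted from the document.
        return position - noisy_before if noiseless else position

    # Candidate (document, starting slot) pairs from the first query word.
    candidates = {(e[0], slot(e)) for e in index.get(words[0], [])}
    for offset, word in enumerate(words[1:], start=1):
        # Keep only starts where this word appears exactly `offset` slots later.
        starts = {(e[0], slot(e) - offset) for e in index.get(word, [])}
        candidates &= starts
    return {doc_id for doc_id, _ in candidates}
```

Under these assumptions, with a document `"the cat sat on the mat"` and noisy keywords `the` and `on`, the query `sat mat` matches in noiseless mode but not in noisy mode, while `sat on the mat` matches in both.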
39 Citations
10 Claims
Specification