Systems and methods for document searching
Abstract
Systems and methods are provided for document searching. In one implementation, a computer-implemented method provides keyword searching. The method may receive a plurality of noisy keywords for a document collection. A server may generate tokens for a plurality of keywords in the document collection and merge the tokens to create an index. A search query may be received. The search query may include at least one search phrase. For the at least one search phrase, an indication may be received from a user specifying to perform one of a noisy phrase search or a noiseless phrase search. The method may search the index for the at least one search phrase based on the indication received from the user.
18 Claims
1. A computer-implemented method for keyword searching, the method comprising:
receiving a plurality of noisy keywords for a document collection;
generating, by a server, tokens for a plurality of keywords in the document collection;
merging the tokens to create an index;
receiving a search query, wherein the search query includes at least one search phrase;
receiving, for the at least one search phrase, an indication from a user specifying to perform one of a noisy phrase search or a noiseless phrase search; and
searching the index for the at least one search phrase based on the indication received from the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8

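The steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not define how a noisy phrase search differs from a noiseless one, so the sketch assumes one plausible reading: a noisy search matches the phrase word for word, noisy keywords included, while a noiseless search drops noisy keywords from both the phrase and the documents before comparing positions. The function names `build_index` and `search_phrase`, the tuple layout, and the whitespace tokenization are all hypothetical choices made for illustration, not the patented implementation.

```python
from collections import defaultdict


def build_index(documents, noisy_keywords):
    """Map each keyword to (doc_id, is_noisy, position, noisy_before) tuples."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        noisy_before = 0  # noisy keywords seen so far in this document
        for position, word in enumerate(text.lower().split()):
            is_noisy = word in noisy_keywords
            index[word].append((doc_id, is_noisy, position, noisy_before))
            if is_noisy:
                noisy_before += 1
    return index


def search_phrase(index, phrase, noisy_keywords, noisy_search):
    """Return the ids of documents containing the phrase.

    noisy_search=True  : assumed to mean an exact match, noisy keywords included.
    noisy_search=False : assumed to mean noisy keywords are ignored on both sides.
    """
    words = phrase.lower().split()
    if not noisy_search:
        words = [w for w in words if w not in noisy_keywords]
    if not words:
        return set()

    def slots(word):
        # Noisy search uses raw positions. Noiseless search subtracts the
        # preceding noisy keywords, giving the position the keyword would
        # have if all noisy keywords were removed from the document.
        return {
            (doc_id, pos if noisy_search else pos - before)
            for doc_id, _, pos, before in index.get(word, [])
        }

    candidates = slots(words[0])
    for offset, word in enumerate(words[1:], start=1):
        following = slots(word)
        candidates = {(d, s) for d, s in candidates if (d, s + offset) in following}
    return {doc_id for doc_id, _ in candidates}


noisy = {"of", "the"}
docs = {"d1": "history of the internet", "d2": "internet history"}
index = build_index(docs, noisy)

noiseless_hits = search_phrase(index, "history internet", noisy, noisy_search=False)
noisy_hits = search_phrase(index, "history internet", noisy, noisy_search=True)
```

Under these assumptions, the noiseless search finds `d1` because "history" and "internet" become adjacent once "of" and "the" are discounted, while the noisy search finds nothing because the two words are not literally adjacent in that order in either document.
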
9. A system for creating an index for a document collection, the system comprising:
a database storing a plurality of noisy keywords for a document collection; and
a computer-readable storage medium storing instructions for:
comparing a keyword in one or more documents to the plurality of noisy keywords;
determining whether the keyword is a noisy keyword;
determining, for one of the documents, a keyword position and a number of noisy keywords preceding the keyword;
creating, for the one of the documents, a token including a document identifier, an indication whether the keyword is a noisy keyword, the keyword position, and the number of noisy keywords preceding the keyword; and
storing the token in an index.
Dependent claims: 10, 11, 12, 13, 14, 15, 16

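The token structure recited in claim 9 can be sketched as a small record type. The following is an illustrative example only: the class name `Token`, its field names, the zero-based positions, and the in-memory dictionary standing in for the index are assumptions, and a Python set stands in for the database of noisy keywords.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    doc_id: str        # document identifier
    is_noisy: bool     # indication whether the keyword is a noisy keyword
    position: int      # zero-based keyword position in the document (assumed)
    noisy_before: int  # number of noisy keywords preceding the keyword


def tokenize_document(doc_id, text, noisy_keywords):
    """Yield (keyword, Token) pairs for one document."""
    noisy_before = 0
    for position, keyword in enumerate(text.lower().split()):
        # Compare the keyword to the stored noisy keywords.
        is_noisy = keyword in noisy_keywords
        yield keyword, Token(doc_id, is_noisy, position, noisy_before)
        if is_noisy:
            noisy_before += 1


def build_index(documents, noisy_keywords):
    """Store every token in a keyword -> [Token, ...] index."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for keyword, token in tokenize_document(doc_id, text, noisy_keywords):
            index[keyword].append(token)
    return index


noisy = {"the", "of", "a"}
index = build_index({"d1": "the history of the internet"}, noisy)
```

Here "internet" sits at position 4 with three noisy keywords before it, so `index["internet"]` holds `Token("d1", False, 4, 3)`. Keeping the preceding-noisy count in the token lets a search recover the keyword's position with noisy keywords discounted (4 - 3 = 1) without rescanning the document.
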
17. A computer-implemented method for tokenizing keywords in a keyword search, the method comprising:
receiving, from a database, a plurality of noisy keywords for a document collection;
determining whether a keyword in a search query is a noisy keyword by comparing the keyword to the plurality of noisy keywords;
tagging the keyword with a noisy keyword tag when the keyword is found in the noisy keyword list;
determining a keyword position and a number of noisy keywords preceding the keyword; and
outputting a token, wherein the token comprises a document identifier, the noisy keyword tag, the keyword position, and the number of noisy keywords preceding the keyword.
Dependent claims: 18

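A minimal sketch of the query-side tokenization in claim 17 follows, using Python's standard `sqlite3` module with an in-memory database as a stand-in for the noisy keyword store. The table name `noisy_keywords`, the function names, the `"NOISY"` tag value, and the dictionary layout of the token are hypothetical. The claim does not say how the document identifier is chosen for a query keyword, so the sketch simply accepts it as a caller-supplied parameter.

```python
import sqlite3


def load_noisy_keywords(conn):
    """Receive the noisy keywords from the database as a set."""
    rows = conn.execute("SELECT keyword FROM noisy_keywords").fetchall()
    return {row[0] for row in rows}


def tokenize_query(query, noisy_keywords, doc_id=None):
    """Output one token per keyword in the search query."""
    tokens = []
    noisy_before = 0
    for position, keyword in enumerate(query.lower().split()):
        # Tag the keyword only when it appears in the noisy keyword list.
        tag = "NOISY" if keyword in noisy_keywords else None
        tokens.append({
            "keyword": keyword,
            "doc_id": doc_id,              # caller-supplied (assumption)
            "noisy_tag": tag,
            "position": position,          # zero-based (assumption)
            "noisy_before": noisy_before,
        })
        if tag is not None:
            noisy_before += 1
    return tokens


# Hypothetical schema and contents, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE noisy_keywords (keyword TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO noisy_keywords VALUES (?)", [("the",), ("of",)])

noisy = load_noisy_keywords(conn)
tokens = tokenize_query("history of the internet", noisy, doc_id="d1")
```

With this input, the token for "of" carries the noisy tag, and the token for "internet" records position 3 with two noisy keywords preceding it.
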
Specification