Method and system for retrieving relevant documents from a database
Abstract
A method for processing a search query uses the results of a search performed on a high-quality, controlled database to assess the relevance of documents retrieved from a search of an uncontrolled public database having documents of highly variable quality. The method includes the steps of parsing the search query and then searching the authoritative database to generate authoritative database results. The search query is also used to search the public database, thereby generating public database results. The quality or relevance of the public database results is then quantified on the basis of the authoritative database results, thereby generating a quality index. The results from both the authoritative and the public databases are then ranked on the basis of this quality index.
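The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the quality index is computed, so the sketch assumes, purely for illustration, that the index of a result is its highest cosine similarity to any authoritative result. The function names (`term_vector`, `cosine`, `search`, `quality_index`, `process_query`) and the keyword-overlap search are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Parse text into a bag of lowercase words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, database):
    """Assumed search: keep documents sharing at least one query word."""
    query_words = term_vector(query).keys()
    return [doc for doc in database if query_words & term_vector(doc).keys()]


def quality_index(doc, authoritative_results):
    """Assumed index: best similarity to any authoritative result."""
    vec = term_vector(doc)
    return max(
        (cosine(vec, term_vector(ref)) for ref in authoritative_results),
        default=0.0,
    )


def process_query(query, authoritative_db, public_db):
    """Search both databases, then rank all results by quality index."""
    auth_results = search(query, authoritative_db)
    public_results = search(query, public_db)
    scored = [
        (quality_index(doc, auth_results), source, doc)
        for source, docs in (("authoritative", auth_results),
                             ("public", public_results))
        for doc in docs
    ]
    # An authoritative result matches itself, so its index is about 1.0.
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

Under this assumption a public document that echoes the authoritative results ranks above one that merely mentions a query word, which is the behavior the abstract describes in general terms.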
20 Claims
1. A method for ranking a plurality of documents on the basis of the similarity of each of the plurality of documents to a user-query, said method comprising the steps of
parsing the user-query, thereby generating a query-word and a distribution of the query-word in the user-query,
assessing an importance of the query-word on the basis of the frequency with which the query-word occurs in an authoritative database having at-least-one-authoritative-document, the at-least-one-authoritative-document having at-least-one-authoritative-document-sentence, and the distribution of the query-word in the user-query,
evaluating a similarity of the at-least-one-authoritative-document to the user-query on the basis of a distribution of the query-word in the at-least-one-authoritative-document,
evaluating a similarity of a public document from a public database to the user-query on the basis of a distribution of the query-word in the public document, the public document having at-least-one-public-document-sentence,
evaluating a similarity of the at-least-one-authoritative-document-sentence to the user-query on the basis of the frequency with which the query-word occurs in the at-least-one-authoritative-document-sentence,
evaluating a similarity of the at-least-one-public-document-sentence to the user-query on the basis of the frequency with which the query-word occurs in the at-least-one-public-document-sentence,
ranking the at-least-one-public document relative to the at-least-one-authoritative document on the basis of the similarity of the at-least-one-authoritative-document to the user-query, the similarity of the public document to the user-query, the similarity of the at-least-one-authoritative-document-sentence to the user-query, and the similarity of the at-least-one-public-document-sentence to the user-query.
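The steps recited in claim 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method itself: the claim does not fix any formula, so the inverse-frequency weighting in `word_importance`, the normalized word-frequency similarity, the use of the best-matching sentence, and the equal mix of document-level and sentence-level scores are all choices made here for illustration. Every function name is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_importance(query, authoritative_docs):
    """Weight each query word by its count in the query and by how rarely
    it occurs across the authoritative documents (assumed IDF-style)."""
    query_counts = Counter(tokenize(query))
    doc_vocab = [set(tokenize(d)) for d in authoritative_docs]
    n_docs = len(doc_vocab)
    weights = {}
    for word, query_freq in query_counts.items():
        doc_freq = sum(1 for vocab in doc_vocab if word in vocab)
        weights[word] = query_freq * math.log(1.0 + (n_docs + 1) / (doc_freq + 1))
    return weights


def text_similarity(text, weights):
    """Weighted relative frequency of the query words in a text."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(w * counts[word] / total for word, w in weights.items())


def document_score(doc, weights):
    """Combine document-level and best sentence-level similarity
    (the 50/50 mix is an assumption)."""
    doc_sim = text_similarity(doc, weights)
    sentence_sim = max(
        (text_similarity(s, weights) for s in split_sentences(doc)),
        default=0.0,
    )
    return 0.5 * doc_sim + 0.5 * sentence_sim


def rank_documents(query, authoritative_docs, public_docs):
    """Rank public documents relative to authoritative ones."""
    weights = word_importance(query, authoritative_docs)
    scored = [(document_score(d, weights), "authoritative", d)
              for d in authoritative_docs]
    scored += [(document_score(d, weights), "public", d) for d in public_docs]
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

Calling `rank_documents(query, authoritative_docs, public_docs)` returns `(score, source, document)` tuples in descending score order, so public and authoritative documents appear in a single ranked list.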
11. A computer-readable medium containing software for ranking a plurality of documents on the basis of the similarity of each of the plurality of documents to a user-query, the software comprising instructions for executing the steps of
parsing the user-query, thereby generating a query-word and a distribution of the query-word in the user-query,
assessing an importance of the query-word on the basis of the frequency with which the query-word occurs in an authoritative database having at-least-one-authoritative-document, the at-least-one-authoritative-document having at-least-one-authoritative-document-sentence, and the distribution of the query-word in the user-query,
evaluating the similarity of the at-least-one-authoritative-document to the user-query on the basis of a distribution of the query-word in the at-least-one-authoritative-document,
evaluating the similarity of a public document from a public database to the user-query on the basis of a distribution of the query-word in the public document, the public document having at-least-one-public-document-sentence,
evaluating the similarity of the at-least-one-authoritative-document-sentence to the user-query on the basis of the frequency with which the query-word occurs in the at-least-one-authoritative-document-sentence,
evaluating the similarity of the at-least-one-public-document-sentence to the user-query on the basis of the frequency with which the query-word occurs in the at-least-one-public-document-sentence,
ranking the at-least-one-public document relative to the at-least-one-authoritative document on the basis of the similarity of the at-least-one-authoritative-document to the user-query, the similarity of the public document to the user-query, the similarity of the at-least-one-authoritative-document-sentence to the user-query, and the similarity of the at-least-one-public-document-sentence to the user-query.
Specification