Relevancy ranking using statistical ranking, semantics, relevancy feedback and small pieces of text
First Claim
1. A method for retrieving relevant text data from a text database collection in a computer without annotating, parsing or pruning the text database collection, comprising the steps of:
(a) searching a text database collection in a computer using a first search query of natural language to retrieve a first group of selected small pieces of text, where each of the selected small pieces of text corresponds to a document;
(b) ranking each of the selected small pieces of text into a first ranked list of relevant documents;
(c) applying feedback information based on a manual determination of the relevancy of each of the selected small pieces of text in the first ranked list to automatically create a second search query, the second search query being different than the first search query;
(d) repeating steps (a) to (b) to form a second ranked list, wherein the second ranked list includes a second group of selected small pieces of text, and the second group is different than the first group.
Abstract
Search system and method for retrieving relevant documents from a text database collection comprising patents, medical and legal documents, journals, news stories and the like. Each small piece of text within the documents, such as a sentence, phrase or semantic unit, is treated as a document. Natural language queries are used to search the database for relevant documents. A first search query creates a selected group of documents. Each word in both the search query and the documents is given a weighted value. Combining the weighted values creates a similarity value for each document, and the documents are then ranked according to their relevance to the search query. A user reading through this ranked list checks off which documents are relevant and which are not. The system then automatically updates the original search query into a second search query, which can include the same words, fewer words or different words than the first search query. Words in the second search query can have the same or different weights compared to the first search query. The system automatically searches the text database and creates a second group of documents, which at a minimum omits at least one of the documents found in the first group. The second group can also include additional documents not found in the first group. The ranking of documents in the second group differs from the first ranking such that the more relevant documents are found closer to the top of the list.
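The weighting and ranking described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weighted values are computed, so the inverse-document-frequency weighting below is an assumption chosen for illustration, not the patented semantic weighting; the function names (split_into_units, idf_weights, similarity, rank) are likewise hypothetical.

```python
import math
import re
from collections import Counter

def split_into_units(text):
    """Split text into sentences; each sentence is treated as its own 'document'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(text):
    """Lowercase the text and extract alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def idf_weights(units):
    """Assumed weighting: smoothed inverse document frequency per word."""
    n = len(units)
    df = Counter()
    for unit in units:
        df.update(set(tokenize(unit)))
    return {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in df.items()}

def similarity(query_weights, unit, idf):
    """Combine query word weights with document word weights into one score."""
    counts = Counter(tokenize(unit))
    return sum(qw * counts[w] * idf.get(w, 0.0) for w, qw in query_weights.items())

def rank(query_weights, units, idf):
    """Return the units with a positive score, highest similarity first."""
    scored = [(similarity(query_weights, u, idf), u) for u in units]
    return [u for s, u in sorted(scored, key=lambda p: -p[0]) if s > 0]

text = ("The rocket launched at dawn. "
        "Engineers tested the rocket engine. "
        "The bakery sells bread.")
units = split_into_units(text)
ranked = rank({"rocket": 1.0, "engine": 1.0}, units, idf_weights(units))
```

In this sketch the sentence that mentions both query words ranks first, and the sentence that shares no words with the query is left out of the list.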
10 Claims
8. A method for retrieving relevant text data from a text database collection in a computer without annotating, parsing or pruning, comprising the steps of:
(a) searching a text database collection in a computer using a first search query to retrieve a first group of selected small pieces of text, where each of the selected small pieces of text corresponds to a document;
(b) semantically weighting the selected small pieces of text to form document weighted values for each of the selected small pieces of text in the first group;
(c) semantically weighting the first search query to form query weighted values;
(d) combining the query weighted values and the document weighted values to form similarity values for each of the selected small pieces of text;
(e) ranking the similarity values for each of the selected small pieces of text to form a first ranked list;
(f) automatically updating the first search query into a second search query based on feedback information on whether documents in the first ranked list are relevant,
(g) repeating steps (a) to (e) to form a second ranked list, wherein the second ranked list includes a second group of selected small pieces of text which is different than the first group.
10. A method for retrieving relevant text from a text database collection in a computer without annotating, parsing or pruning the text database collection, comprising the steps of:
(a) searching a text database collection in a computer using a first search query to retrieve a first group of selected text;
(b) ranking each of the selected text to form a first ranked list;
(c) determining relevancy of each of the selected text with a manual pass-through of the first ranked list; and
(d) automatically updating the first search query based on the relevancy determination of the manual pass-through into a second search query, the second search query being different than the first search query; and
(e) searching the text database collection using the second search query to retrieve a second group of selected text being different than the first group.
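The feedback step recited in the claims, in which manual relevancy judgments are turned automatically into a second search query, can be sketched as follows. The claims do not state an update formula, so this example substitutes a Rocchio-style update, a standard relevance feedback technique; the function update_query and the coefficients alpha, beta and gamma are assumptions for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def update_query(query_weights, relevant, non_relevant,
                 alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (assumed, not taken from the patent).

    Words from texts the user marked relevant gain weight, words from
    texts marked non-relevant lose weight, and words whose weight is
    not positive are dropped from the second query.
    """
    new = Counter({w: alpha * v for w, v in query_weights.items()})
    for group, sign, coef in ((relevant, 1, beta), (non_relevant, -1, gamma)):
        if not group:
            continue
        for text in group:
            for word, count in Counter(tokenize(text)).items():
                # Average the contribution over the texts in this group.
                new[word] += sign * coef * count / len(group)
    return {w: v for w, v in new.items() if v > 0}

first_query = {"rocket": 1.0}
second_query = update_query(
    first_query,
    relevant=["rocket engine test"],   # checked off as relevant by the user
    non_relevant=["rocket toy store"], # checked off as not relevant
)
```

The second query here differs from the first in both its words and its weights: it gains "engine" and "test", and "rocket" receives a higher weight. Searching again with it would correspond to the repeated search step in the claims.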
Specification