Relevancy ranking using statistical ranking, semantics, relevancy feedback and small pieces of text

US 5,893,092 A
Filed: 06/23/1997
Issued: 04/06/1999
Est. Priority Date: 12/06/1994
Status: Expired due to Term

2 Assignments

0 Petitions

Abstract

Search system and method for retrieving relevant documents from a text database collection comprising patents, medical and legal documents, journals, news stories and the like. Each small piece of text within the documents, such as a sentence, phrase or semantic unit, is treated as a document. Natural language queries are used to search the database for relevant documents. A first search query creates a selected group of documents. Each word in both the search query and the documents is given a weighted value. Combining the weighted values creates a similarity value for each document, and the documents are then ranked according to their relevance to the search query. A user reading through this ranked list checks off which documents are relevant and which are not. The system then automatically updates the original search query into a second search query, which can include the same words, fewer words or different words than the first search query. Words in the second search query can have the same or different weights compared to the first search query. The system automatically searches the text database and creates a second group of documents, which at a minimum omits at least one of the documents found in the first group. The second group can also include additional documents not found in the first group. The ranking of documents in the second group differs from the first ranking such that the more relevant documents are found closer to the top of the list.
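
The query-update step described in the abstract can be illustrated with a short sketch. The record above does not say how the second query is computed, so this example substitutes a Rocchio-style update, a standard relevance feedback technique, purely as an assumption. The function name `update_query` and the parameters `alpha`, `beta` and `gamma` are hypothetical and do not come from the patent.

```python
from collections import Counter


def update_query(query_weights, relevant_docs, nonrelevant_docs,
                 alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (an assumed stand-in, not the patent's method).

    query_weights: {word: weight} for the first query.
    relevant_docs / nonrelevant_docs: texts the user checked off.
    Returns a new {word: weight} mapping for the second query.
    """
    updated = Counter({w: alpha * v for w, v in query_weights.items()})
    feedback = ((relevant_docs, +1, beta), (nonrelevant_docs, -1, gamma))
    for docs, sign, coeff in feedback:
        if not docs:
            continue
        for doc in docs:
            # Average each word's count over the documents in this group.
            for word, count in Counter(doc.lower().split()).items():
                updated[word] += sign * coeff * count / len(docs)
    # Words whose weight falls to zero or below leave the query.
    return {w: v for w, v in updated.items() if v > 0}


first_query = {"engine": 1.0, "noise": 1.0}
second_query = update_query(
    first_query,
    relevant_docs=["engine vibration"],
    nonrelevant_docs=["noise complaint"],
    gamma=1.0,
)
print(second_query)  # {'engine': 1.75, 'vibration': 0.75}
```

In this run the second query gains a word ("vibration"), loses a word ("noise"), and reweights a retained word ("engine"), which matches the three kinds of change the abstract mentions.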

234 Citations

10 Claims

1. A method for retrieving relevant text data from a text database collection in a computer without annotating, parsing or pruning the text database collection, comprising the steps of:
- (a) searching a text database collection in a computer using a first search query of natural language to retrieve a first group of selected small pieces of text, where each of the selected small pieces of text corresponds to a document;
  
  (b) ranking each of the selected small pieces of text into a first ranked list of relevant documents;
  
  (c) applying feedback information based on a manual determination of the relevancy of each of the selected small pieces of text in the first ranked list to automatically create a second search query, the second search query being different than the first search query;
  
  (d) repeating steps (a) to (b) to form a second ranked list, wherein the second ranked list includes a second group of selected small pieces of text, and the second group is different than the first group.
- Dependent Claims (2, 3, 4, 5, 6, 7)
- - 2. The method for retrieving relevant text data of claim 1, wherein each of the small pieces of text includes at least one of:
    - a sentence, a phrase, and a semantic unit.
  - 3. The method for retrieving relevant text data of claim 1, wherein the second search query includes:
    - at least one less word from the first search query.
  - 4. The method for retrieving relevant text data of claim 1, wherein the second search query includes:
    - at least one additional word to the first search query.
  - 5. The method for retrieving relevant text data of claim 1, wherein the second group includes:
    - at least one less document that had been listed in the first group.
  - 6. The method for retrieving relevant text data of claim 1, wherein the second group includes:
    - at least one additional document that was not found in the first group.
  - 7. The method for retrieving relevant text data of claim 1, wherein the second ranked list includes:
    - a different ranked order of documents than the first ranked list.
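
Claims 2, 5 and 6 can be made concrete with a minimal sketch: each sentence of a source document becomes its own retrievable unit, and the first and second groups are compared to see which pieces were dropped or added. The regex sentence splitter, the `doc_id#index` identifier scheme, and the function names `split_into_pieces` and `compare_groups` are all assumptions for illustration; the claims do not prescribe how pieces are delimited or identified.

```python
import re


def split_into_pieces(doc_id, text):
    """Split one document into sentences, each treated as its own 'document'.

    Uses a naive split after '.', '!' or '?' followed by whitespace
    (an assumption; the claims also allow phrases and semantic units).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [(f"{doc_id}#{i}", s) for i, s in enumerate(sentences) if s]


def compare_groups(first_group, second_group):
    """Return (pieces dropped from the first group, pieces newly added)."""
    first, second = set(first_group), set(second_group)
    return sorted(first - second), sorted(second - first)


pieces = split_into_pieces(
    "d1", "Engines vibrate. Noise is measured in decibels! Why?"
)
dropped, added = compare_groups(["d1#0", "d1#1"], ["d1#1", "d2#0"])
```

Here `dropped` corresponds to claim 5 (a document from the first group no longer listed) and `added` to claim 6 (a document not found in the first group).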

8. A method for retrieving relevant text data from a text database collection in a computer without annotating, parsing or pruning, comprising the steps of:
- (a) searching a text database collection in a computer using a first search query to retrieve a first group of selected small pieces of text, where each of the selected small pieces of text corresponds to a document;
  
  (b) semantically weighting the selected small pieces of text to form document weighted values for each of the selected small pieces of text in the first group;
  
  (c) semantically weighting the first search query to form query weighted values;
  
  (d) combining the query weighted values and the document weighted values to form similarity values for each of the selected small pieces of text;
  
  (e) ranking the similarity values for each of the selected small pieces of text to form a first ranked list;
  
  (f) automatically updating the first search query into a second search query based on feedback information on whether documents in the first ranked list are relevant,
  
  (g) repeating steps (a) to (e) to form a second ranked list, wherein the second ranked list includes a second group of selected small pieces of text which is different than the first group.
- Dependent Claims (9)
- - 9. The method for retrieving relevant text data of claim 8, wherein each of the small pieces of text includes at least one of:
    - a sentence, a phrase, and a semantic unit.
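
Steps (b) through (e) of claim 8 follow a weight, combine, rank pattern that can be sketched as below. This record does not describe how the semantic weighting is computed, so the example substitutes a plain statistical weight (term count times a smoothed inverse document frequency) as a loudly labeled stand-in. It is not the patent's semantic weighting. The function names `tokenize`, `idf_weights` and `rank` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def idf_weights(pieces):
    """Smoothed inverse document frequency per word (assumed weighting)."""
    n = len(pieces)
    df = Counter(word for p in pieces for word in set(tokenize(p)))
    return {w: math.log(1 + n / c) for w, c in df.items()}


def rank(query, pieces):
    """Return [(piece_index, similarity)] sorted from most to least similar."""
    idf = idf_weights(pieces)
    # Step (c) stand-in: query weighted values.
    q_w = {w: c * idf.get(w, 0.0) for w, c in Counter(tokenize(query)).items()}
    scored = []
    for i, piece in enumerate(pieces):
        # Step (b) stand-in: document weighted values.
        d_w = {w: c * idf[w] for w, c in Counter(tokenize(piece)).items()}
        # Step (d): combine query and document weights into one similarity.
        sim = sum(q_w[w] * d_w.get(w, 0.0) for w in q_w)
        if sim > 0:
            scored.append((i, sim))
    # Step (e): rank by similarity, breaking ties by original order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


pieces = [
    "The engine makes noise.",
    "The weather is mild.",
    "Engine noise and engine vibration.",
]
ranked = rank("engine noise", pieces)
```

With these inputs the third piece ranks first because it mentions "engine" twice, the first piece ranks second, and the weather sentence is not retrieved at all.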

10. A method for retrieving relevant text from a text database collection in a computer without annotating, parsing or pruning the text database collection, comprising the steps of:
- (a) searching a text database collection in a computer using a first search query to retrieve a first group of selected text;
  
  (b) ranking each of the selected text to form a first ranked list;
  
  (c) determining relevancy of each of the selected text with a manual pass-through of the first ranked list; and
  
  (d) automatically updating the first search query based on the relevancy determination of the manual pass-through into a second search query, the second search query being different than the first search query; and
  
  (e) searching the text database collection using the second search query to retrieve a second group of selected text being different than the first group.

Current Assignee: University of Central Florida Research Foundation Inc. (State University System of Florida)
Original Assignee: University of Central Florida (State University System of Florida)
Inventors: Driscoll, James R.
Primary Examiner(s): Black, Thomas G.
Assistant Examiner(s): Homere, Jean Raymond
Application Number: US08/880,807
Time in Patent Office: 652 Days
Field of Search: 707/5, 707/3, 707/500, 704/9
US Class Current: 1/1

CPC Class Codes

G06F 16/3322   using system suggestions G0...
G06F 16/3329   Natural language query form...
G06F 16/3334   Selection or weighting of t...
G06F 16/3344   using natural language anal...
G06F 16/3346   using probabilistic model
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99935   Query augmenting and refini...
Y10S 707/99936   Pattern matching access
Y10S 707/99939   Privileged access
