System for searching internet using automatic relevance feedback
Abstract
A method of retrieving documents from a document database is disclosed. A set of documents is retrieved according to a first search statement. A signature is developed for a first retrieved document, and preferably for other documents, by searching for words in the document and removing common words which occur at a relatively high frequency in the natural language in which the document is written. The document for which the signature was developed is displayed. Responsive to a user indication that a second search is to be made, a second search statement is derived from the signature of the document.
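The signature step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop-word list `COMMON_WORDS`, the function names, the choice of the most frequent words, and the `OR`-joined query syntax are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the source does not enumerate the common words).
COMMON_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for", "on", "by"}

def develop_signature(text: str) -> Counter:
    """Count the words of a document after dropping common words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in COMMON_WORDS)

def derive_search_statement(signature: Counter, max_terms: int = 3) -> str:
    """Build a follow-up query from the most frequent signature words (assumed heuristic)."""
    # Sort by descending count, then alphabetically, so the result is deterministic.
    ranked = sorted(signature.items(), key=lambda item: (-item[1], item[0]))
    # Joining with OR is a hypothetical query syntax, not one named in the source.
    return " OR ".join(word for word, _ in ranked[:max_terms])

doc = "The jaguar is a big cat. The jaguar hunts in the rain forest, and the forest is dense."
sig = develop_signature(doc)
query = derive_search_statement(sig)
print(query)
```

Here the signature is a simple word-count table; a real system could weight or stem words differently without changing the overall flow.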
In the preferred embodiment, a "spectrum" of documents is prepared and presented to the user. Signatures are developed for a plurality of the documents retrieved according to the first search statement by searching for words in the documents and removing common words which occur at a relatively high frequency in the natural language in which the documents are written. The spectrum of documents is selected so that the document signatures differ by at least a predetermined amount. The user may select more than one document from which to derive a further search statement. Responsive to determining that the user has selected a plurality of documents in the spectrum of displayed documents, the second search statement is derived from the signatures of the selected documents.
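One way to picture the spectrum selection is a greedy filter that keeps a document only if its signature differs enough from every document already kept. The sketch below is an assumption-laden illustration: the source does not say how the difference between signatures is measured, so Jaccard distance over word sets, the `min_distance` threshold, and the function names are all hypothetical.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 minus the share of words common to both signatures (assumed difference measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def select_spectrum(signatures: dict, min_distance: float) -> list:
    """Greedily keep documents whose signatures differ from all kept ones by at least min_distance."""
    chosen = []
    for doc_id, sig in signatures.items():
        if all(jaccard_distance(sig, signatures[c]) >= min_distance for c in chosen):
            chosen.append(doc_id)
    return chosen

# Hypothetical signatures for four documents returned by a search for "jaguar".
signatures = {
    "doc1": {"jaguar", "cat", "forest", "prey"},
    "doc2": {"jaguar", "cat", "forest", "hunting"},
    "doc3": {"jaguar", "car", "engine", "coupe"},
    "doc4": {"jaguar", "football", "jacksonville", "team"},
}
spectrum = select_spectrum(signatures, min_distance=0.7)
print(spectrum)
```

In this toy data, `doc2` is too similar to `doc1` and is left out, so the displayed spectrum covers three distinct senses of the query term.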
18 Claims
1. A method of retrieving computerized documents from a document database in a computer memory comprising the steps of:
- retrieving documents according to a first search statement;
- developing signatures for a plurality of documents by searching for words in the documents and removing common words which occur in a relatively high frequency in a natural language in which the documents are written;
- automatically associating selected ones of the documents with a first document according to a degree of match of the signatures of the selected documents with the signature of the first document;
- displaying the first document;
- responsive to a user indication that a second search is to be made, automatically selecting a set of words from an aggregate signature based on the signatures of the first and associated documents and the aggregate number of occurrences of the set of words in the first and associated documents; and
- constructing a second search statement from the selected set of words.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system including processor, memory and display for retrieving computerized documents from a document database comprising:
- means for retrieving documents according to a first search statement;
- means for developing signatures for a plurality of documents by searching for words in the documents and removing common words which occur in a relatively high frequency in a natural language in which the documents are written;
- means for automatically associating selected ones of the documents with a first document according to a degree of match of the signatures of the selected documents with the signature of the first document;
- means for displaying the first document;
- means responsive to a user indication that a second search is to be made for automatically selecting a set of words from an aggregate signature based on the signatures of the first and associated documents and the aggregate number of occurrences of the set of words in the first and associated documents; and
- means for constructing a second search statement from the selected set of words.
Dependent claims: 9, 10, 11, 12, 13
14. A computer program product in a computer readable medium for retrieving documents from a document database comprising:
- means for retrieving documents according to a first search statement;
- means for developing signatures for a plurality of documents by searching for words in the documents and removing common words which occur in a relatively high frequency in a natural language in which the documents are written;
- means for associating selected ones of the documents with a first document according to a degree of match of the signatures of the selected documents with the signature of the first document;
- means for displaying the first document;
- means responsive to a user indication that a second search is to be made for automatically selecting a set of words from an aggregate signature based on the signatures of the first and associated documents and the aggregate number of occurrences of the set of words in the first and associated documents; and
- means for constructing a second search statement from the selected set of words.
Dependent claims: 15, 16, 17, 18
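The association and aggregate-signature steps recited in claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the claims do not define how the degree of match is computed, so the shared-word ratio, the `threshold` value, the `max_terms` cutoff, the function names, and the `OR`-joined query are all hypothetical.

```python
from collections import Counter

def degree_of_match(a: Counter, b: Counter) -> float:
    """Share of distinct signature words the two documents have in common (assumed measure)."""
    total = set(a) | set(b)
    return len(set(a) & set(b)) / len(total) if total else 0.0

def associate(first_id: str, signatures: dict, threshold: float) -> list:
    """Return the documents whose signatures match the first document's closely enough."""
    first = signatures[first_id]
    return [d for d, s in signatures.items()
            if d != first_id and degree_of_match(first, s) >= threshold]

def second_search_statement(first_id: str, signatures: dict,
                            threshold: float = 0.5, max_terms: int = 3) -> str:
    """Sum word counts over the first and associated documents, then query on the top words."""
    group = [first_id] + associate(first_id, signatures, threshold)
    aggregate = Counter()
    for doc_id in group:
        aggregate.update(signatures[doc_id])  # adds occurrence counts word by word
    ranked = sorted(aggregate.items(), key=lambda kv: (-kv[1], kv[0]))
    return " OR ".join(word for word, _ in ranked[:max_terms])  # hypothetical query syntax

# Hypothetical signatures: word -> number of occurrences in the document.
signatures = {
    "first": Counter({"jaguar": 3, "forest": 2, "prey": 1}),
    "a": Counter({"jaguar": 2, "forest": 1, "habitat": 2}),
    "b": Counter({"jaguar": 1, "engine": 4, "coupe": 2}),
}
statement = second_search_statement("first", signatures)
print(statement)
```

Because document `b` shares only one word with the displayed document, it is not associated and its vocabulary does not leak into the second search statement.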
Specification