Associative text search and retrieval system
Abstract
An associative text search and retrieval system uses one or more front end processors to interact with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system. The system also includes storage for a plurality of text documents, and at least one processor, coupled to the front end processors and the document storage. The processor(s) search the text documents according to a search request provided by the user and provide to the front end processor a predetermined number of retrieved documents containing at least one term of the search request. The retrieved documents have higher ranks than documents not provided to the front end processor. The ranks are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms.
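The ranking flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented formula, which is not reproduced on this page; as an assumption, the score here is simply the sum of the squared in-document frequency of each search term. The names `score_document` and `retrieve`, and the whitespace tokenization, are hypothetical.

```python
from collections import Counter


def score_document(tokens, search_terms):
    """Assumed stand-in score: sum of squared term frequencies."""
    counts = Counter(tokens)
    # Each distinct search term contributes tf ** 2, so repeated
    # occurrences within one document weigh more than linearly.
    return sum(counts[term] ** 2 for term in set(search_terms))


def retrieve(documents, search_terms, limit):
    """Return the ids of the `limit` highest-scoring documents.

    `documents` maps a document id to its text. Documents containing
    none of the search terms are never scored or returned.
    """
    scored = []
    for doc_id, text in documents.items():
        tokens = text.lower().split()  # naive tokenization (assumption)
        score = score_document(tokens, search_terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:limit]]
```

Because of the squaring, a document that mentions one search term three times (score 9) outranks a document that mentions two different terms once each (score 2) in this simplified version; the claimed formula contains further factors that would change that balance.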
47 Claims
-
1. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing a plurality of text documents; and processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
-
-
17. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing a plurality of text documents; and processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents and that contain at least one of the search terms, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the ranks are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms and according to an inverse document frequency of each of the search terms, wherein the formula is;
##EQU2## wherein nt represents a total number of search terms, ut represents a number of unique search terms that occur in a particular one of the text documents, tfi represents a number of times search term i occurs in the text document, oc represents a percentage of occurrences of search terms in a floating text window containing a maximum number of search terms and is calculated by dividing a count of occurrences of search terms in the window by a total number of occurrences of search terms in the document and then multiplying the result by one hundred, dfi is a count of the text documents that contain term i, maxdfi is a maximum number of the text documents in which any of the search terms appear, and all logs are in base two. - View Dependent Claims (18, 19)
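The quantity oc defined in this claim can be computed on its own, even though the full formula is not shown here. The following Python sketch assumes a fixed window width measured in tokens (`window_size`, a hypothetical parameter that the claim text does not specify) and slides that window across the document to find the position holding the most search-term occurrences.

```python
def occurrence_concentration(tokens, search_terms, window_size):
    """Percentage of search-term hits that fall in the densest window.

    `window_size` is the window width in tokens (an assumption; the
    claim only speaks of a "floating text window").
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    terms = set(search_terms)
    # Token positions at which any search term occurs.
    hits = [i for i, token in enumerate(tokens) if token in terms]
    if not hits:
        return 0.0
    best = 0
    start = 0
    for end in range(len(hits)):
        # Shrink from the left until both hits fit in one window.
        while hits[end] - hits[start] >= window_size:
            start += 1
        best = max(best, end - start + 1)
    # Hits in the best window divided by all hits, times one hundred.
    return 100.0 * best / len(hits)
```

For a document whose search-term hits sit at token positions 0, 3, 4 and 10, a five-token window captures at most three of the four hits, giving an oc of 75.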
-
-
20. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; means for allowing the user to enter mandatory terms which must be present in each of the retrieved documents; storage means for storing a plurality of text documents; processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user separately from the mandatory terms, if any, for calculating a score for each of the text documents containing the mandatory search terms, if any, and at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms.
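The mandatory-term condition in claim 20 amounts to a filter applied before scoring: a document is a candidate only if it contains every mandatory term (when any are given) and at least one ordinary search term. A minimal Python sketch, with hypothetical names `is_candidate` and `candidates` and naive whitespace tokenization as an assumption:

```python
def is_candidate(tokens, mandatory_terms, search_terms):
    """True if the document may be scored under the mandatory-term rule."""
    present = set(tokens)
    # all() over an empty list is True, matching the "if any" wording.
    has_all_mandatory = all(term in present for term in mandatory_terms)
    has_any_search = any(term in present for term in search_terms)
    return has_all_mandatory and has_any_search


def candidates(documents, mandatory_terms, search_terms):
    """Return the ids of documents eligible for scoring, in input order."""
    return [
        doc_id
        for doc_id, text in documents.items()
        if is_candidate(text.lower().split(), mandatory_terms, search_terms)
    ]
```

Only the surviving documents would then be scored and ranked; the mandatory terms themselves restrict the result set and are entered separately from the search terms.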
-
-
21. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing a document collection containing a plurality of text documents and a list of frequently used terms for the document collection, the list of frequently used terms being dynamic, based upon a variety of functional factors including the frequency of occurrence of a term in the document collection and the nature of the document collection; processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; an index, associated with the text documents, for indicating the locations of potential search terms within the text documents; means for excluding noise terms from being considered for the search by not including noise terms in the index; and means for excluding frequently used terms from being considered for the search, the frequently used terms being contained in the index and being excluded from the search by not using terms in the list for the search.
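Claim 21 distinguishes two exclusion mechanisms: noise terms are left out of the index at build time, while frequently used terms stay in the index and are dropped from the query at search time. The Python sketch below illustrates that split under stated assumptions: the index is a plain positional inverted index, and both term lists are supplied by the caller (how the dynamic list of frequently used terms is maintained is not modeled). The names `build_index` and `effective_search_terms` are hypothetical.

```python
from collections import defaultdict


def build_index(documents, noise_terms):
    """Map term -> doc id -> list of token positions, skipping noise terms."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, token in enumerate(text.lower().split()):
            if token in noise_terms:
                continue  # noise terms never enter the index
            index[token][doc_id].append(position)
    return index


def effective_search_terms(query_terms, index, frequent_terms):
    """Drop query terms that are unindexed or on the frequent-term list."""
    return [
        term
        for term in query_terms
        if term in index and term not in frequent_terms
    ]
```

Keeping frequently used terms in the index means the list can change per collection without re-indexing, which is consistent with the claim calling the list dynamic.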
-
-
22. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing a plurality of text documents; processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; and means for providing the user with a screen indicating for each retrieved document which search terms are present in which retrieved documents. - View Dependent Claims (23, 24, 25)
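The screen described at the end of claim 22 is, in data terms, a grid of retrieved documents against search terms. A minimal Python sketch of the underlying data, assuming whitespace tokenization and a hypothetical function name `term_presence`:

```python
def term_presence(retrieved, search_terms):
    """Map each retrieved doc id to {search term: present?}."""
    grid = {}
    for doc_id, text in retrieved.items():
        present = set(text.lower().split())
        grid[doc_id] = {term: term in present for term in search_terms}
    return grid
```

How the grid is rendered on the user's terminal is a presentation choice that this sketch leaves open.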
-
-
26. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing a plurality of text documents; processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means; and means for providing the user with a screen indicating a term importance for each of the search terms wherein the term importance varies according to inverse document frequency of the search term, wherein the term importance varies according to log(maxdfi/dfi), wherein the log is to the base two, dfi is a count of the retrieved documents that contain search term i, and maxdfi is a maximum number of the retrieved documents in which any of the search terms appear.
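The term importance in claim 26 follows directly from the stated definitions of dfi and maxdfi. The Python sketch below computes log2(maxdfi/dfi) over a set of retrieved documents. As assumptions, each document is represented as a set of tokens, search terms that appear in no retrieved document are omitted from the result (their importance would otherwise involve division by zero), and the claim only says importance "varies according to" this quantity, so any scaling applied for display is not modeled. The name `term_importance` is hypothetical.

```python
import math


def term_importance(retrieved_token_sets, search_terms):
    """Return {term: log2(maxdf / df)} for terms found in the documents."""
    df = {
        term: sum(1 for tokens in retrieved_token_sets.values() if term in tokens)
        for term in search_terms
    }
    # Terms with df == 0 are left out (assumption; see lead-in).
    present = {term: count for term, count in df.items() if count > 0}
    if not present:
        return {}
    max_df = max(present.values())
    return {term: math.log2(max_df / count) for term, count in present.items()}
```

The most widespread search term always receives an importance of 0, and a term found in half as many documents receives 1, so rarer terms are shown as more important.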
-
-
27. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing at least one document collection containing a plurality of text documents and predetermined information indicating how the documents in the document collection can be presented; processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; and means for allowing the user to select one of many possible orders for presenting the retrieved documents based on the predetermined information contained in the document collection.
-
-
28. An associative text search and retrieval system, comprising:
-
a front end processor connected to a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; a session administrator (SA) computer, connected to the front end processor, containing a software program that prompts the user to provide input to the system, formulates a search request based on input provided by the user, and provides the user with retrieved text documents; and a search and retrieval (SR) computer, coupled to the SA computer, having storage for storing a plurality of text documents, and having a software program for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the SA computer a predetermined number of retrieved documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the SA computer, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms. - View Dependent Claims (29, 30)
-
-
31. An associative text search and retrieval system, comprising:
-
a front end processor connected to a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; a session administrator (SA) computer, connected to the front end processor, containing a software program that prompts the user to provide input to the system, formulates a search request based on input provided by the user, and provides the user with retrieved text documents; and a search and retrieval (SR) computer, coupled to the SA computer, having storage for storing a plurality of text documents, and having a software program for performing a search of the text documents using a plurality of search terms provided by the user and for providing to the SA computer a predetermined number of retrieved documents containing at least one of the search terms, the retrieved documents having higher ranks than text documents not provided to the SA computer, wherein the ranks are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms and according to an inverse document frequency of each of the search terms, wherein the formula is;
##EQU3## wherein nt represents a total number of search terms, ut represents a number of unique search terms that occur in a particular one of the text documents, tfi represents a number of times search term i occurs in the text document, oc represents a percentage of occurrences of search terms in a floating text window containing a maximum number of search terms and is calculated by dividing a count of occurrences of search terms in the window by a total number of occurrences of search terms in the document and then multiplying the result by one hundred, dfi is a count of the text documents that contain term i, maxdfi is a maximum number of the text documents in which any of the search terms appear, and all logs are in base two. - View Dependent Claims (32, 33)
-
-
34. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents using a plurality of search terms provided by a user; calculating a score for each of the text documents containing at least one of the search terms using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; ranking the text documents based on their scores; and providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the ranks of the documents, the retrieved documents having higher ranks than text documents not provided. - View Dependent Claims (35, 36, 37, 40)
-
-
38. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents using a plurality of search terms provided by a user; and providing the user with a predetermined number of retrieved documents that are a subset of the text documents and that contain at least one of the search terms, the retrieved documents having higher ranks than text documents not provided, wherein the ranks are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms and according to an inverse document frequency of each of the search terms, wherein the formula is;
##EQU4## wherein nt represents a total number of search terms, ut represents a number of unique search terms that occur in a particular one of the text documents, tfi represents a number of times search term i occurs in the text document, oc represents a percentage of occurrences of search terms in a floating text window containing a maximum number of search terms and is calculated by dividing a count of occurrences of search terms in the window by a total number of occurrences of search terms in the document and then multiplying the result by one hundred, dfi is a count of the text documents that contain term i, maxdfi is a maximum number of the text documents in which any of the search terms appear, and all logs are in base two.
-
-
39. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents using a plurality of search terms provided by a user; allowing the user to enter mandatory terms which must be present in each of the retrieved documents, separately from the search terms provided by the user; calculating a score for each of the text documents containing the mandatory terms, if any, and at least one of the search terms, using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; ranking the text documents based on their scores; and providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the user.
-
-
41. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents contained in a document collection using a plurality of search terms provided by a user; indicating locations of potential search terms within the text documents using an index which is associated with the text documents; excluding noise terms from being considered for the search by not including noise terms in the index; excluding from being considered for the search frequently used terms contained in the index and in the document collection in a list of frequently used terms, the list of frequently used terms being dynamic, based upon a variety of functional factors including the frequency of occurrence of a term in the document collection and the nature of the document collection, the frequently used terms being excluded from the search by not using terms in the list for the search; calculating a score for each of the text documents containing at least one of the search terms except for noise terms and frequently used terms excluded in said excluding step, using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; ranking the text documents based on their scores; and providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks.
-
-
42. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents using a plurality of search terms provided by a user; calculating a score for each of the text documents containing at least one of the search terms, using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; ranking the text documents based on their scores; providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks; and indicating for each retrieved document which search terms are present in which retrieved documents. - View Dependent Claims (43)
-
-
44. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents using a plurality of search terms provided by a user; calculating a score for each of the text documents containing at least one of the search terms except for noise terms and frequently used terms excluded in said excluding step; ranking the text documents based on their scores; providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks; and displaying in eye-readable form a term importance for each of the search terms wherein the term importance varies according to log(maxdfi/dfi), wherein the log is to the base two, dfi is a count of the retrieved documents that contain search term i, and maxdfi is a maximum number of the retrieved documents in which any of the search terms appear.
-
-
45. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents contained in a document collection using a plurality of search terms provided by a user; calculating a score for each of the text documents containing at least one of the search terms, using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; ranking the text documents based on their scores; providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks; and allowing the user to select one of many possible orders for presenting the retrieved documents based on predetermined information contained in the document collection indicating how the documents in the document collection can be presented.
-
-
46. An associative text search and retrieval system, comprising:
-
front end processing means for interacting with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system; storage means for storing a plurality of text documents; processor means, coupled to the front end processing means and the storage means, for performing a search of the text documents using a plurality of search terms provided by the user, for calculating a score for each of the text documents containing at least one of the search terms, for ranking the text documents based on their scores, and for providing to the front end processing means a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks, the retrieved documents having higher ranks than text documents not provided to the front end processing means, wherein the scores are calculated using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; means for providing the user with a screen indicating a term importance for each of the search terms wherein the term importance varies according to inverse document frequency of the search term.
-
-
47. A method of operating an associative text search and retrieval system, comprising the steps of:
-
performing a search of text documents using a plurality of search terms provided by a user; calculating a score for each of the text documents containing at least one of the search terms except for noise terms and frequently used terms excluded in said excluding steps, using a formula that varies according to the square of the frequency in each of the text documents of each of the search terms; ranking the text documents based on their scores; providing the user with a predetermined number of retrieved documents that are a subset of the text documents based on the documents' ranks; and indicating a term importance for each of the search terms wherein the term importance varies according to inverse document frequency of the search term.
-
Specification