Text search system
Abstract
In searching for library documents that match the content of a given sequence of query words, a set of equivalent words is defined for each query word, and a corresponding word equivalence value is assigned to each equivalent word. Target sequences of words in a library document that match the sequence of query words are located according to a set of matching criteria. The similarity value of each target sequence is evaluated as a function of the equivalence values of the words it includes. A relevance factor is then obtained for each library document from the similarity values of its target sequences.
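The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not fix the functions involved, so the sketch assumes that a similarity value is the product of the word equivalence values, that the relevance factor is the sum of the similarity values, and that the matching criteria reduce to "all query words, in order, within `max_span` words". The names `find_targets`, `similarity`, `relevance`, and `max_span` are hypothetical.

```python
def find_targets(doc_words, query, equivalents, max_span):
    """Locate target sequences: every query word (or an equivalent) matched
    in query order, with the whole match spanning at most max_span words.
    Returns one list of document positions per target sequence."""
    targets = []

    def extend(qi, start, positions):
        if qi == len(query):
            targets.append(positions)
            return
        # Once the first word is placed, later words must stay inside the span.
        limit = len(doc_words)
        if positions:
            limit = min(limit, positions[0] + max_span)
        for pos in range(start, limit):
            if doc_words[pos] in equivalents[query[qi]]:
                extend(qi + 1, pos + 1, positions + [pos])

    extend(0, 0, [])
    return targets


def similarity(doc_words, query, equivalents, positions):
    """Assumed similarity function: product of the equivalence values."""
    value = 1.0
    for qword, pos in zip(query, positions):
        value *= equivalents[qword][doc_words[pos]]
    return value


def relevance(doc_words, query, equivalents, max_span=5):
    """Assumed relevance function: sum of the similarity values."""
    targets = find_targets(doc_words, query, equivalents, max_span)
    return sum(similarity(doc_words, query, equivalents, t) for t in targets)


# Hypothetical equivalence table: each query word maps to its equivalent
# words (itself included) with an illustrative word equivalence value.
equivalents = {
    "fast": {"fast": 1.0, "quick": 0.8},
    "car": {"car": 1.0, "automobile": 0.9},
}
doc = "the quick red automobile passed a fast car".split()
query = ["fast", "car"]
score = relevance(doc, query, equivalents, max_span=4)
```

Here the document yields two target sequences ("quick ... automobile" and "fast car"); an exact match contributes more to the score than a match built from equivalents. Other choices, such as taking the maximum similarity instead of the sum, fit the abstract equally well.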
284 Citations
20 Claims
1. A method implemented in a data processing apparatus for retrieving from among more than one library document those matching the content of a sequence of query words, comprising the steps of:
    (a) defining a set of equivalent words for each of the query words and assigning a word equivalence value to each of said equivalent words; and
    (b) computing a relevance factor for a library document, comprising the steps of:
        (i) locating target sequences of words in the library document that match the sequence of query words, and equivalence thereof, according to a set of matching criteria;
        (ii) evaluating similarity values of said target sequences of words, each similarity value being evaluated as a function of the equivalence values of words included in the corresponding target sequence; and
        (iii) said relevance factor being computed as a function of the similarity values of its target sequences.
    Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. In a document retrieval system storing more than one library document, an apparatus for retrieving library documents matching the content of a sequence of query words, comprising:
    (a) means for storing a set of equivalent words for each of the query words, each of said equivalent words being stored with a corresponding word equivalence value;
    (b) means coupled to said storage means for computing a relevance factor for a library document, comprising:
        (i) first means for receiving a set of matching criteria;
        (ii) second means coupled to said first means for locating target sequences of words in a library document that match the sequence of query words, and equivalence thereof, according to said matching criteria;
        (iii) third means coupled to said second means for evaluating similarity values of said target sequences of words, each similarity value being evaluated as a function of the equivalence values of words included in the corresponding target sequence; and
        (iv) fourth means receiving said similarity values for computing said relevance factor.
    Dependent claims: 10, 12, 13, 14, 15, 16
11. The apparatus as in claim 9 wherein said matching criteria include the ordering of a sequence of words in the library document with respect to the sequence of query words.
17. A method implemented in a data processing apparatus for retrieving from among more than one library document those matching the content of a sequence of query words, comprising the steps of:
    defining a set of equivalent words for each query word, each of said equivalent words being assigned a word equivalence value; and
    computing a relevance factor for a library document, comprising the steps of:
    (a) locating target sequences of words in the library document that match the specified sequence of query words according to a set of matching criteria, said matching criteria comprising:
        (i) the ordering of a sequence of words in the library document with respect to the specified sequence of query words;
        (ii) the completeness of a match between a sequence of words in the library document and the specified sequence of query words; and
        (iii) the span of a sequence of words in the library document that matches the specified sequence of query words;
    (b) evaluating similarity values of said target sequences of words, each similarity value being evaluated as a function of the equivalence values of words included in the corresponding target sequence; and
    (c) said relevance factor being computed as a function of the similarity values of its target sequences.
    Dependent claims: 18, 19, 20
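The three matching criteria named in claim 17 (ordering, completeness, span) can be illustrated with a small Python sketch. It is a hypothetical reading, not the claimed method: the claim names the criteria but not how they are measured or combined, so the sketch assumes a candidate match is a list of (query index, document position) pairs and that each criterion is tested against a simple threshold. `Criteria`, `describe_match`, and `accepts` are made-up names.

```python
from dataclasses import dataclass


@dataclass
class Criteria:
    """Hypothetical thresholds for the three criteria of claim 17."""
    require_order: bool = True      # (i) ordering
    min_completeness: float = 1.0   # (ii) fraction of query words matched
    max_span: int = 10              # (iii) width of the match in words


def describe_match(pairs, query_len):
    """Measure a candidate match given as (query_index, doc_position) pairs.
    Returns (in_order, completeness, span)."""
    pairs = sorted(pairs, key=lambda p: p[1])  # read in document order
    query_indices = [q for q, _ in pairs]
    in_order = all(a < b for a, b in zip(query_indices, query_indices[1:]))
    completeness = len(set(query_indices)) / query_len
    span = pairs[-1][1] - pairs[0][1] + 1
    return in_order, completeness, span


def accepts(criteria, pairs, query_len):
    """Decide whether a candidate counts as a target sequence."""
    if not pairs:
        return False
    in_order, completeness, span = describe_match(pairs, query_len)
    if criteria.require_order and not in_order:
        return False
    return (completeness >= criteria.min_completeness
            and span <= criteria.max_span)


# Query of two words; the document has word 1 at position 3, word 0 at 6.
reversed_match = [(1, 3), (0, 6)]
strict = accepts(Criteria(), reversed_match, query_len=2)
relaxed = accepts(Criteria(require_order=False), reversed_match, query_len=2)
```

With the default thresholds the reversed match is rejected; relaxing the ordering requirement accepts it. A fuller system might weight the criteria rather than apply hard cut-offs, but the claim text quoted here does not say which.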
Specification