Document retrieval using fuzzy-logic inference
Abstract
The results of a full-text, document search by a character string search processor are treated as vector patterns whose elements become a term match grade by use of a membership function of the term match frequency. The closest pattern to the query pattern is found by the similarity between the query pattern and each of the filed sample patterns. The similarity is calculated by use of fuzzy-logic. The similarity is ranked in order of similarity magnitude, thereby reducing the search time. The search time can be shortened by categorizing the filed patterns by term set and similarity to a cluster center pattern. If the cluster center patterns are stored, the closest cluster address can be inferred by fuzzy logic inference from the match between the query document and the term set or the similarity of the query to the cluster center.
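The basic flow described in the abstract (term match frequency, membership function, pattern similarity, ranking) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the shape of the membership function or the similarity formula, so the saturating linear ramp in `term_match_grade` and the "1 minus mean absolute difference" measure in `similarity` are placeholders, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple


def term_match_grade(count: int, saturation: int = 2) -> float:
    """Assumed membership function: linear ramp that saturates at 1.0."""
    return min(count / saturation, 1.0)


def to_pattern(text: str, terms: List[str]) -> List[float]:
    """Count whole-word matches per term and fuzzify each count into a grade."""
    words = text.lower().split()
    return [term_match_grade(words.count(term)) for term in terms]


def similarity(a: List[float], b: List[float]) -> float:
    """Assumed measure: 1 minus the mean absolute difference of the grades."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rank_closest(
    query: str, filed: Dict[str, str], terms: List[str]
) -> List[Tuple[str, float]]:
    """Return (document name, similarity) pairs, most similar first."""
    q = to_pattern(query, terms)
    scored = [
        (name, similarity(q, to_pattern(text, terms)))
        for name, text in filed.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Calling `rank_closest("fuzzy logic", {"a": "fuzzy logic inference", "b": "cooking recipes"}, ["fuzzy", "logic", "search"])` places document `a` first, since its pattern matches the query pattern on every term.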
20 Claims
1. In a closest document retrieval system, finding from filed documents, the filed document closest to a query document based on a full-text search comprising:
- means for converting the filed documents and query document into patterns whose elements are term match grades obtained by fuzzification of a full-text search;
- means for comparing the filed patterns with the query pattern using a fuzzy grade matching function, and
- means for ranking the matching functions of the filed documents to rank the closest document.
2. In a closest document retrieval system, finding from filed documents, the filed document closest to a query document based on a full-text search comprising:
- means for converting the filed documents and query document into patterns whose elements are term match grades obtained by fuzzification of a full-text search;
- means for comparing the filed patterns with the query pattern using a fuzzy grade matching function, where said means for comparing comprises a pattern similarity calculator, and
- means for ranking the matching functions of the filed documents to rank the closest document where said means for ranking comprises a ranker FIFO.
Dependent claims: 3, 4, 5, 6
7. In a closest document retrieval system, finding from filed documents, the filed document closest to a query document based on a full-text search comprising:
- means for converting the filed documents and the query document into patterns whose elements are term match grades obtained by fuzzification of a full-text search;
- means for categorizing hierarchically said filed patterns into categories based on similarity between said filed pattern and a category center pattern;
- means for classifying hierarchically said query pattern using fuzzy logic inference based on similarity between said query pattern and a category center; and
- means for ranking the closest filed document to the query document based on inferenced data fidelity.
Dependent claims: 8
9. In a closest document retrieval system, finding from filed documents, the filed document closest to a query document based on a full-text search comprising:
- means for converting the filed documents and the query document into patterns whose elements are term match grades obtained by fuzzification of a full-text search;
- means for categorizing hierarchically said filed patterns into categories based on similarity between said filed pattern and a category center pattern by ranking term match grade to determine a term set category or by ranking the similarity between filed patterns and category center patterns to determine the closest category center pattern and the highest ranked value and second highest ranked value are used to determine an inference rule truth grade;
- means for classifying hierarchically said query pattern using fuzzy logic inference based on similarity between said query pattern and a category center; and
- means for ranking the closest filed document to the query document based on inferenced data fidelity, where the difference between unity and the bounded difference between the MAX of the second highest values and the MIN of the highest values is used to determine a category inference rule truth grade.
10. In a closest document retrieval system finding from filed documents, the filed document closest to a query document based on a full-text search comprising:
- means for converting the filed documents and the query document into patterns whose elements are term match grades obtained by fuzzification of a full-text search;
- means for categorizing hierarchically said filed patterns into categories based on similarity between said filed pattern and a category center pattern;
- means for classifying hierarchically said query pattern using fuzzy logic inference based on similarity between said query pattern and a category center where said means for classifying hierarchically classifies categories in layers and comprises transitive fuzzy-logic inference processor means including a bounded difference calculator to determine the bounded difference between the rule-truth grade and the term-match grades between the query document and a second term set or to determine the bounded difference between the rule truth grade and the similarity between the query pattern and a category center pattern, first MIN selectors located between the term match grade or the similarity value of a category in one layer and a previous layer, and second MIN selector for prioritizing an output of the data fidelities; and
- means for ranking the closest filed document to the query document based on inferenced data fidelity.
11. In a closest document retrieval system, finding from filed documents, the filed document closest to a query document based on a full-text search comprising:
- means for converting the filed documents and the query document into patterns whose elements are term match grades obtained by fuzzification of a full-text search;
- means for categorizing hierarchically said filed patterns into categories based on similarity between said filed pattern and a category center pattern;
- means for classifying hierarchically said query pattern using fuzzy logic inference based on similarity between said query pattern and a category center by classifying categories in layers and comprising transitive fuzzy-logic inference processor means including a bounded difference calculator to determine the bounded difference between the rule-truth grade and the term-match grades between the query document and a second term set or to determine the bounded difference between the rule truth grade and the similarity between the query pattern and a category center pattern, first MIN selectors located between the term match grade or the similarity value of a category in one layer and a previous layer, and second MIN selector for prioritizing an output of the data fidelities, where when the data fidelity is positive, a lower layer category is searched, after all categories in a layer are searched the layer category to be searched is incremented, and when the data fidelity is negative, the same layer or a higher layer category is searched; and
- means for ranking the closest filed document to the query document based on inferenced data fidelity.
12. A closest document retrieval system to find the closest document to a query document from filed documents, based on a full-text search, comprising:
- first means for categorizing filed documents by term match grade values of each filed document to a standard document;
- string search processor means for searching a filed document in an index of standard document;
- second means for categorizing by similarity between each section of the standard document and the filed documents;
- third means for categorizing documents in a single category by similarity between a typical document in the single category and the filed documents in the single category; and
- storage means for storing categorized documents as filed patterns with an address code whose bits correspond to the higher layer categorization codes.
13. A closest document retrieval system to find the closest document to a query document from filed documents, based on a full-text search, comprising:
- first means for categorizing filed documents by term match grade values of each filed document to a standard document;
- string search processor means for searching a filed document in an index of standard document;
- second means for categorizing by similarity between each section of the standard document and the filed documents;
- third means for categorizing documents in a single category by similarity between a typical document in the single category and the filed documents in the single category; and
- storage means for storing categorized documents as filed patterns with an address code whose bits correspond to the higher layer categorization codes;
- where the category in which a filed document is categorized is determined by the category of the top ranked term match grade value or similarity stored after the filed document is compared with a term set or center pattern of each category in a layer and where the category inference rule truth grade is determined from a complement of the difference between the minimum value of the highest ranked term match grade or the similarity and the maximum value of the next highest ranked term match grade amount or similarity, where ranking is performed in a ranker FIFO.
14. A closest document retrieval system to find the closest document to a query document from filed documents, based on a full-text search, comprising:
- first means for categorizing filed documents by term match grade values of each filed document to a standard document;
- string search processor means for searching a filed document in an index of standard document;
- second means for categorizing by similarity between each section of the standard document and the filed documents;
- third means for categorizing documents in a single category by similarity between a typical document in the single category and the filed documents in the single category; and
- storage means for storing categorized documents as filed patterns with an address code whose bits correspond to the higher layer categorization codes;
- where a category is divided into sub-categories and the similarity of each filed document in the sub-category center is calculated using a weight memory and average calculator means for determining the difference between the ith term match grades for the filed document and sub-category center document.
15. A closest document retrieval system to find the closest document to a query document from filed documents, based on a full-text search, comprising:
- first means for categorizing filed documents by term match grade values of each filed document to a standard document;
- string search processor means for searching a filed document in an index of standard document;
- second means for categorizing by similarity between each section of the standard document and the filed documents;
- third means for categorizing documents in a single category by similarity between a typical document in the single category and the filed documents in the single category;
- storage means for storing categorized documents as filed patterns with an address code whose bits correspond to the higher layer categorization codes;
- membership function memory means which provides as an output an element of a center pattern or filed pattern term match grade when each term match signal frequency stored in a count buffer memory means by adding a match signal to the contents of the memory at each term address; and
- a string search processor means for searching a center document or filed document which is input with said membership function memory means and which is stored in a cluster center pattern memory or a filed pattern memory.
Dependent claims: 16
17. A closest document retrieval system for finding the closest document from filed documents to a query document based on a full-text search comprising:
- string search processor means storing term sets for each category of document on a highest layer for searching the query document and for converting the query document in a query pattern whose elements are term match grades where the term match grade value is calculated and provided to a transitive fuzzy-logic inference processor, and for classifying the query pattern into a category for providing a positive bounded difference between a rule truth grade value and a term match grade value in said transitive fuzzy-logic inference processor;
- when the highest layer category has a positive data fidelity, the next layer category center patterns are compared with the query pattern and the similarity between the query pattern and the category center pattern is input to the transitive fuzzy-logic inference processor along with the rule truth grade provided from a rule truth grade memory with an address as the category center code which is used to determine a positive bounded difference between the rule-truth grade and the similarity;
- when the bounded difference is negative, a lower layer category center patterns or filed patterns in a lower layer category are not scanned by the string search processor;
- when the bounded difference is positive, the center category patterns are compared with the query pattern until comparison with filed patterns in the lowest layer category is completed; and
- ranker FIFO means for ranking data fidelity based on minimum values of the bounded differences and corresponding filed pattern address codes and for storing the closest document addresses in order of data fidelity magnitude.
Dependent claims: 18, 19, 20
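As a rough illustration of the layered classification recited in claims 9 through 17, the sketch below derives a rule truth grade from the highest and second highest ranked values and then walks a category hierarchy, skipping any category whose difference is not positive. Several points are assumptions rather than anything the claims fix: the operand order of each difference, the use of a signed difference for pruning (the claims speak of positive and negative values), the similarity formula, the way a filed pattern's data fidelity combines the path minimum with its own similarity, and every name in the code (`Category`, `rule_truth_grade`, `retrieve`).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Pattern = List[float]


def similarity(a: Pattern, b: Pattern) -> float:
    # Assumed measure: 1 minus the mean absolute difference of the grades.
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rule_truth_grade(highest: List[float], second: List[float]) -> float:
    # Complement of the bounded difference between the MIN of the highest
    # ranked values and the MAX of the second highest ranked values.
    # The operand order is an assumption.
    margin = max(0.0, min(highest) - max(second))
    return 1.0 - margin


@dataclass
class Category:
    code: str                      # hypothetical category center code
    center: Pattern                # category center pattern
    rule_truth: float              # rule truth grade stored for this category
    children: List["Category"] = field(default_factory=list)
    filed: List[Tuple[str, Pattern]] = field(default_factory=list)


def _descend(node: Category, query: Pattern, inherited: float,
             hits: List[Tuple[str, float]]) -> None:
    # Signed difference between similarity and rule truth grade (assumed order).
    diff = similarity(query, node.center) - node.rule_truth
    if diff <= 0.0:
        return  # not positive: lower layers of this category are not scanned
    fidelity = min(inherited, diff)  # MIN selector across layers
    for child in node.children:
        _descend(child, query, fidelity, hits)
    for address, pattern in node.filed:
        # Assumed: a filed pattern's fidelity is the minimum of the path
        # fidelity and its own similarity to the query.
        hits.append((address, min(fidelity, similarity(query, pattern))))


def retrieve(roots: List[Category], query: Pattern) -> List[Tuple[str, float]]:
    """Return (address, data fidelity) pairs in descending order of fidelity."""
    hits: List[Tuple[str, float]] = []
    for root in roots:
        _descend(root, query, 1.0, hits)
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Under these assumptions, a well-separated category (large gap between its highest and second highest values) gets a lower rule truth grade, so more queries pass its threshold, while poorly separated categories are pruned more aggressively.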
Specification