Document comparison using multiple similarity measures
Abstract
Disclosed herein is a method for comparing documents. The method includes the steps of: determining a plurality of similarity measures; and determining an overall similarity measure for the plurality of documents, based on the plurality of similarity measures. In one embodiment, the similarity measures are chosen from the group of similarity measures consisting of semantic and reference similarity measures. When comparing documents from the chemical, biochemical or pharmaceutical domains, the determination of the similarity utilizes a determination of structural similarity of the chemical formulas described in the plurality of documents.
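The abstract does not say how the individual measures are combined into the overall measure. As an illustration only, the sketch below assumes a normalized weighted average. The function `overall_similarity`, the weights, and the two toy measures (`word_overlap`, `length_ratio`) are hypothetical stand-ins and are not taken from the patent.

```python
from typing import Callable, Dict

# A measure takes two documents (plain strings here) and returns a score in [0, 1].
Measure = Callable[[str, str], float]


def overall_similarity(doc_a: str, doc_b: str,
                       measures: Dict[str, Measure],
                       weights: Dict[str, float]) -> float:
    """Combine several similarity measures into one score.

    Assumption: a normalized weighted average. The source only states that
    the overall measure is "based on" the individual measures.
    """
    total_weight = sum(weights[name] for name in measures)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * measure(doc_a, doc_b)
                   for name, measure in measures.items())
    return weighted / total_weight


def word_overlap(doc_a: str, doc_b: str) -> float:
    """Toy measure: Jaccard overlap of the lowercase word sets."""
    a, b = set(doc_a.lower().split()), set(doc_b.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def length_ratio(doc_a: str, doc_b: str) -> float:
    """Toy measure: ratio of the shorter text length to the longer one."""
    longest = max(len(doc_a), len(doc_b))
    return min(len(doc_a), len(doc_b)) / longest if longest else 1.0


if __name__ == "__main__":
    measures = {"overlap": word_overlap, "length": length_ratio}
    weights = {"overlap": 0.5, "length": 0.5}
    print(overall_similarity("a b c", "a b d", measures, weights))
```

Keeping each measure behind the same two-argument signature means a further measure, such as a structural comparison of chemical formulas, could be added to the dictionary without changing the combining function.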
19 Claims
1. A method of comparing a plurality of documents, said method comprising:
determining a plurality of similarity measures for said plurality of documents; and
determining an overall similarity measure for said plurality of documents, based on said plurality of similarity measures.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer program product having a computer readable medium having a computer program recorded therein for comparing documents, said computer program product comprising a method comprising:
determining a plurality of similarity measures for said plurality of documents; and
determining an overall similarity measure for said plurality of documents, based on said plurality of similarity measures.
13. A computer program product having a computer readable medium having a computer program recorded therein for comparing documents, said computer program product comprising a method comprising:
determining a reference similarity measure, based on references contained in said plurality of documents;
determining a semantic similarity measure, based on the similarity of terms contained in said plurality of documents; and
determining a similarity measure for said plurality of documents, based on said reference similarity measure and said semantic similarity measure.
Dependent claims: 15, 16, 17, 18, 19
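The three steps of claim 13 can be sketched as follows. The claim text here does not name specific formulas, so this sketch assumes Jaccard overlap of cited references for the reference measure, cosine similarity of term counts for the semantic measure, and a linear blend for the final score. The `Document` class, the function names, and the `reference_weight` parameter are all hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Document:
    text: str
    references: Set[str] = field(default_factory=set)  # identifiers of cited works


def reference_similarity(a: Document, b: Document) -> float:
    """Assumed measure: Jaccard overlap of the two reference sets."""
    union = a.references | b.references
    if not union:
        return 0.0
    return len(a.references & b.references) / len(union)


def semantic_similarity(a: Document, b: Document) -> float:
    """Assumed measure: cosine similarity of whitespace-tokenized term counts."""
    ta = Counter(a.text.lower().split())
    tb = Counter(b.text.lower().split())
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    return dot / norm if norm else 0.0


def combined_similarity(a: Document, b: Document,
                        reference_weight: float = 0.5) -> float:
    """Assumed combination: linear blend of the two measures."""
    return (reference_weight * reference_similarity(a, b)
            + (1.0 - reference_weight) * semantic_similarity(a, b))


if __name__ == "__main__":
    d1 = Document("aspirin inhibits cox", {"R1", "R2"})
    d2 = Document("aspirin inhibits cox", {"R2", "R3"})
    print(combined_similarity(d1, d2))
```

With equal weighting, two documents with identical wording but only partly shared references score lower than an exact match. Separating the two signals in this way is what makes that distinction visible.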
14. A knowledge retrieval system, comprising:
a parser for extracting required information from presented materials;
an annotator for annotating terms from said parsed presented materials, by utilizing at least one of an ontology, a taxonomy, and a dictionary;
a chemical representation device for deriving information about said annotated terms, based on a connection table;
an integrator for collating said derived information for storage in a database; and
a retrieval system for retrieving information from said database, based on input search criteria.
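The five components of claim 14 can be read as a pipeline. The sketch below is a toy illustration of that data flow under stated assumptions: the dictionary, the connection tables (heavy atoms plus bonds as index pairs with a bond order), the in-memory database, and every function name are hypothetical and do not come from the patent.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical dictionary used by the annotator: term -> category.
DICTIONARY = {"ethanol": "chemical", "water": "chemical"}

# Hypothetical connection tables: heavy atoms and bonds as
# (atom_index, atom_index, bond_order); hydrogens are left implicit.
CONNECTION_TABLES = {
    "ethanol": {"atoms": ["C", "C", "O"], "bonds": [(0, 1, 1), (1, 2, 1)]},
    "water": {"atoms": ["O"], "bonds": []},
}


def parse(material: str) -> List[str]:
    """Parser: extract lowercase word tokens from the presented material."""
    return re.findall(r"[a-z]+", material.lower())


def annotate(tokens: List[str]) -> List[Tuple[str, str]]:
    """Annotator: keep tokens found in the dictionary, tagged with a category."""
    return [(t, DICTIONARY[t]) for t in tokens if t in DICTIONARY]


def derive(annotated: List[Tuple[str, str]]) -> List[dict]:
    """Chemical representation step: derive simple facts from connection tables."""
    records = []
    for term, category in annotated:
        table = CONNECTION_TABLES.get(term)
        if table is None:
            continue
        records.append({"term": term, "category": category,
                        "heavy_atoms": len(table["atoms"]),
                        "bonds": len(table["bonds"])})
    return records


def integrate(database: Dict[str, List[dict]], doc_id: str,
              records: List[dict]) -> None:
    """Integrator: collate derived records into the database, keyed by term."""
    for record in records:
        database.setdefault(record["term"], []).append({"doc": doc_id, **record})


def retrieve(database: Dict[str, List[dict]], term: str) -> List[str]:
    """Retrieval: return the ids of documents stored under a search term."""
    return [entry["doc"] for entry in database.get(term.lower(), [])]


if __name__ == "__main__":
    db: Dict[str, List[dict]] = {}
    integrate(db, "doc1", derive(annotate(parse("Ethanol dissolved in water."))))
    print(retrieve(db, "ethanol"))
```

Each stage consumes only the output of the previous one, so any single stage (for example, swapping the dictionary for an ontology or taxonomy lookup) could be replaced without touching the others.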
Specification