Natural language method and system for matching and ranking documents in terms of semantic relatedness
Abstract
A method and system are provided for matching a reference document with a plurality of corpus documents. Semantic content is derived from the reference document according to a hierarchical arrangement of semantic types. For each corpus document, semantic content is also derived from the corpus document according to the hierarchical arrangement of semantic types. A matching score is produced for each corpus document by determining a relatedness between the corpus document and the reference document. This relatedness is derived from the respective semantic contents of the two documents. The corpus documents may be ranked in accordance with the determined matching scores.
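The flow described in the abstract (derive semantic content, score each corpus document against the reference, rank by score) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy `HIERARCHY` and `LEXICON`, the choice to count a type together with all of its ancestors, and the use of cosine similarity as the relatedness measure.

```python
from collections import Counter
from math import sqrt

# Hypothetical hierarchy of semantic types: child -> parent (None marks the root).
HIERARCHY = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical lexicon mapping surface words to leaf semantic types.
LEXICON = {
    "dog": "dog", "puppy": "dog",
    "cat": "cat", "kitten": "cat",
    "car": "car", "sedan": "car",
}


def derive_semantic_content(text):
    """Count each known word's semantic type and every ancestor of that type."""
    content = Counter()
    for word in text.lower().split():
        node = LEXICON.get(word)  # unknown words contribute nothing
        while node is not None:
            content[node] += 1
            node = HIERARCHY[node]
    return content


def relatedness(a, b):
    """Cosine similarity of two type-count vectors (an assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_corpus(reference, corpus):
    """Return (score, document) pairs for the corpus, highest score first."""
    ref_content = derive_semantic_content(reference)
    scored = [(relatedness(ref_content, derive_semantic_content(doc)), doc) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    corpus = ["a sedan parked", "the dog ran", "a kitten slept"]
    for score, doc in rank_corpus("the puppy barked", corpus):
        print(f"{score:.2f}  {doc}")
```

Because ancestors are counted, "puppy" and "kitten" share the types `animal` and `entity` even though their leaf types differ, so a document about a kitten scores higher against a puppy reference than one about a sedan. That is one simple way a hierarchy can contribute to a matching score; the patent's own scoring may differ.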
161 Citations
43 Claims
1. A method for matching a reference document with a plurality of corpus documents, the method comprising:
deriving semantic content of the reference document according to a hierarchical arrangement of semantic types; and
for each corpus document, deriving semantic content of the corpus document according to the hierarchical arrangement of semantic types; and
producing a matching score for the corpus document by determining a relatedness between the corpus document and the reference document from the derived semantic content of the corpus document and the derived semantic content of the reference document.
Dependent claims: 2-21.
22. A method for categorizing an uncategorized document within a categorization scheme, the method comprising:
deriving semantic content of the uncategorized document according to a hierarchical arrangement of semantic types;
performing a comparison of the semantic content of the uncategorized document with semantic content of documents previously categorized according to the categorization scheme; and
determining a category for the uncategorized document from the comparison.
Dependent claims: 23-30.
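The categorization steps of claim 22 can be sketched in the same spirit. This is only an illustration under stated assumptions: the toy `HIERARCHY` and `LEXICON`, the Jaccard overlap used for the comparison, and the nearest-match rule for choosing a category are all hypothetical and are not specified by the claim.

```python
# Hypothetical hierarchy of semantic types: child -> parent (None marks the root).
HIERARCHY = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}

# Hypothetical lexicon mapping surface words to leaf semantic types.
LEXICON = {
    "dog": "dog", "puppy": "dog",
    "cat": "cat", "kitten": "cat",
    "car": "car", "sedan": "car",
    "truck": "truck", "lorry": "truck",
}


def semantic_types(text):
    """Collect the semantic type of each known word plus all of its ancestors."""
    types = set()
    for word in text.lower().split():
        node = LEXICON.get(word)
        while node is not None:
            types.add(node)
            node = HIERARCHY[node]
    return types


def jaccard(a, b):
    """Overlap of two type sets (an assumed comparison measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def categorize(uncategorized, categorized):
    """Return the category of the most similar (text, category) pair, or None."""
    target = semantic_types(uncategorized)
    best_score, best_category = 0.0, None
    for text, category in categorized:
        score = jaccard(target, semantic_types(text))
        if score > best_score:
            best_score, best_category = score, category
    return best_category


if __name__ == "__main__":
    known = [("the cat and the dog", "pets"), ("a car and a truck", "transport")]
    print(categorize("my kitten sleeps", known))
```

Returning `None` when nothing overlaps is a design choice of this sketch; a real scheme might instead fall back to a default category or require a minimum score.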
31. A system for matching a reference document with a plurality of corpus documents, the system comprising:
a database configured for storing a hierarchical arrangement of semantic types; and
an engine in communication with the database configured to derive semantic content of the reference document and of each corpus document according to the hierarchical arrangement; and
produce a matching score between the reference document and each corpus document from the derived semantic content.
Dependent claims: 32-36.
37. A system for categorizing an uncategorized document within a categorization scheme, the system comprising:
a database configured for storing a categorization for each of a plurality of previously categorized documents and for storing a hierarchical arrangement of semantic types; and
an engine in communication with the database configured to derive semantic content of the uncategorized document and of each of the plurality of previously categorized documents according to the hierarchical arrangement; and
compare the semantic content of the uncategorized document with the semantic content of each of the plurality of previously categorized documents to determine a category for the uncategorized document.
Dependent claims: 38-43.
Specification