Computing the relevance of a document to concepts not specified in the document
First Claim
1. A computer program product for conceptual analysis of a document, the computer program product comprising:
- a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit to perform a method comprising:
accessing a concept graph that includes a plurality of nodes and edges, each node representing a concept and each edge representing a known relation between two concepts; and
computing a relevance of the document to concepts in the concept graph, the computing comprising:
receiving a priori information about the document including concepts extracted from the document, probabilities of each of the concepts extracted from the document being located in a pool of concepts extracted from the document, and confidence scores corresponding to each of the concepts extracted from the document, wherein the concepts extracted from the document include a subset of the concepts in the concept graph; and
combining the a priori information and the concept graph to generate a posteriori information that includes a reverse index that indicates a likelihood that the document is related to each of the concepts in the concept graph, the reverse index accessible to a computer system for determining a relevance of the document to a query received from an agent external to the computer system, the combining including calculating a relevance of the document to concepts in the graph not extracted from the document based on a degree of association in the concept graph between the concepts extracted from the document and the concepts in the graph not extracted from the document.
Abstract
According to an aspect, conceptual analysis of a document includes accessing a concept graph that includes a plurality of nodes and edges. Each node represents a concept and each edge represents a known relation between two concepts. Conceptual analysis of the document further includes computing a relevance of the document to concepts in the concept graph. The computing includes receiving a priori information about the document including concepts extracted from the document. The concepts extracted from the document include a subset of the concepts in the concept graph. The computing also includes combining the a priori information and the concept graph to generate a posteriori information that indicates a likelihood that the document is related to each of the concepts in the concept graph.
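The combining step described above can be illustrated with a minimal Python sketch. The text here does not give a formula, so everything in the sketch is an assumption made for illustration: the function name posterior_relevance, the seeding of each extracted concept with probability times confidence, the edge weights in the range (0, 1], the per-hop decay factor, and the hop limit. Concepts not extracted from the document receive a score only through their association with extracted concepts in the graph.

```python
def posterior_relevance(graph, priors, decay=0.5, max_hops=2):
    """Spread a priori scores over a concept graph (illustrative sketch only).

    graph:  dict mapping concept -> {neighbor: edge weight in (0, 1]}
    priors: dict mapping extracted concept -> (probability, confidence)
    Returns a dict mapping every reached concept -> relevance score.
    """
    # Assumption: the a priori score of an extracted concept is
    # probability * confidence.
    scores = {c: prob * conf for c, (prob, conf) in priors.items()}
    frontier = dict(scores)
    for _ in range(max_hops):
        reached = {}
        for concept, score in frontier.items():
            for neighbor, weight in graph.get(concept, {}).items():
                # Assumption: degree of association is the product of edge
                # weights along a path, damped by `decay` at every hop.
                candidate = score * weight * decay
                best_known = max(scores.get(neighbor, 0.0),
                                 reached.get(neighbor, 0.0))
                if candidate > best_known:
                    reached[neighbor] = candidate
        scores.update(reached)
        frontier = reached
    return scores


# Hypothetical three-concept graph; only "diabetes" was extracted.
graph = {
    "diabetes": {"insulin": 0.5},
    "insulin": {"diabetes": 0.5, "pancreas": 0.5},
    "pancreas": {"insulin": 0.5},
}
priors = {"diabetes": (0.5, 0.5)}
scores = posterior_relevance(graph, priors)
```

In this example "insulin" and "pancreas" were never extracted from the document, yet both receive a nonzero score that shrinks with graph distance from "diabetes". Keeping the best single path is one simple design choice; summing over paths or a random-walk formulation would be equally plausible readings.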
5 Claims
1. A computer program product for conceptual analysis of a document, the computer program product comprising:
- a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit to perform a method comprising:
accessing a concept graph that includes a plurality of nodes and edges, each node representing a concept and each edge representing a known relation between two concepts; and
computing a relevance of the document to concepts in the concept graph, the computing comprising:
receiving a priori information about the document including concepts extracted from the document, probabilities of each of the concepts extracted from the document being located in a pool of concepts extracted from the document, and confidence scores corresponding to each of the concepts extracted from the document, wherein the concepts extracted from the document include a subset of the concepts in the concept graph; and
combining the a priori information and the concept graph to generate a posteriori information that includes a reverse index that indicates a likelihood that the document is related to each of the concepts in the concept graph, the reverse index accessible to a computer system for determining a relevance of the document to a query received from an agent external to the computer system, the combining including calculating a relevance of the document to concepts in the graph not extracted from the document based on a degree of association in the concept graph between the concepts extracted from the document and the concepts in the graph not extracted from the document.
- Dependent claims: 2, 3, 4
5. A system for conceptual analysis of a document, the system comprising:
- a memory having computer readable computer instructions; and
- a processor for executing the computer readable instructions, the computer readable instructions including:
accessing a concept graph that includes a plurality of nodes and edges, each node representing a concept and each edge representing a known relation between two concepts; and
computing a relevance of the document to concepts in the concept graph, the computing comprising:
receiving a priori information about the document including concepts extracted from the document, probabilities of each of the concepts extracted from the document being located in a pool of concepts extracted from the document, and confidence scores corresponding to each of the concepts extracted from the document, wherein the concepts extracted from the document include a subset of the concepts in the concept graph; and
combining the a priori information and the concept graph to generate a posteriori information that includes a reverse index that indicates a likelihood that the document is related to each of the concepts in the concept graph, the reverse index accessible to a computer system for determining a relevance of the document to a query received from an agent external to the computer system, the combining including calculating a relevance of the document to concepts in the graph not extracted from the document based on a degree of association in the concept graph between the concepts extracted from the document and the concepts in the graph not extracted from the document.
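The reverse index recited in the claims maps each concept back to the documents likely to be related to it, so that a query can be answered by lookup. A minimal Python sketch of one way this could work follows. It rests on several assumptions not stated in the claims: the function names build_reverse_index and rank_documents are hypothetical, the query is assumed to have already been mapped to concepts in the graph, and per-concept scores are simply summed to rank documents.

```python
from collections import defaultdict


def build_reverse_index(doc_scores):
    """Invert per-document concept scores into concept -> postings.

    doc_scores: dict mapping doc_id -> {concept: a posteriori score}
    Returns a dict mapping concept -> [(doc_id, score), ...], best first.
    """
    index = defaultdict(list)
    for doc_id, scores in doc_scores.items():
        for concept, score in scores.items():
            index[concept].append((doc_id, score))
    for postings in index.values():
        postings.sort(key=lambda posting: posting[1], reverse=True)
    return dict(index)


def rank_documents(index, query_concepts):
    """Rank documents for a query already expressed as graph concepts."""
    totals = defaultdict(float)
    for concept in query_concepts:
        for doc_id, score in index.get(concept, []):
            # Assumption: relevance to a multi-concept query is the sum of
            # the per-concept scores.
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical a posteriori scores for two documents.
doc_scores = {
    "doc_a": {"diabetes": 0.25, "insulin": 0.0625},
    "doc_b": {"insulin": 0.5},
}
index = build_reverse_index(doc_scores)
ranking = rank_documents(index, ["insulin"])
```

Because the index stores scores for concepts that were inferred through the graph as well as those extracted directly, a document can be returned for a query concept that never appears in its text.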
Specification