Latent semantic analysis for application in a question answer system
First Claim
1. A system for estimating similarity between concepts comprising:
one or more content sources providing content;
a programmed processor device for coupling to said content sources and configured to:
receive a set of concepts related to a corpus of text documents in a content source;
create a representative graph structure having graph nodes each representing a latent semantic analysis (LSA) vector associated with a concept, and a node having one or more graph edges, each graph edge representing a strength of a relation between concepts based on an ontology; and
derive, for a concept, a new or modified vector represented by a node in the graph by propagating the LSA vectors against said graph structure, said new or modified vector representing a modified estimated similarity between concepts.
Abstract
A system and method that improve similarity measures between concepts based on Latent Semantic Analysis by taking into account graph structure derived from knowledge bases, using a vector propagation algorithm in a context domain such as the medical domain. Concepts contained in a corpus of documents are expressed in a graph in which each node is a concept and the edges between nodes express relations between concepts, weighted by the number of semantic relations determined from the corpus. A vector of neighbors is created and assigned to each concept, thereby providing an improved similarity measure between documents, i.e., corpus and query against corpus.
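The propagation idea in the abstract can be illustrated with a minimal Python sketch. The text here does not give the update rule, so everything specific below is an assumption made for illustration: the function names `cosine` and `propagate`, the blend factor `alpha`, the weighted-average update, and the toy vectors and edge weights. It is not the patented algorithm, only one plausible way to mix each concept's LSA vector with those of its graph neighbors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def propagate(vectors, edges, alpha=0.25, iterations=1):
    """Blend each concept vector with the weighted mean of its neighbors.

    vectors: dict mapping concept -> list of floats (stand-ins for LSA vectors)
    edges:   dict mapping (concept_a, concept_b) -> relation strength,
             treated as undirected
    alpha:   hypothetical blend factor; 0 keeps the original vectors
    """
    neighbors = {concept: [] for concept in vectors}
    for (a, b), weight in edges.items():
        neighbors[a].append((b, weight))
        neighbors[b].append((a, weight))

    current = {concept: list(vec) for concept, vec in vectors.items()}
    for _ in range(iterations):
        updated = {}
        for concept, vec in current.items():
            total = sum(weight for _, weight in neighbors[concept])
            if total == 0:
                # Isolated concepts keep their vector unchanged.
                updated[concept] = list(vec)
                continue
            mean = [0.0] * len(vec)
            for other, weight in neighbors[concept]:
                for i, value in enumerate(current[other]):
                    mean[i] += (weight / total) * value
            updated[concept] = [
                (1 - alpha) * vec[i] + alpha * mean[i] for i in range(len(vec))
            ]
        current = updated
    return current


# Toy data (invented): two related concepts with orthogonal vectors,
# plus one concept that has no relations in the graph.
lsa = {"aspirin": [1.0, 0.0], "ibuprofen": [0.0, 1.0], "fracture": [1.0, 0.0]}
relations = {("aspirin", "ibuprofen"): 2}

before = cosine(lsa["aspirin"], lsa["ibuprofen"])
modified = propagate(lsa, relations)
after = cosine(modified["aspirin"], modified["ibuprofen"])
```

In this toy setup the two linked concepts start with zero cosine similarity and end with a positive one, while the unlinked concept keeps its original vector. A real system would take the vectors from an LSA decomposition of the corpus and the edge weights from ontology relations.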
30 Citations
15 Claims
1. A system for estimating similarity between concepts comprising:
one or more content sources providing content;
a programmed processor device for coupling to said content sources and configured to:
receive a set of concepts related to a corpus of text documents in a content source;
create a representative graph structure having graph nodes each representing a latent semantic analysis (LSA) vector associated with a concept, and a node having one or more graph edges, each graph edge representing a strength of a relation between concepts based on an ontology; and
derive, for a concept, a new or modified vector represented by a node in the graph by propagating the LSA vectors against said graph structure, said new or modified vector representing a modified estimated similarity between concepts.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer program product for estimating similarity between concepts, the computer program product comprising a tangible, non-transitory storage medium readable by a processing circuit and storing instructions run by the processing circuit for performing a method, the method comprising:
receiving a set of concepts related to a corpus of text documents;
creating a representative graph structure having graph nodes each representing a latent semantic analysis (LSA) vector associated with a concept, and a node having one or more graph edges, each graph edge representing a strength of a relation between concepts based on an ontology; and
deriving, for a concept, a new or modified vector represented by a node in the graph by propagating the LSA vectors against said graph structure, said new or modified vector representing a modified estimated similarity between concepts,
wherein the storage medium readable by a processing circuit is not only a propagating signal.
Dependent claims: 11, 12, 13, 14, 15
Specification