Latent semantic analysis for application in a question answer system
First Claim
1. A method for estimating similarity between concepts, the method comprising:
receiving a set of concepts related to a corpus of text documents;
creating a representative graph structure having graph nodes each representing a latent semantic analysis (LSA) vector associated with a concept, and a node having one or more graph edges, each graph edge representing a strength of a relation between concepts based on an ontology; and
deriving, for a concept, a new or modified vector represented by a node in the graph by propagating the LSA vectors against said graph structure, said new or modified vector representing a modified estimated similarity between concepts, wherein a programmed processor device is configured to perform said receiving, creating and deriving.
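The deriving step can be illustrated with a minimal Python sketch. The claim does not state the propagation rule, so the update used here (blend each concept's vector with the edge-weighted average of its neighbors' vectors) is an assumption, as are the names `propagate`, `cosine`, `alpha` and `steps` and the toy two-dimensional vectors standing in for LSA vectors.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def propagate(vectors, edges, alpha=0.5, steps=1):
    """Return modified vectors after propagating over the graph.

    vectors: dict mapping concept -> list of floats (stand-in for an LSA vector)
    edges:   dict mapping concept -> {neighbor: relation strength}; every
             neighbor must also be a key of `vectors`
    alpha:   assumed mixing weight given to the neighbor average (hypothetical)
    """
    current = {c: list(v) for c, v in vectors.items()}
    for _ in range(steps):
        updated = {}
        for concept, vec in current.items():
            neighbors = edges.get(concept, {})
            total = sum(neighbors.values())
            if total == 0:
                # Isolated node: its vector is left unchanged.
                updated[concept] = list(vec)
                continue
            # Edge-weighted average of the neighbors' current vectors.
            avg = [0.0] * len(vec)
            for neighbor, weight in neighbors.items():
                for i, x in enumerate(current[neighbor]):
                    avg[i] += weight * x / total
            updated[concept] = [(1 - alpha) * a + alpha * b for a, b in zip(vec, avg)]
        current = updated
    return current


# Toy data (hypothetical): two concepts with orthogonal vectors, linked by one edge.
lsa = {"aspirin": [1.0, 0.0], "ibuprofen": [0.0, 1.0]}
ontology_edges = {"aspirin": {"ibuprofen": 1.0}, "ibuprofen": {"aspirin": 1.0}}

before = cosine(lsa["aspirin"], lsa["ibuprofen"])
modified = propagate(lsa, ontology_edges, alpha=0.25, steps=1)
after = cosine(modified["aspirin"], modified["ibuprofen"])
```

In this toy case the two concepts start with zero similarity, and the ontology edge pulls their modified vectors toward each other, which raises the estimated similarity.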
Abstract
A system and method that improve the similarity measure obtained between concepts with Latent Semantic Analysis by taking into account graph structure derived from knowledge bases, using a vector propagation algorithm, in a context domain such as a medical domain. Concepts contained in a corpus of documents are expressed in a graph wherein each node is a concept and edges between nodes express relations between concepts, weighted by the number of semantic relations determined from the corpus. A vector of neighbors is created and assigned to each concept, thereby providing an improved similarity measure between documents, i.e., corpus and query against corpus.
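The graph construction described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the input format (subject, relation, object triples), the symmetric edge weights, the definition of the "vector of neighbors" as a row of edge weights over a fixed concept ordering, and the names `build_weighted_graph` and `neighbor_vector` are all hypothetical.

```python
from collections import Counter, defaultdict


def build_weighted_graph(relations):
    """Build an undirected graph whose edge weight is the number of
    semantic relations observed between two concepts.

    relations: iterable of (concept_a, relation_name, concept_b) triples
    """
    counts = Counter()
    for a, _relation, b in relations:
        if a != b:  # ignore self-relations
            counts[frozenset((a, b))] += 1
    graph = defaultdict(dict)
    for pair, n in counts.items():
        a, b = tuple(pair)
        graph[a][b] = n
        graph[b][a] = n
    return dict(graph)


def neighbor_vector(graph, concept, vocabulary):
    """Vector of edge weights from `concept` to every concept in `vocabulary`."""
    neighbors = graph.get(concept, {})
    return [float(neighbors.get(other, 0)) for other in vocabulary]


# Toy relations (hypothetical, medical-flavored to match the abstract's example domain).
relations = [
    ("aspirin", "treats", "fever"),
    ("aspirin", "treats", "headache"),
    ("aspirin", "may_treat", "fever"),
    ("ibuprofen", "treats", "fever"),
]

graph = build_weighted_graph(relations)
vocabulary = sorted(graph)  # fixed ordering of concepts
aspirin_vec = neighbor_vector(graph, "aspirin", vocabulary)
ibuprofen_vec = neighbor_vector(graph, "ibuprofen", vocabulary)
```

Two concepts that share strongly weighted neighbors end up with similar neighbor vectors, which is one way such a graph could feed a similarity measure.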
9 Claims
Specification