METHOD & APPARATUS FOR IDENTIFYING A SECONDARY CONCEPT IN A COLLECTION OF DOCUMENTS
First Claim
1. A method for identifying at least one instance of a secondary concept among a plurality of documents comprising:
creating a primary concept space from primary concept information identified in the plurality of documents;
decomposing the information contained in the primary concept space to create a secondary concept space that includes one or more secondary concepts, each of which secondary concepts is represented in the secondary concept space as a separate vector value;
creating a query and translating the query into the secondary concept space where it is represented as a query vector value;
comparing the query vector value to each of the secondary concept vector values included in the secondary concept space; and
displaying at least one secondary concept that is within a specified distance of the query vector value.
Abstract
A methodology is disclosed for identifying secondary concepts that are included in one or more documents in a collection of documents. Training information is manually created from a subset of the collection and used by a primary concept identification function to process the textual information contained in the documents of the collection and identify the primary concepts they include. Each primary concept is used as input to a secondary concept identification function, which identifies the secondary concepts included in that primary concept. A query is generated and used as input to both the primary and secondary concept identification functions, and the result of operating both functions on the query is compared to the identified secondary concepts. The distance between the query and each of the secondary concepts is determined, and those secondary concepts that are within a predetermined distance of the query are displayed.
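The final comparison step described above can be illustrated with a minimal sketch. It assumes the secondary concepts and the query have already been expressed as vectors of equal length, and it assumes cosine distance as the measure; the text here does not name a particular distance, so that choice, the function names, and the sample vectors are all hypothetical.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: treat a zero vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)

def concepts_within(query_vec, concept_vecs, max_distance):
    """Return (name, distance) pairs within max_distance, nearest first."""
    hits = []
    for name, vec in concept_vecs.items():
        d = cosine_distance(query_vec, vec)
        if d <= max_distance:
            hits.append((name, d))
    return sorted(hits, key=lambda pair: pair[1])

# Hypothetical secondary concept vectors (illustrative values only).
SECONDARY_CONCEPTS = {
    "concept_a": [1.0, 0.0],
    "concept_b": [0.0, 1.0],
    "concept_c": [1.0, 1.0],
}

if __name__ == "__main__":
    # Display every secondary concept within distance 0.5 of the query.
    for name, d in concepts_within([1.0, 0.0], SECONDARY_CONCEPTS, 0.5):
        print(f"{name}: {d:.3f}")
```

The threshold passed as `max_distance` plays the role of the predetermined distance; any other metric could be substituted for `cosine_distance` without changing the filtering logic.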
33 Citations
24 Claims
1. A method for identifying at least one instance of a secondary concept among a plurality of documents comprising:
creating a primary concept space from primary concept information identified in the plurality of documents;
decomposing the information contained in the primary concept space to create a secondary concept space that includes one or more secondary concepts, each of which secondary concepts is represented in the secondary concept space as a separate vector value;
creating a query and translating the query into the secondary concept space where it is represented as a query vector value;
comparing the query vector value to each of the secondary concept vector values included in the secondary concept space; and
displaying at least one secondary concept that is within a specified distance of the query vector value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A method for identifying at least one instance of a secondary concept in a plurality of documents comprising:
training a primary concept identification function to identify one or more significant terms associated with each of one or more primary concepts in a sub-group of the plurality of documents;
employing the trained primary concept identification function to detect the frequency of substantially all of the significant terms associated with each one of the one or more primary concepts in the plural documents;
defining a relationship between all of the one or more significant terms and at least one of the primary concepts and storing the contents of the defined relationship as a primary concept space;
processing the contents of the stored primary concept space using a secondary concept identification function to identify at least one secondary concept associated with at least one instance of a primary concept and calculating a vector value for it and storing the at least one vector value as a secondary concept vector value in a secondary concept space;
creating a query and translating the query into the secondary concept space and calculating a vector value for it and storing the vector value as a query vector value in the secondary concept space;
comparing the query vector value to each of the at least one secondary concept vector values; and
displaying at least one secondary concept that is within a select distance of the query vector value.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
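As a rough illustration of the frequency-detection and relationship-storing steps of claim 9, the sketch below counts occurrences of each significant term across the documents and stores one row of counts per primary concept. It is only one possible reading: the mapping of concepts to significant terms stands in for the output of training, whitespace tokenization is an assumption, and the function name and data layout are hypothetical.

```python
from collections import Counter

def build_primary_concept_space(documents, significant_terms):
    """Build a concept-by-term frequency table.

    documents: list of strings.
    significant_terms: dict mapping a primary concept name to its list of
        significant terms (hypothetical stand-in for the trained function).
    Returns (vocabulary, space), where space maps each concept to a list of
    counts aligned with vocabulary.
    """
    # Count every lowercase, whitespace-separated token across all documents.
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())

    # Column order: all significant terms, sorted for reproducibility.
    vocabulary = sorted({t for terms in significant_terms.values() for t in terms})

    # A term contributes to a concept's row only if it is one of that
    # concept's significant terms; other columns stay at zero.
    space = {
        concept: [counts[t] if t in terms else 0 for t in vocabulary]
        for concept, terms in significant_terms.items()
    }
    return vocabulary, space

if __name__ == "__main__":
    docs = ["The engine stalled and the brake failed", "Brake pads and engine oil"]
    terms = {"powertrain": ["engine", "oil"], "braking": ["brake", "pads"]}
    vocabulary, space = build_primary_concept_space(docs, terms)
    print(vocabulary)
    print(space)
```

Each row of the resulting table could then be handed to whatever secondary concept identification function is used; that decomposition step is not sketched here.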
17. Apparatus for identifying at least one instance of a secondary concept in a plurality of documents comprising:
a processor;
a user interface;
a display device; and
a storage device for storing a secondary concept identification module that operates to create a primary concept space from primary concept information identified in the plurality of documents, decompose the information contained in the primary concept space to create a secondary concept space that includes one or more secondary concepts, each of which secondary concept is represented in the secondary concept space as a separate vector value, create a query and translate the query into the secondary concept space where it is represented as a query vector value, compare the query vector value to each of the secondary concept vector values included in the secondary concept space, and display at least one secondary concept that is within a specified distance of the query vector value.
Dependent claims: 18, 19, 20, 21, 22, 23, 24.
Specification