Analyzing concepts over time
First Claim
1. A method, in an information handling system comprising a processor and a memory, for analyzing concept vectors over time to detect changes in a corpus, the method comprising:
- generating, by the system, at least a first concept vector set V1, . . . , V′k+b derived from a concatenation of a first set of concept sequences and a second set of concept sequences over k concepts that are shared by the first and second sets of concept sequences and b concepts that are only in the second set of concept sequences and applied to the vector learning component, where the second set of concept sequences is effectively collected after collection of the first set of concept sequences;
- generating, by the system, at least a second concept vector set VL1, . . . , VLh derived from a third set of concept sequences identified in the concatenation of the first and second sets of concept sequences as being central to a specified technology area T over h concepts that are extracted from the corpus and applied to a vector learning component; and
- performing, by the system, a natural language processing (NLP) analysis of the first concept vector set and second concept vector set to detect one or more disruptive concepts in the second set of concept sequences by analyzing relationship strengths between concepts to identify market trends for answering questions submitted to the information handling system, wherein analyzing relationship strengths comprises (a) computing cosine distances between each of the first vector set V′k+1, . . . , V′k+b and each of the second vector set VL1, . . . , VLh, and (b) sorting vectors V′k+1, . . . , V′k+b to identify one or more disruptive concepts based on the computed cosine distances.
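Steps (a) and (b) of the claim can be illustrated with a minimal Python sketch. The claim does not say how the pairwise distances are combined into one score per concept, nor in which direction the vectors are sorted. The sketch assumes each new concept is scored by its smallest cosine distance to any technology-area vector and that concepts are sorted in ascending order of that score. The names `cosine_distance` and `rank_new_concepts` are hypothetical and do not come from the patent.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_new_concepts(new_vectors, area_vectors):
    """Rank new concepts by closeness to a technology area.

    new_vectors: dict mapping concept name -> vector (stands in for V'k+1 .. V'k+b).
    area_vectors: list of vectors (stands in for VL1 .. VLh).
    Assumption: a concept's score is its minimum distance to any area vector.
    """
    scored = []
    for name, vec in new_vectors.items():
        # Step (a): distance from this new concept to every area vector.
        distances = [cosine_distance(vec, ref) for ref in area_vectors]
        scored.append((name, min(distances)))
    # Step (b): sort so the concepts closest to the area come first.
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    new_vectors = {
        "concept_a": [1.0, 0.0],
        "concept_b": [0.0, 1.0],
        "concept_c": [1.0, 1.0],
    }
    area_vectors = [[1.0, 0.1], [0.9, 0.0]]
    for name, score in rank_new_concepts(new_vectors, area_vectors):
        print(f"{name}: {score:.3f}")
```

Whether a small or a large distance marks a concept as disruptive is a design decision the claim leaves open; reversing the sort order is a one-line change.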
Abstract
A method and apparatus are provided for automatically generating and processing first and second concept vector sets extracted, respectively, from a first set of concept sequences and from a second, temporally separated set of concept sequences. A natural language processing (NLP) analysis of the first and second concept vector sets detects changes in the corpus over time by identifying changes for one or more concepts included in the first and/or second set of concept sequences.
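As a rough illustration of the temporal split described here, the following Python sketch concatenates an earlier and a later set of concept sequences and separates the concepts that appear in both (the k shared concepts of the claims) from those that appear only in the later set (the b new concepts). The function name `split_concepts` and the sample concept names are hypothetical; the source does not describe how sequences are stored, so plain lists of strings are assumed.

```python
def split_concepts(first_sequences, second_sequences):
    """Concatenate two sets of concept sequences and classify their concepts.

    Each argument is a list of sequences; each sequence is a list of
    concept names. Returns (shared, only_second, combined).
    """
    first_seen = {concept for seq in first_sequences for concept in seq}
    second_seen = {concept for seq in second_sequences for concept in seq}

    # Concepts present in both collection periods (the "k" concepts).
    shared = sorted(first_seen & second_seen)
    # Concepts that first appear in the later period (the "b" concepts).
    only_second = sorted(second_seen - first_seen)
    # Concatenation of the earlier sequences followed by the later ones.
    combined = list(first_sequences) + list(second_sequences)
    return shared, only_second, combined


if __name__ == "__main__":
    earlier = [["battery", "anode"], ["battery", "electrolyte"]]
    later = [["battery", "solid_state"], ["anode", "solid_state", "sulfide"]]
    shared, only_second, combined = split_concepts(earlier, later)
    print("shared:", shared)
    print("new in later set:", only_second)
    print("sequences in concatenation:", len(combined))
```

Concepts that occur only in the earlier set, such as `electrolyte` in the sample data, fall into neither group in this sketch.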
20 Claims
12. An information handling system comprising:
- one or more processors;
- a memory coupled to at least one of the processors;
- a set of instructions stored in the memory and executed by at least one of the processors to analyze concept vectors over time to detect changes in a corpus, wherein the set of instructions are executable to perform actions of:
- generating, by the system, at least a first concept vector set V1, . . . , V′k+b derived from a concatenation of a first set of concept sequences and a second set of concept sequences over k concepts that are shared by the first and second sets of concept sequences and b concepts that are only in the second set of concept sequences and applied to the vector learning component, where the second set of concept sequences is effectively collected after collection of the first set of concept sequences;
- generating, by the system, at least a second concept vector set VL1, . . . , VLh derived from a third set of concept sequences identified in the concatenation of the first and second sets of concept sequences as being central to a specified technology area T over h concepts that are extracted from the corpus and applied to a vector learning component; and
- performing, by the system, a natural language processing (NLP) analysis of the first concept vector set and second concept vector set to detect one or more disruptive concepts in the second set of concept sequences by analyzing relationship strengths between concepts to identify market trends for answering questions submitted to the information handling system, wherein analyzing relationship strengths comprises (a) computing cosine distances between each of the first vector set V′k+1, . . . , V′k+b and each of the second vector set VL1, . . . , VLh, and (b) sorting vectors V′k+1, . . . , V′k+b to identify one or more disruptive concepts based on the computed cosine distances.
19. A computer program product stored in a non-transitory computer readable storage medium, comprising computer instructions that, when executed by an information handling system, causes the system to analyze concept vectors over time to detect changes in a corpus by performing actions comprising:
- generating, by the system, at least a first concept vector set V1, . . . , V′k+b derived from a concatenation of a first set of concept sequences and a second set of concept sequences over k concepts that are shared by the first and second sets of concept sequences and b concepts that are only in the second set of concept sequences and applied to the vector learning component, where the second set of concept sequences is effectively collected after collection of the first set of concept sequences;
- generating, by the system, at least a second concept vector set VL1, . . . , VLh derived from a third set of concept sequences identified in the concatenation of the first and second sets of concept sequences as being central to a specified technology area T over h concepts that are extracted from the corpus and applied to a vector learning component; and
- performing, by the system, a natural language processing (NLP) analysis of the first concept vector set and second concept vector set to detect one or more disruptive concepts in the second set of concept sequences by analyzing relationship strengths between concepts to identify market trends for answering questions submitted to the information handling system, wherein analyzing relationship strengths comprises (a) computing cosine distances between each of the first vector set V′k+1, . . . , V′k+b and each of the second vector set VL1, . . . , VLh, and (b) sorting vectors V′k+1, . . . , V′k+b to identify one or more disruptive concepts based on the computed cosine distances.
Specification