INFERRING EMERGING AND EVOLVING TOPICS IN STREAMING TEXT
First Claim
1. A method of inferring topic evolution and emergence in a multitude of documents, comprising:
forming a group of matrices using text in the documents, said group of matrices including a first matrix X identifying a multitude of words in each of the documents, a second matrix W identifying a multitude of topics in each of the documents, and a third matrix H identifying a multitude of words for each of said multitude of topics; and
analyzing said group of matrices to identify a first group of said multitude of topics as evolving topics and a second group of said multitude of topics as emerging topics.
Abstract
A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify evolving topics and emerging topics. The matrices include a matrix X identifying a multitude of words in each of the documents, a matrix W identifying a multitude of topics in each of the documents, and a matrix H identifying a multitude of words for each of the multitude of topics. In an embodiment, two forms of temporal regularizers are used to help identify the evolving and emerging topics. In another embodiment, a two-stage approach involving detection and clustering is used to help identify the evolving and emerging topics.
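The roles of the three matrices can be illustrated with a small sketch. It assumes, which this section does not state, that X is approximated by the product W H with all entries non-negative, and it uses the standard multiplicative-update rule for non-negative matrix factorization as a stand-in. The function names (`build_term_matrix`, `nmf`, `frobenius_error`), the sample documents, and the choice of two topics are hypothetical. The temporal regularizers and the detection-and-clustering stage mentioned in the abstract are not modeled here.

```python
import random

def build_term_matrix(docs):
    """Return (X, vocab), where X[i][j] counts word vocab[j] in document i."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.lower().split():
            X[i][index[w]] += 1.0
    return X, vocab

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Factor X (docs x words) into W (docs x topics) and H (topics x words).

    Assumption: plain multiplicative updates minimizing ||X - W H||_F,
    used only to illustrate what W and H hold.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - W H."""
    R = matmul(W, H)
    return sum((x - r) ** 2
               for x_row, r_row in zip(X, R)
               for x, r in zip(x_row, r_row)) ** 0.5

# Hypothetical toy corpus, for illustration only.
docs = [
    "storm flood rain storm",
    "rain flood warning",
    "election vote poll",
    "vote poll election debate",
]
X, vocab = build_term_matrix(docs)
W, H = nmf(X, k=2)

# For each topic (row of H), list its three highest-weighted words.
for t, row in enumerate(H):
    top = sorted(range(len(vocab)), key=lambda j: -row[j])[:3]
    print("topic", t, [vocab[j] for j in top])
```

Under these assumptions, each row of W gives the topic weights of one document and each row of H gives the word weights of one topic, which matches the descriptions of W and H in the abstract.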
10 Claims
Specification