Real time detection of topical changes and topic identification via likelihood based methods
Abstract
A method is disclosed for detecting topical changes and topic identification in texts in real time using likelihood ratio based methods. In accordance with the method, topic identification is achieved by evaluating text probabilities under each topic, and then selecting a new topic when one of those probabilities becomes significantly larger than the others. The method is usable to improve real time machine translation.
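The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not say what form the topic models take, so this example assumes smoothed unigram models and treats "significantly larger" as a log-likelihood gap between the best and second-best topic. The function names, the smoothing constant, and the threshold value are all hypothetical.

```python
import math
from collections import Counter

def train_unigram(docs, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (assumed model form)."""
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def log_likelihood(words, model, floor=1e-6):
    """Log probability of the words under one topic; unseen words get a small floor."""
    return sum(math.log(model.get(w, floor)) for w in words)

def identify_topic(words, battery, log_threshold):
    """Return the best topic only if it beats the runner-up by log_threshold, else None.

    Requires a battery of at least two topics.
    """
    scores = sorted(
        ((log_likelihood(words, m), name) for name, m in battery.items()),
        reverse=True,
    )
    (best_ll, best), (second_ll, _) = scores[0], scores[1]
    return best if best_ll - second_ll >= log_threshold else None

# Toy training data, purely illustrative.
training = {
    "sports": ["goal team match score team"],
    "finance": ["stock market price bank stock"],
}
vocab = {w for docs in training.values() for d in docs for w in d.split()}
battery = {name: train_unigram(docs, vocab) for name, docs in training.items()}
```

With this toy battery, a single word such as "team" does not separate the topics enough to clear a log threshold of 2.0, while "team goal match" does. Working in log space turns the likelihood ratio into a difference and avoids numeric underflow on longer texts.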
24 Claims
1. A method for real time identification of topics in text, comprising the steps of:
- forming a battery of topics from training data;
- detecting topic changes in said text, using said battery and a first threshold ratio;
- identifying topics in said text, using said battery and a second threshold ratio;
wherein said detecting and identifying steps are applied to said text in real time, wherein said first and second threshold ratios compare a metric having a likelihood measure, and wherein a topic can be identified for a current segment of said text prior to detecting a topic change in said segment.
Dependent claims: 2-12
13. An apparatus for real time identification of topics in text, comprising:
- means for forming a battery of topics from training data;
- means for detecting topic changes in said text, using said battery and a first threshold ratio;
- means for identifying topics in said text, using said battery and a second threshold ratio;
wherein said detecting and identifying means are applied to said text in real time, wherein said first and second threshold ratios compare a metric having a likelihood measure, and wherein a topic can be identified for a current segment of said text prior to detecting a topic change in said segment.
Dependent claims: 14-24
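The interplay of the two threshold ratios in the independent claims can be sketched as a streaming loop. This is only an illustration under stated assumptions, not the patented procedure: the topic models are hand-written word-probability tables, change detection looks at a fixed window of recent words, and the names `stream_topics`, `detect_log_ratio`, `identify_log_ratio`, and `window` are hypothetical. The sketch labels a segment as soon as the evidence suffices, without waiting for the next change, which mirrors the final "wherein" clause.

```python
import math

def loglik(words, model, floor=1e-3):
    """Log probability of words under one topic model (unseen words get a floor)."""
    return sum(math.log(model.get(w, floor)) for w in words)

def stream_topics(words, battery, detect_log_ratio, identify_log_ratio, window=3):
    """Process words one at a time and return a list of (index, event, topic) tuples.

    battery must hold at least two topics. Both thresholds are log-likelihood gaps.
    """
    events = []
    segment = []      # words since the last detected change
    current = None    # topic label of the current segment, if identified
    for i, w in enumerate(words):
        segment.append(w)
        if current is not None:
            # Detection (first threshold): does some other topic explain the
            # recent window much better than the current topic does?
            recent = segment[-window:]
            cur_ll = loglik(recent, battery[current])
            alt_ll = max(loglik(recent, m) for n, m in battery.items() if n != current)
            if alt_ll - cur_ll >= detect_log_ratio:
                events.append((i, "change", current))
                segment = recent  # the new segment starts with the triggering window
                current = None
        if current is None:
            # Identification (second threshold): best topic versus runner-up
            # over the whole current segment.
            ranked = sorted((loglik(segment, m), n) for n, m in battery.items())
            (second_ll, _), (best_ll, best) = ranked[-2], ranked[-1]
            if best_ll - second_ll >= identify_log_ratio:
                current = best
                events.append((i, "topic", best))
    return events

# Hand-written toy battery, purely illustrative.
battery = {
    "sports": {"team": 0.4, "goal": 0.4, "stock": 0.1, "bank": 0.1},
    "finance": {"stock": 0.4, "bank": 0.4, "team": 0.1, "goal": 0.1},
}
stream = ["team", "goal", "team", "stock", "bank", "stock"]
events = stream_topics(stream, battery, detect_log_ratio=2.0, identify_log_ratio=2.0)
```

On this toy stream the loop labels the first segment "sports" after two words, reports a change once the window is dominated by finance words, and labels the new segment "finance" at the same position. Separate thresholds let detection and identification be tuned independently, for example a stricter ratio for declaring a change than for naming a topic.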
Specification