System And Method For Providing Robust Topic Identification In Social Indexes
First Claim
1. A computer-implemented method for providing robust topic identification in social indexes, comprising:
maintaining electronically-stored articles and one or more indexes comprising topics that each relate to one or more of the articles;
selecting a random sampling and a selective sampling of the articles;
for each topic, identifying characteristic words comprised in the articles in each of the random sampling and the selective sampling;
determining frequencies of occurrence of the characteristic words in each of the random sampling and the selective sampling;
identifying a ratio of the frequencies of occurrence for the characteristic words comprised in the random sampling and the selective sampling; and
for each topic, building a coarse-grained topic model comprising the characteristic words comprised in the articles relating to the topic and scores assigned to those characteristic words.
Abstract
A computer-implemented method for providing robust topic identification in social indexes is described. Electronically-stored articles and one or more indexes are maintained. Each index includes topics that each relate to one or more of the articles. A random sampling and a selective sampling of the articles are both selected. For each topic, characteristic words included in the articles in each of the random sampling and the selective sampling are identified. Frequencies of occurrence of the characteristic words in each of the random sampling and the selective sampling are determined. A ratio of the frequencies of occurrence for the characteristic words included in the random sampling and the selective sampling is identified. Finally, for each topic, a coarse-grained topic model is built, which includes the characteristic words included in the articles relating to the topic and scores assigned to those characteristic words.
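The steps summarized above can be illustrated with a minimal Python sketch for a single topic. It is not the patented implementation. The function names (`tokenize`, `count_words`, `build_coarse_topic_model`), the add-one smoothing, the `top_k` cutoff, and the use of the raw frequency ratio as the score are all assumptions made for illustration; the claim states only that a ratio of frequencies is identified and that scores are assigned to the characteristic words. Here the selective sampling is taken to be the articles already associated with the topic.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_words(articles):
    """Return (word counts, total token count) for a sampling of articles."""
    counts = Counter()
    for article in articles:
        counts.update(tokenize(article))
    return counts, sum(counts.values())


def build_coarse_topic_model(topic_articles, all_articles,
                             sample_size=100, top_k=10, seed=0):
    """Score words by how much more frequent they are in the selective
    sampling (articles on the topic) than in a random sampling."""
    rng = random.Random(seed)
    random_sample = rng.sample(all_articles,
                               min(sample_size, len(all_articles)))

    sel_counts, sel_total = count_words(topic_articles)
    rnd_counts, rnd_total = count_words(random_sample)
    vocab_size = len(set(sel_counts) | set(rnd_counts))

    scores = {}
    for word, count in sel_counts.items():
        # Add-one smoothing is an assumption of this sketch; it avoids
        # division by zero for words absent from the random sampling.
        sel_freq = (count + 1) / (sel_total + vocab_size)
        rnd_freq = (rnd_counts[word] + 1) / (rnd_total + vocab_size)
        scores[word] = sel_freq / rnd_freq

    # Keep the top_k highest-scoring words as the coarse-grained model.
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    return dict(top)
```

Calling `build_coarse_topic_model` once per topic would yield one word-to-score mapping per topic. Under this scoring, words that are common everywhere (such as "the") receive a ratio below 1, while words concentrated in the topic's articles receive a ratio above 1.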
Specification