Measuring topical coherence of keyword sets
Abstract
Methods and apparatus are described for measuring the topical coherence of a keyword set while simultaneously partitioning the set into contextually related clusters.
20 Claims
1. A computer-implemented method comprising:
- expanding by a processor a text string set including a plurality of text strings using search results generated in response to at least one search query including the plurality of text strings thereby resulting in an expanded text string set;
- identifying frequent itemsets in the expanded text string set;
- developing a vocabulary for the text string set including selected ones of the frequent itemsets;
- calculating a similarity measure for each pair of the selected frequent itemsets in the vocabulary; and
- generating a topical coherence measure for the text string set with reference to the similarity measures, the topical coherence measure representing topical similarity among the plurality of text strings in the text string set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
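The steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not fix how search results are obtained, which frequent itemsets are selected, which similarity measure is used, or how the pairwise similarities are combined, so every such choice below is an assumption made for illustration: `search_results` is a hypothetical mapping from each keyword to its result snippets, the vocabulary is the `top_n` itemsets by support, similarity is a Jaccard overlap, and coherence is the mean pairwise similarity. All function and parameter names are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations


def expand(keywords, search_results):
    """Pair each keyword with the token set of every result snippet returned for it."""
    return [
        (kw, frozenset(snippet.lower().split()))
        for kw in keywords
        for snippet in search_results.get(kw, [])
    ]


def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Brute-force support counting for itemsets of up to max_size tokens."""
    counts = Counter()
    for _, tokens in transactions:
        for size in range(1, max_size + 1):
            counts.update(frozenset(c) for c in combinations(sorted(tokens), size))
    return {s: n for s, n in counts.items() if n >= min_support}


def build_vocabulary(itemsets, top_n=20):
    """Assumed selection rule: keep the top_n itemsets by support."""
    return sorted(itemsets, key=lambda s: (-itemsets[s], sorted(s)))[:top_n]


def similarity(a, b, transactions):
    """Assumed measure: Jaccard overlap of the keywords whose results contain each itemset."""
    ka = {kw for kw, tokens in transactions if a <= tokens}
    kb = {kw for kw, tokens in transactions if b <= tokens}
    return len(ka & kb) / len(ka | kb) if ka | kb else 0.0


def topical_coherence(keywords, search_results, min_support=2, top_n=20):
    """Assumed aggregation: mean similarity over all pairs of vocabulary itemsets."""
    transactions = expand(keywords, search_results)
    vocab = build_vocabulary(frequent_itemsets(transactions, min_support), top_n)
    pairs = list(combinations(vocab, 2))
    if not pairs:
        return 0.0
    return sum(similarity(a, b, transactions) for a, b in pairs) / len(pairs)
```

Under these assumptions, a keyword set whose search results share vocabulary (for example "hybrid car" and "electric car") scores higher than one whose results share none (for example "hybrid car" and "sourdough bread"). The brute-force itemset counting is only practical for short snippets; a production system would more plausibly use an Apriori-style or FP-growth miner.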
12. At least one non-transitory computer-readable medium having computer program instructions stored therein, the computer program instructions being configured to enable at least one computing device to perform steps, comprising:
- expand a text string set including a plurality of text strings using search results generated in response to at least one search query including the plurality of text strings thereby resulting in an expanded text string set;
- identify frequent itemsets in the expanded text string set;
- develop a vocabulary for the text string set including selected ones of the frequent itemsets;
- calculate a similarity measure for each pair of the selected frequent itemsets in the vocabulary; and
- generate a topical coherence measure for the text string set with reference to the similarity measures, the topical coherence measure representing topical similarity among the plurality of text strings.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A computer-implemented method comprising:
- identifying a plurality of sponsored search advertisements in response to a search query from a user, each of the sponsored search advertisements having a keyword set associated therewith;
- ranking by a processor each of the plurality of sponsored search advertisements with reference to a topical coherence measure for the associated keyword set, the topical coherence measure representing a topical similarity among keywords in the keyword set; and
- transmitting the sponsored search advertisements for presentation to the user in accordance with the ranking of the plurality of sponsored search advertisements with reference to the topical coherence measure.
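The identify, rank, and transmit flow of claim 20 might be sketched as follows. This is an illustration only: the claim says ads are ranked "with reference to" the coherence measure and does not specify the matching rule or any other ranking signal, so the token-overlap matching, the ranking by coherence alone, and all names here (`SponsoredAd`, `identify_ads`, `rank_ads`, `ads_for_presentation`) are assumptions. The coherence score for each ad's keyword set is taken as precomputed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SponsoredAd:
    ad_id: str
    keywords: tuple
    coherence: float  # assumed precomputed topical coherence of `keywords`


def identify_ads(query, inventory):
    """Hypothetical matching rule: keep ads with a keyword sharing a token with the query."""
    query_tokens = set(query.lower().split())
    return [
        ad for ad in inventory
        if any(query_tokens & set(kw.lower().split()) for kw in ad.keywords)
    ]


def rank_ads(candidates):
    """Order candidates by coherence, highest first; ties keep their input order."""
    return sorted(candidates, key=lambda ad: ad.coherence, reverse=True)


def ads_for_presentation(query, inventory):
    """Return ad ids in the order they would be sent for presentation to the user."""
    return [ad.ad_id for ad in rank_ads(identify_ads(query, inventory))]
```

In practice the coherence score would more likely be one input among several to the ranking function; the sketch isolates it to show where it enters the flow.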
Specification