Extraction and grouping of feature words
Abstract
Various embodiments of systems and methods for extraction and grouping of feature words are described herein. Feature words are obtained from a first corpus of text bodies comprising a plurality of reviews. A second corpus is created using a combination of the obtained feature words, verbs and adjectives from the first corpus. The second corpus comprises filtered reviews and each of the filtered reviews pertains to a review. Topics are preliminarily assigned for words in the filtered reviews of the second corpus. For each of the feature words in the second corpus, a topic count is determined for every preliminarily assigned topic. After determining the topic count, one or more of the topics are finally assigned to the feature words based on a topic count value. At least one topic is presented as a group of the feature words for which the at least one topic is assigned based on the topic count value.
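The creation of the second corpus described above can be illustrated with a minimal sketch. It assumes the reviews have already been tokenized and tagged with parts of speech; the tag names (`VERB`, `ADJ`), the function names, and the sample reviews are hypothetical and do not come from the patent, which does not name a particular tagger here.

```python
# Hypothetical tag names; the source does not specify a tagger or tag set.
KEPT_TAGS = {"VERB", "ADJ"}


def filter_review(tagged_review, feature_words):
    """Keep the feature words, verbs and adjectives of one tagged review."""
    kept = []
    for word, tag in tagged_review:
        lowered = word.lower()
        if lowered in feature_words or tag in KEPT_TAGS:
            kept.append(lowered)
    return kept


def build_second_corpus(first_corpus, feature_words):
    """Produce one filtered review per original review, in the same order."""
    return [filter_review(review, feature_words) for review in first_corpus]


# Illustrative input: each review is a list of (word, part-of-speech) pairs.
first_corpus = [
    [("The", "DET"), ("battery", "NOUN"), ("drains", "VERB"), ("quickly", "ADV")],
    [("Great", "ADJ"), ("screen", "NOUN"), ("and", "CCONJ"), ("camera", "NOUN")],
]
feature_words = {"battery", "screen", "camera"}

second_corpus = build_second_corpus(first_corpus, feature_words)
print(second_corpus)
```

Because every filtered review is built from exactly one original review, the index of a filtered review identifies the review it pertains to.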
18 Claims
1. An article of manufacture including a non-transitory computer readable storage medium to tangibly store instructions, which when executed by a computer, cause the computer to:
obtain feature words from a first corpus of text bodies comprising a plurality of reviews;
create a second corpus using a combination of the obtained feature words, verbs and adjectives from the first corpus, wherein the second corpus comprises filtered reviews and each of the filtered reviews pertains to a review of the plurality of reviews;
preliminarily assign topics for words in the filtered reviews of the second corpus;
for the feature words in the second corpus, determine topic counts for preliminarily assigned topics;
after determining the topic counts, finally assign one or more of the topics to the feature words based on a topic count value, comprising:
assign the one or more topics that have a highest topic count value to the feature words; and
assign the one or more topics that have a threshold topic count value relative to the highest topic count value; and
present on a user interface at least one topic as a group of the feature words for which the at least one topic is assigned based on the topic count value.
Dependent claims: 2, 3, 4, 5, 6
7. A computer-implemented method for grouping words, the method comprising:
obtaining feature words from a first corpus of text bodies comprising a plurality of reviews;
creating a second corpus using a combination of the obtained feature words, verbs and adjectives from the first corpus, wherein the second corpus comprises filtered reviews and each of the filtered reviews pertains to a review of the plurality of reviews;
preliminarily assigning topics for words in the filtered reviews of the second corpus;
for the feature words in the second corpus, determining topic counts for preliminarily assigned topics;
after determining the topic counts, finally assigning one or more of the topics to the feature words based on a topic count value, comprising:
assigning the one or more topics that have a highest topic count value to the feature words; and
assigning the one or more topics that have a threshold topic count value relative to the highest topic count value; and
presenting on a user interface at least one topic as a group of the feature words for which the at least one topic is assigned based on the topic count value.
Dependent claims: 8, 9, 10, 11, 12
13. A computer system for grouping words, comprising:
a computer memory to store program code; and
a processor to execute the program code to:
obtain feature words from a first corpus of text bodies comprising a plurality of reviews;
create a second corpus using a combination of the obtained feature words, verbs and adjectives from the first corpus, wherein the second corpus comprises filtered reviews and each of the filtered reviews pertains to a review of the plurality of reviews;
preliminarily assign topics for words in the filtered reviews of the second corpus;
for the feature words in the second corpus, determine topic counts for preliminarily assigned topics;
after determining the topic counts, finally assign one or more of the topics to the feature words based on a topic count value, comprising:
assign the one or more topics that have a highest topic count value to the feature words; and
assign the one or more topics that have a threshold topic count value relative to the highest topic value; and
present on a user interface at least one topic as a group of the feature words for which the at least one topic is assigned based on the topic count value.
Dependent claims: 14, 15, 16, 17, 18
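The topic-count and final-assignment steps recited in the independent claims can be sketched as follows. This is only an illustration under stated assumptions: the preliminary assignments are taken as given (word, topic) pairs, and "a threshold topic count value relative to the highest topic count value" is read as a count of at least `ratio` times the highest count. The `ratio` parameter, the function names, and the sample data are hypothetical and are not taken from the claims.

```python
from collections import Counter, defaultdict


def count_topics(assignments, feature_words):
    """Count, per feature word, how often each topic was preliminarily assigned."""
    counts = defaultdict(Counter)
    for word, topic in assignments:
        if word in feature_words:
            counts[word][topic] += 1
    return counts


def final_topics(topic_counts, ratio):
    """Return the topic(s) with the highest count plus those within the threshold.

    Assumption: the threshold is a fraction `ratio` of the highest count.
    """
    highest = max(topic_counts.values())
    return sorted(t for t, c in topic_counts.items() if c >= ratio * highest)


def group_by_topic(assignments, feature_words, ratio=0.5):
    """Group feature words under every topic finally assigned to them."""
    groups = defaultdict(list)
    for word, topic_counts in count_topics(assignments, feature_words).items():
        for topic in final_topics(topic_counts, ratio):
            groups[topic].append(word)
    return {topic: sorted(words) for topic, words in groups.items()}


# Illustrative preliminary assignments: one (word, topic) pair per occurrence.
assignments = (
    [("battery", 0)] * 3 + [("battery", 1)]
    + [("charger", 0)] * 2
    + [("screen", 1)] * 4 + [("screen", 2)] * 2
    + [("drains", 0)]  # a verb, not a feature word, so it is not counted
)
feature_words = {"battery", "charger", "screen"}

groups = group_by_topic(assignments, feature_words, ratio=0.5)
print(groups)
```

In this sample, "screen" lands in two groups because its second topic reaches half of its highest count, while "battery" keeps only its highest-count topic. The resulting mapping from topic to word group is what a user interface would then present.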
Specification