System, method and computer program product for identifying words within collection of text applicable to specific sentiment
First Claim
1. A method for analyzing sentiment, comprising:
- at a first computer:
dividing a collection of text into a plurality of sentiment segments;
tokenizing words or phrases in the plurality of sentiment segments;
performing a frequency analysis on tokenized words or phrases in each sentiment segment of the plurality of sentiment segments;
performing a scaling operation to size individual sentiment segments based on results from the frequency analysis;
for each tokenized word or phrase in each sentiment segment of the plurality of sentiment segments, subtracting a first number of the tokenized word or phrase in the sentiment segment from a second number of the tokenized word or phrase in at least one other sentiment segment of the plurality of sentiment segments, thereby producing, for each sentiment segment of the plurality of sentiment segments, a list of words or phrases that apply specifically to the sentiment segment; and
providing the list of words or phrases that apply specifically to the sentiment segment to a second computer over a network connection.
Abstract
A content intelligence module may implement a sentiment analysis method to identify words or phrases from user-generated content that are associated with a particular sentiment. The method may comprise grouping or splitting text into different sentiment segments, tokenizing words or phrases and/or removing stopwords across the sentiment segments, performing a frequency analysis to count the words or phrases in each sentiment segment, scaling the frequency results across the sentiment segments where necessary, and removing commonly used words from the sentiment segments. The words or phrases that are left in a specific sentiment segment are the most-used words for that sentiment segment. The word cloud module therefore allows for very quick generation of a summary around sentiment segments. A sentiment overview containing the summary can be presented to a user in connection with a selected product or service with which the user-generated content is associated.
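The steps summarized in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The stopword list, the function names (`tokenize`, `segment_counts`, `scale_counts`, `specific_words`), the sample reviews, the choice of relative frequency as the scaling operation, and the rule of keeping only words with a positive remainder after subtraction are all hypothetical. The step of providing the list to a second computer over a network connection is omitted.

```python
import re
from collections import Counter

# Hypothetical stopword list; the source does not specify one.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "was", "this", "to", "of", "but"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def segment_counts(reviews):
    """Group (sentiment, text) pairs by sentiment and count tokens per segment."""
    counts = {}
    for sentiment, text in reviews:
        counts.setdefault(sentiment, Counter()).update(tokenize(text))
    return counts


def scale_counts(counts):
    """Scale raw counts to relative frequencies so that segments of different
    sizes are comparable (an assumed choice of scaling operation)."""
    scaled = {}
    for sentiment, counter in counts.items():
        total = sum(counter.values())
        scaled[sentiment] = {w: n / total for w, n in counter.items()} if total else {}
    return scaled


def specific_words(scaled, top_n=5):
    """For each segment, subtract a word's scaled frequency in the other
    segments from its frequency in this segment; keep positive remainders."""
    result = {}
    for sentiment, freqs in scaled.items():
        scores = {}
        for word, freq in freqs.items():
            other = sum(f.get(word, 0.0) for s, f in scaled.items() if s != sentiment)
            if freq - other > 0:
                scores[word] = freq - other
        # Highest score first; ties broken alphabetically for stable output.
        result[sentiment] = sorted(scores, key=lambda w: (-scores[w], w))[:top_n]
    return result


# Hypothetical user-generated content, already labeled by sentiment.
reviews = [
    ("positive", "The battery is great and the screen is great"),
    ("positive", "Great screen, fast shipping"),
    ("negative", "The battery died and the screen cracked"),
    ("negative", "Slow shipping, battery died"),
]

words = specific_words(scale_counts(segment_counts(reviews)))
for sentiment, word_list in words.items():
    print(sentiment, word_list)
```

In this sketch a word such as "shipping", which is used equally often in both segments, cancels out during the subtraction and appears in neither list, which mirrors the abstract's step of removing commonly used words.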
20 Claims
1. A method for analyzing sentiment, comprising:
- at a first computer:
dividing a collection of text into a plurality of sentiment segments;
tokenizing words or phrases in the plurality of sentiment segments;
performing a frequency analysis on tokenized words or phrases in each sentiment segment of the plurality of sentiment segments;
performing a scaling operation to size individual sentiment segments based on results from the frequency analysis;
for each tokenized word or phrase in each sentiment segment of the plurality of sentiment segments, subtracting a first number of the tokenized word or phrase in the sentiment segment from a second number of the tokenized word or phrase in at least one other sentiment segment of the plurality of sentiment segments, thereby producing, for each sentiment segment of the plurality of sentiment segments, a list of words or phrases that apply specifically to the sentiment segment; and
providing the list of words or phrases that apply specifically to the sentiment segment to a second computer over a network connection.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer program product comprising at least one non-transitory computer readable medium storing instructions translatable by a first computer to:
divide a collection of text into a plurality of sentiment segments;
tokenize words or phrases in the plurality of sentiment segments;
perform a frequency analysis on tokenized words or phrases in each sentiment segment of the plurality of sentiment segments;
perform a scaling operation to size individual sentiment segments based on results from the frequency analysis;
for each tokenized word or phrase in each sentiment segment of the plurality of sentiment segments, subtract a first number of the tokenized word or phrase in the sentiment segment from a second number of the tokenized word or phrase in at least one other sentiment segment of the plurality of sentiment segments, thereby producing, for each sentiment segment of the plurality of sentiment segments, a list of words or phrases that apply specifically to the sentiment segment; and
provide the list of words or phrases that apply specifically to the sentiment segment to a second computer over a network connection.
Dependent claims: 9, 10, 11, 12, 13
14. A system, comprising:
- at least one processor;
- at least one non-transitory computer readable medium storing instructions translatable by the at least one processor to implement a word cloud module, the word cloud module being configured to:
divide a collection of text into a plurality of sentiment segments;
tokenize words or phrases in the plurality of sentiment segments;
perform a frequency analysis on tokenized words or phrases in each sentiment segment of the plurality of sentiment segments;
perform a scaling operation to size individual sentiment segments based on results from the frequency analysis;
for each tokenized word or phrase in each sentiment segment of the plurality of sentiment segments, subtract a first number of the tokenized word or phrase in the sentiment segment from a second number of the tokenized word or phrase in at least one other sentiment segment of the plurality of sentiment segments, thereby producing, for each sentiment segment of the plurality of sentiment segments, a list of words or phrases that apply specifically to the sentiment segment; and
provide the list of words or phrases that apply specifically to the sentiment segment to a second computer over a network connection.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification