Machine-based content analysis and user perception tracking of microcontent messages
First Claim
1. A method for microcontent natural language processing comprising:
receiving a plurality of microcontent messages from a social networking server, the plurality of microcontent messages including a microcontent message;
breaking up the microcontent message into one or more text tokens by using a tokenizer module that is configured to process micro-syntax and punctuation in the microcontent message;
performing a part-of-speech (POS) tagging process on the text tokens to identify a linguistic category for each of the text tokens, wherein the POS tagging process for a respective text token is performed using an error-driven transformation-based tagger and based on a definition and a context of the respective text token;
performing a topic extraction on the microcontent message to extract topic metadata for the microcontent message based on the identified linguistic category for each of the text tokens, wherein the extraction is performed without looking up a pre-specified topic in a dictionary of known entities;
associating a topic metadata to the microcontent message based on the extracted topic;
generating sentiment metadata for the microcontent message by performing a sentiment analysis on the one or more text tokens to classify a sentiment of the microcontent message, wherein the sentiment analysis is based on a Naïve Bayesian classifier;
associating the sentiment metadata with the microcontent message;
analyzing co-occurrence of all available metadata in the plurality of microcontent messages;
producing a list that ranks the plurality of microcontent messages based on all available topic metadata and sentiment metadata associated with the plurality of microcontent messages; and
compiling a trend database that reveals how perception of users of the social networking server on a given topic changes by tracking how the list changes over time.
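The tokenizing and sentiment steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the regular expression, the class name `NaiveBayesSentiment`, the add-one smoothing, and the toy training messages are all assumptions chosen for illustration. The tokenizer keeps micro-syntax such as URLs, @mentions, and #hashtags as single tokens and emits punctuation marks separately; the classifier is a plain multinomial naive Bayes model over those tokens.

```python
import math
import re
from collections import Counter

# Hypothetical pattern: URLs, @mentions/#hashtags, words, then single punctuation marks.
TOKEN_RE = re.compile(r"https?://\S+|[@#]\w+|\w+(?:'\w+)?|[^\w\s]")


def tokenize(message):
    """Split a short message into lowercase tokens, keeping micro-syntax intact."""
    return [tok.lower() for tok in TOKEN_RE.findall(message)]


class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-one smoothing (an assumed variant)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()

    def train(self, labeled_messages):
        for text, label in labeled_messages:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example.
TRAINING = [
    ("love this phone", "positive"),
    ("great battery, love it", "positive"),
    ("hate the screen", "negative"),
    ("terrible battery, hate it", "negative"),
]

model = NaiveBayesSentiment()
model.train(TRAINING)
sentiment_metadata = {"sentiment": model.classify("love the battery")}
```

The returned label would then be attached to the message as its sentiment metadata. A production classifier would need a far larger labeled corpus than the four messages used here.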
Abstract
A system and a method for microcontent natural language processing are presented. The method comprises the steps of receiving a plurality of microcontent messages from a social networking server, tokenizing a microcontent message into one or more text tokens, performing a topic extraction on the microcontent message to extract topic metadata, generating sentiment metadata for the microcontent message, analyzing co-occurrence of all available metadata in the plurality of microcontent messages, producing a list that ranks the plurality of microcontent messages based on all available topic metadata, and compiling a trend database that reveals how the perception of users of the social networking server on a given topic changes by tracking how the list changes over time.
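The last three steps of the abstract (co-occurrence analysis, ranking, and trend tracking) can be sketched as follows. The message dictionaries, the function names, and in particular the scoring rule are assumptions made for this example; the source does not specify how the ranked list is scored or how the trend database is stored.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(messages):
    """Count how often each pair of metadata values appears on the same message."""
    pairs = Counter()
    for msg in messages:
        tags = sorted(set(msg["topics"]) | {msg["sentiment"]})
        pairs.update(combinations(tags, 2))
    return pairs


def rank_messages(messages):
    """Order messages by the popularity of their topics (hypothetical scoring rule)."""
    topic_freq = Counter(t for m in messages for t in set(m["topics"]))

    def score(msg):
        return sum(topic_freq[t] for t in set(msg["topics"]))

    return sorted(messages, key=score, reverse=True)


def topic_sentiment(messages):
    """Net sentiment per topic: positive messages minus negative messages."""
    net = Counter()
    for msg in messages:
        delta = {"positive": 1, "negative": -1}.get(msg["sentiment"], 0)
        for topic in set(msg["topics"]):
            net[topic] += delta
    return dict(net)


def build_trend(windows):
    """Map each topic to its net sentiment in successive time windows."""
    trend = defaultdict(list)
    for label, messages in windows:
        for topic, net in topic_sentiment(messages).items():
            trend[topic].append((label, net))
    return dict(trend)


# Toy data, invented for this example.
day1 = [
    {"id": 1, "topics": ["phone"], "sentiment": "positive"},
    {"id": 2, "topics": ["phone", "battery"], "sentiment": "positive"},
    {"id": 3, "topics": ["battery"], "sentiment": "negative"},
]
day2 = [
    {"id": 4, "topics": ["phone"], "sentiment": "negative"},
    {"id": 5, "topics": ["phone"], "sentiment": "negative"},
]

pairs = cooccurrence(day1)
ranked = rank_messages(day1)
trend = build_trend([("day1", day1), ("day2", day2)])
```

In this toy run the entry for "phone" moves from a net positive count on the first day to a net negative count on the second, which is the kind of shift in perception a trend database is meant to expose.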
19 Claims
1. A method for microcontent natural language processing comprising:
receiving a plurality of microcontent messages from a social networking server, the plurality of microcontent messages including a microcontent message;
breaking up the microcontent message into one or more text tokens by using a tokenizer module that is configured to process micro-syntax and punctuation in the microcontent message;
performing a part-of-speech (POS) tagging process on the text tokens to identify a linguistic category for each of the text tokens, wherein the POS tagging process for a respective text token is performed using an error-driven transformation-based tagger and based on a definition and a context of the respective text token;
performing a topic extraction on the microcontent message to extract topic metadata for the microcontent message based on the identified linguistic category for each of the text tokens, wherein the extraction is performed without looking up a pre-specified topic in a dictionary of known entities;
associating a topic metadata to the microcontent message based on the extracted topic;
generating sentiment metadata for the microcontent message by performing a sentiment analysis on the one or more text tokens to classify a sentiment of the microcontent message, wherein the sentiment analysis is based on a Naïve Bayesian classifier;
associating the sentiment metadata with the microcontent message;
analyzing co-occurrence of all available metadata in the plurality of microcontent messages;
producing a list that ranks the plurality of microcontent messages based on all available topic metadata and sentiment metadata associated with the plurality of microcontent messages; and
compiling a trend database that reveals how perception of users of the social networking server on a given topic changes by tracking how the list changes over time.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
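The POS tagging and topic extraction steps of claim 1 can be illustrated with a small sketch in the style of a transformation-based tagger. Everything concrete here is an assumption for illustration: the toy lexicon (standing in for a token's "definition"), the two contextual rules (standing in for its "context"), the tag names, and the heuristic that runs of nouns and hashtags become topics. The error-driven part is shown only as a single selection step that picks the candidate rule removing the most errors against a hand-tagged example.

```python
# Hypothetical toy lexicon: the most frequent tag for each known word.
LEXICON = {
    "i": "PRON", "the": "DET", "a": "DET", "new": "ADJ", "great": "ADJ",
    "love": "VERB", "update": "VERB", "is": "VERB", "for": "PREP",
}

# Hypothetical contextual rules: (from_tag, to_tag, tag_of_previous_token).
RULES = [("VERB", "NOUN", "DET"), ("VERB", "NOUN", "ADJ")]


def initial_tags(tokens):
    """Assign each token its lexicon tag; unknown words default to NOUN."""
    tags = []
    for tok in tokens:
        if tok.startswith("#"):
            tags.append("HASHTAG")
        elif tok.startswith("@"):
            tags.append("MENTION")
        else:
            tags.append(LEXICON.get(tok.lower(), "NOUN"))
    return tags


def apply_rules(tags, rules):
    """Apply each transformation rule in order, left to right."""
    tags = list(tags)
    for from_tag, to_tag, prev_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return tags


def pick_rule(tags, gold, candidates):
    """Error-driven selection: keep the rule that removes the most tagging errors."""
    def errors(candidate_tags):
        return sum(a != b for a, b in zip(candidate_tags, gold))

    base = errors(tags)
    return max(candidates, key=lambda rule: base - errors(apply_rules(tags, [rule])))


def extract_topics(tokens):
    """Collect runs of nouns and hashtags as topics, with no entity dictionary."""
    tags = apply_rules(initial_tags(tokens), RULES)
    topics, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "NOUN":
            current.append(tok.lower())
            continue
        if current:
            topics.append(" ".join(current))
            current = []
        if tag == "HASHTAG":
            topics.append(tok[1:].lower())
    if current:
        topics.append(" ".join(current))
    return topics


tokens = ["I", "love", "the", "new", "update", "for", "solar", "panels", "#energy"]
topics = extract_topics(tokens)
```

Here "update" starts out tagged as a verb from the lexicon and is retagged as a noun because it follows an adjective, so it surfaces as a topic even though no list of known topics is consulted.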
13. A system comprising:
a network component configured for receiving a plurality of microcontent messages from a social networking server, the plurality of microcontent messages including a microcontent message;
a processor; and
a memory storing instructions which, when executed by the processor, cause the system to perform a process including:
breaking up the microcontent message into one or more text tokens by using a tokenizer module that is configured to process micro-syntax and punctuation in the microcontent message;
performing a part-of-speech (POS) tagging process on the text tokens to identify a linguistic category for each of the text tokens, wherein the POS tagging process for a respective text token is performed using an error-driven transformation-based tagger and based on a definition and a context of the respective text token;
performing a topic extraction on the microcontent message to extract topic metadata for the microcontent message based on the identified linguistic category for each of the text tokens, wherein the extraction is performed without looking up a pre-specified topic in a dictionary of known entities;
associating a topic metadata to the microcontent message based on the extracted topic;
generating sentiment metadata for the microcontent message by performing a sentiment analysis on the one or more text tokens to classify a sentiment of the microcontent message, wherein the sentiment analysis is based on a Naïve Bayesian classifier;
associating the sentiment metadata with the microcontent message;
analyzing co-occurrence of all available metadata in the plurality of microcontent messages;
producing a list that ranks the plurality of microcontent messages based on all available topic metadata and sentiment metadata associated with the plurality of microcontent messages; and
compiling a trend database that reveals how perception of users of the social networking server on a given topic changes by tracking how the list changes over time.
Dependent claims: 14, 15, 16, 17, 18, 19
Specification