System and method for domain-independent aspect level sentiment detection
First Claim
1. A method for automated sentiment analysis comprising:
receiving, with a network interface device in a server, a first plurality of reviews from a first domain, each review in the first plurality of reviews being associated with annotation data that identify a plurality of sentiments and a plurality of aspects included in the first plurality of reviews;
parsing, with a processor in the server, the first plurality of reviews from the first domain to generate a first plurality of rhetorical structure trees, each rhetorical structure tree in the first plurality of rhetorical structure trees corresponding to one review in the first plurality of reviews and each rhetorical structure tree in the first plurality of rhetorical structure trees including at least one span associated with a predetermined relationship;
extracting, with the processor in the server, a plurality of rhetorical rules from the first plurality of rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the first plurality of rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on the annotation data;
receiving, with the network interface device in the server, a second plurality of reviews from a second domain that is different from the first domain, the second plurality of reviews including no annotation data;
parsing, with the processor in the server, the second plurality of reviews from the second domain to generate a second plurality of rhetorical structure trees, each rhetorical structure tree in the second plurality of rhetorical structure trees corresponding to one review in the second plurality of reviews, each rhetorical structure tree in the second plurality of rhetorical structure trees including at least one span associated with the predetermined relationship;
generating, with the processor in the server, training data that associates at least one aspect in the review in the second plurality of reviews with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules in response to a path extracted from the rhetorical structure tree including the at least one aspect in the review in the second plurality of reviews matching the path of the rhetorical rule; and
training, with the processor in the server, a classifier to identify sentiments in reviews from the second domain using the second plurality of reviews and the training data.
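The rule-extraction step of the claim can be illustrated with a minimal sketch. It is not the patented implementation: it assumes, purely for illustration, that each annotated span has already been reduced to a path (a tuple of rhetorical relation labels) paired with a "positive" or "negative" annotation. The function name `extract_rules`, the `min_count` cutoff, and the sample relation labels are all hypothetical and do not come from the claim text.

```python
from collections import Counter, defaultdict


def extract_rules(annotated_spans, min_count=2):
    """Estimate, for each path, the probability that it signals positive sentiment.

    annotated_spans: iterable of (path, sentiment) pairs, where path is a tuple
    of relation labels (an assumed encoding) and sentiment is "positive" or
    "negative" taken from the annotation data.
    min_count: hypothetical cutoff that drops paths seen too rarely.
    """
    counts = defaultdict(Counter)
    for path, sentiment in annotated_spans:
        counts[path][sentiment] += 1

    rules = {}
    for path, c in counts.items():
        total = c["positive"] + c["negative"]
        if total >= min_count:
            # Relative frequency of positive annotations for this path.
            rules[path] = c["positive"] / total
    return rules


# Toy annotated spans from a first (labeled) domain; labels are illustrative.
spans = [
    (("Contrast", "Nucleus"), "negative"),
    (("Contrast", "Nucleus"), "negative"),
    (("Contrast", "Nucleus"), "positive"),
    (("Elaboration", "Satellite"), "positive"),
    (("Elaboration", "Satellite"), "positive"),
    (("Cause", "Nucleus"), "positive"),
]
rules = extract_rules(spans)
```

Here each rule is stored as a single probability of positive sentiment; the probability of negative sentiment is its complement. A path observed only once (the `Cause` path above) is dropped by the assumed `min_count` cutoff.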
Abstract
A method for automated aspect-based sentiment analysis includes parsing reviews from a first domain to generate rhetorical structure trees and extracting rhetorical rules from the rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on annotation data. The method further includes parsing reviews from a second domain to generate a second plurality of rhetorical structure trees, generating training data that associates at least one aspect in the review from the second domain with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules, and training a classifier to identify sentiments in reviews from the second domain using the second plurality of reviews and the training data.
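The second half of the method, generating training data for the unannotated domain by rule matching and then training a classifier on it, might be sketched as follows. The abstract does not name a classifier type or a confidence threshold, so the Naive Bayes model, the `threshold` parameter, the tuple layout of the input, and every identifier below are assumptions made for illustration only.

```python
import math
from collections import Counter


def label_with_rules(unlabeled, rules, threshold=0.8):
    """Attach a sentiment to each aspect whose path matches a confident rule.

    unlabeled: list of (aspect, text, path) tuples from the second domain.
    rules: mapping of path -> probability of positive sentiment.
    threshold: hypothetical confidence cutoff; ambiguous rules are skipped.
    """
    training = []
    for aspect, text, path in unlabeled:
        p_pos = rules.get(path)
        if p_pos is None:
            continue  # no rule has a matching path
        if p_pos >= threshold:
            training.append((aspect, text, "positive"))
        elif p_pos <= 1 - threshold:
            training.append((aspect, text, "negative"))
    return training


class NaiveBayes:
    """Tiny bag-of-words Naive Bayes classifier with add-one smoothing."""

    LABELS = ("positive", "negative")

    def fit(self, examples):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.class_counts = Counter()
        for _aspect, text, label in examples:
            self.class_counts[label] += 1
            self.word_counts[label].update(text.lower().split())
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label in self.LABELS:
            # Smoothed class prior plus smoothed per-word log-likelihoods.
            score = math.log((self.class_counts[label] + 1) / (total + 2))
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                if word in self.vocab:
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Illustrative rules and second-domain reviews (no annotation data).
rules = {
    ("Contrast", "Nucleus"): 0.1,
    ("Elaboration", "Satellite"): 0.95,
    ("Joint", "Nucleus"): 0.5,
}
unlabeled = [
    ("battery", "battery drains quickly", ("Contrast", "Nucleus")),
    ("screen", "screen looks gorgeous", ("Elaboration", "Satellite")),
    ("price", "price is average", ("Joint", "Nucleus")),
    ("camera", "camera works", ("Background", "Satellite")),
]
training = label_with_rules(unlabeled, rules)
clf = NaiveBayes().fit(training)
```

In this toy run only the `battery` and `screen` aspects receive labels: the `price` path matches a rule that is too ambiguous under the assumed threshold, and the `camera` path matches no rule at all. The classifier is then trained solely on these automatically labeled examples, mirroring the idea that the second domain needs no manual annotation.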
14 Claims
8. A system for automated sentiment analysis comprising:
a network interface device;
a memory; and
a processor operatively connected to the network interface device and the memory, the processor being configured to:
receive a first plurality of reviews from a first domain using the network interface device, each review in the first plurality of reviews being associated with annotation data that identify a plurality of sentiments and a plurality of aspects included in the first plurality of reviews;
parse the first plurality of reviews from the first domain to generate a first plurality of rhetorical structure trees, each rhetorical structure tree in the first plurality of rhetorical structure trees corresponding to one review in the first plurality of reviews and each rhetorical structure tree in the first plurality of rhetorical structure trees including at least one span associated with a predetermined relationship;
extract a plurality of rhetorical rules from the first plurality of rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the first plurality of rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on the annotation data;
receive a second plurality of reviews from a second domain that is different from the first domain using the network interface device, the second plurality of reviews including no annotation data;
parse the second plurality of reviews from the second domain to generate a second plurality of rhetorical structure trees, each rhetorical structure tree in the second plurality of rhetorical structure trees corresponding to one review in the second plurality of reviews, each rhetorical structure tree in the second plurality of rhetorical structure trees including at least one span associated with the predetermined relationship;
generate training data that associates at least one review in the second plurality of reviews with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules in response to a path extracted from the rhetorical structure tree corresponding to the at least one review in the second plurality of reviews matching the path of the rhetorical rule; and
train a classifier to identify sentiments and aspects in reviews from the second domain using the second plurality of reviews and the training data, the classifier being stored in the memory for use in classifying sentiments and aspects for additional reviews in the second domain.
Specification