System, method, and computer program product for searching summaries of online reviews of products
First Claim
1. A computer-implemented method comprising:
electronically retrieving product reviews for a product from at least one online data source, each product review comprising at least one statement;
storing the product reviews in an electronic database;
identifying a plurality of product features of the product;
storing the plurality of product features in an electronic database;
tagging each word in each statement with a part of speech tag and with a polarity value, wherein the tagging each word with a part of speech tag comprises:
performing tokenization to break the statement into words or tokens;
determining the lemma for each word or token to produce tokenized text and determining a part of speech for each lemma; and
assembling the tokenized text into a tokenized sentence; and
wherein the tagging each word with a polarity value comprises:
vectorizing each of the statements, wherein each word in each of the statements that are tagged with at least one specified value is represented as a feature in a numeric form;
classifying each of the statements in the product reviews as positive, neutral, or negative based on the features in numeric form by determining the polarity value of each statement by comparing the words or tokens with a sentiment dictionary to determine whether a word is positive, negative or neutral;
further classifying at least one of the positive statements as very positive and at least one of the negative statements as very negative based on the positive, neutral, or negative classification, a confidence score, and the statement features in numeric form by assigning an intensity to the polarity value by assessing modifiers in the tokenized sentence;
analyzing the positive, very positive, negative, and very negative statements by a set of support vector machine classifiers to extract the product features in each sentence;
calculating an average score for each of the product features based on the analysis of the positive, very positive, negative, and very negative statements; and
transmitting to a user, the average scores for each product feature of the product.
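The scoring steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the lexicons, the five-point scale (-2 to 2), and all function names are hypothetical. Lemmatization and part-of-speech tagging are omitted, and a plain keyword match stands in for the claimed set of support vector machine classifiers.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicons; a real system would load a full sentiment dictionary.
SENTIMENT = {"great": 1, "good": 1, "love": 1, "bad": -1, "poor": -1, "boring": -1}
INTENSIFIERS = {"very", "really", "extremely"}

# Hypothetical keyword lists. They replace the claimed SVM feature classifiers
# purely to keep the sketch self-contained.
FEATURE_KEYWORDS = {
    "price": {"price", "cost"},
    "performance": {"performance", "speed"},
}


def tokenize(statement):
    """Break a statement into lowercase word tokens."""
    return re.findall(r"[a-z']+", statement.lower())


def score_statement(tokens):
    """Return -2 (very negative) through 2 (very positive); 0 is neutral."""
    total = 0
    intensified = False
    for i, token in enumerate(tokens):
        polarity = SENTIMENT.get(token, 0)
        if polarity == 0:
            continue
        total += polarity
        # A modifier directly before a sentiment word raises the intensity.
        if i > 0 and tokens[i - 1] in INTENSIFIERS:
            intensified = True
    if total == 0:
        return 0
    sign = 1 if total > 0 else -1
    return sign * (2 if intensified else 1)


def average_feature_scores(statements):
    """Average the non-neutral statement scores for each mentioned feature."""
    buckets = defaultdict(list)
    for statement in statements:
        tokens = tokenize(statement)
        score = score_statement(tokens)
        if score == 0:
            continue  # neutral statements do not contribute to the average
        for feature, keywords in FEATURE_KEYWORDS.items():
            if keywords & set(tokens):
                buckets[feature].append(score)
    return {feature: sum(s) / len(s) for feature, s in buckets.items()}
```

Calling `average_feature_scores` on a list of review statements returns a dictionary mapping each mentioned feature to its mean score, which corresponds to the values the claim transmits to the user.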
3 Assignments
0 Petitions
Abstract
A system, method, and computer program product for researching online reviews to assess the performance and functionality of digital media consumer products bought online or not (e.g. eBooks, movies, TV shows, music, DVD's, etc.). The system extracts reviews from multiple online sources, including online “stores”, professional articles, blogs, online magazines, websites, etc.; and, utilizes sentiment analysis algorithms and supervised machine learning analysis to present more informative summaries for each product's reviews, wherein each summary includes a sentence that encapsulates a sentiment held by many users; the most positive and negative comments; and a list of features with average scores (e.g. performance, price, etc.). Additionally, the user may view a separate review detail page per product that provides further summaries, such as a short list of other products that the same reviewer gave a very positive review for the features. The user is then able to purchase the product via a link.
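Two of the summary elements described in the abstract, the most positive and negative comments and the short list of other products the same reviewer rated very positively, could be assembled roughly as follows. This is a sketch under stated assumptions: the `ScoredReview` record, the -2 to 2 score scale, and both function names are hypothetical, and the reviews are assumed to have been scored already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredReview:
    """Hypothetical record for a review that has already been scored."""
    reviewer: str
    product: str
    text: str
    score: int  # assumed scale: -2 (very negative) to 2 (very positive)


def extreme_comments(reviews, product):
    """Return (most positive text, most negative text) for one product."""
    own = [r for r in reviews if r.product == product]
    if not own:
        return None, None
    most_positive = max(own, key=lambda r: r.score)
    most_negative = min(own, key=lambda r: r.score)
    return most_positive.text, most_negative.text


def also_rated_very_positive(reviews, reviewer, product, limit=3):
    """List other products the same reviewer scored as very positive."""
    products = []
    for r in reviews:
        if (
            r.reviewer == reviewer
            and r.product != product
            and r.score == 2
            and r.product not in products
        ):
            products.append(r.product)
    return products[:limit]
```

The `limit` parameter keeps the list short, in line with the abstract's "short list"; the default of three is an arbitrary choice for illustration.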
32 Citations
5 Claims
Specification