Method and apparatus for processing sentiment-bearing text

US 7,788,086 B2
Filed: 04/14/2005
Issued: 08/31/2010
Est. Priority Date: 03/01/2005
Status: Expired due to Fees

Abstract

The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.

46 Citations

9 Claims

1. A computer-implemented method of processing text included in multiple product reviews of a single product, comprising:
- utilizing a computer processor that is a component of the computer to cluster sub-document linguistic units included in a collection of relevant documents into a set of clusters based on pre-defined clustering criteria, wherein each relevant document in the collection contains text that is a review of the single product, and wherein each cluster in the set represents a different attribute of the single product, and wherein the pre-defined clustering criteria is a listing of key words defined before the computer processor clusters the sub-document linguistic units into the set of clusters, and wherein the listing of key words includes a separate group of key words for each said different attribute of the single product such that when the processor clusters the sub-document linguistic units into the set of clusters it does so by determining which of the listing of key words are included in which sub-document linguistic units;
  
  assigning a sentiment and a confidence measure to each sub-document linguistic unit, wherein for each sub-document linguistic unit the confidence measure is a measurement of a confidence with which the sentiment was assigned;
  
  generating a display including a direct indication of the sub-document linguistic units, the cluster in the set to which each sub-document linguistic unit was clustered by the computer processor, and the sentiment assigned to each sub-document linguistic unit;
  
  wherein generating the display further comprises generating the display so as to also include a user input mechanism that receives user-initiated selection of a minimum confidence level that the confidence measure attributed to each sub-document linguistic unit must exceed for a sub-document linguistic unit to be included by the computer processor within any of the clusters;
  
  excluding a particular one of the sub-document linguistic units from being included in any cluster in the set based on a determination that the confidence measure assigned to the particular sub-document linguistic unit is less than the minimum confidence level received by the user input mechanism; and
  
  wherein generating the display further comprises generating the display so as to also include an indication of which of the listing of key words were used by the computer processor as a basis for clustering the sub-document linguistic units.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1 and further comprising:
    - identifying the relevant documents as a subset of documents from a larger set of documents based on key words provided by a user as input into a search engine.
  - 3. The method of claim 1, wherein generating the display further comprises:
    - generating the display so as to include a box for each cluster in the set, wherein a particular one of said boxes changes sizes in response to receipt by the input mechanism of the minimum confidence level, and wherein the receipt by the input mechanism of the minimum confidence level also triggers a change in a quantity of sub-document linguistic units represented within said particular one of the boxes.
  - 4. The method of claim 3, wherein generating the display further comprises:
    - generating the display so as to include a visual indication of an overall representative sentiment of each cluster in the set, wherein the overall representative sentiment for the cluster associated with the particular box is indicated by a shading of the particular box.
  - 5. The method of claim 1 wherein assigning a sentiment to each sub-document linguistic unit occurs before the computer processor clusters the sub-document linguistic units.
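
The steps recited in claims 1, 3 and 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Unit` record, the `KEYWORDS` groups, the sample sentences and the majority-vote summary are all hypothetical, and sentiment and confidence are supplied as inputs because the claims do not say how the classifier produces them. The claims require the confidence measure to "exceed" the minimum for inclusion and exclude units whose confidence is "less than" it; the sketch follows the exclusion wording, so a unit exactly at the threshold is kept.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Unit:
    text: str          # a sub-document linguistic unit, e.g. one sentence
    sentiment: str     # "positive" or "negative" (label set is assumed)
    confidence: float  # confidence with which the sentiment was assigned


# Hypothetical key-word groups, one per product attribute, defined before clustering.
KEYWORDS = {
    "battery": {"battery", "charge"},
    "screen": {"screen", "display"},
}

# Hypothetical review sentences with sentiment already assigned.
REVIEWS = [
    Unit("The battery lasts all day.", "positive", 0.9),
    Unit("The screen scratches easily.", "negative", 0.8),
    Unit("Battery charge time is slow.", "negative", 0.4),
    Unit("Shipping was fast.", "positive", 0.95),
]


def cluster(units, keywords, min_confidence):
    """Group units by attribute key words, excluding low-confidence units."""
    clusters = defaultdict(list)
    for unit in units:
        if unit.confidence < min_confidence:
            continue  # excluded from every cluster
        words = {w.strip(".,!?").lower() for w in unit.text.split()}
        for attribute, group in keywords.items():
            matched = sorted(words & group)
            if matched:
                # Keep the matched key words so a display can show them.
                clusters[attribute].append((unit, matched))
    return dict(clusters)


def summarize(clusters):
    """Per-cluster box size (unit count) and majority sentiment for shading."""
    summary = {}
    for attribute, members in clusters.items():
        positives = sum(1 for unit, _ in members if unit.sentiment == "positive")
        # Assumption: ties count as positive.
        overall = "positive" if positives * 2 >= len(members) else "negative"
        summary[attribute] = {"size": len(members), "overall": overall}
    return summary
```

Calling `cluster(REVIEWS, KEYWORDS, 0.5)` drops the third sentence, so the "battery" cluster holds one unit; lowering the threshold to `0.0` brings it back and the cluster grows to two, which corresponds to the box-resizing behaviour of claim 3.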

6. A computer-implemented method of generating a display that presents text associated with multiple product reviews of a single product, comprising:
- receiving from a user an indication of features of the single product for which the user desires sentiment analysis;
  
  utilizing a computer processor that is a component of the computer to cluster sub-document linguistic units of relevant documents into clusters, wherein each of the clusters corresponds to one of the features for which the user desires sentiment analysis;
  
  assigning a sentiment and a confidence measure to each sub-document linguistic unit, wherein for each sub-document linguistic unit the confidence measure is a measurement of a confidence with which the sentiment was assigned;
  
  generating the display so as to include an indication of the sub-document linguistic units, a cluster to which each sub-document linguistic unit was assigned by the computer processor, a sentiment attributed to each sub-document linguistic unit, an overall sentiment attributed to each of the clusters, and the features of the single product for which the user desires a sentiment analysis; and
  
  wherein generating the display further comprises generating the display so as to include a user input mechanism that receives a user-initiated selection of a minimum confidence level that the confidence measure attributed to each sub-document linguistic unit must exceed for a sub-document linguistic unit to be included by the computer processor within any of the clusters, and wherein generating the display further comprises generating the display so as to include a plurality of boxes, wherein at least one of the plurality of boxes changes sizes in response to receipt by the user input mechanism of the minimum confidence level.
- Dependent claims (7, 8, 9):
  - 7. The method of claim 6 wherein generating the display further comprises generating the display so as to include a plurality of tabs that are each associated with a particular sentiment, wherein actuating a particular one of the plurality of tabs causes sub-document linguistic units corresponding to the sentiment associated with the particular tab to be displayed in one of the plurality of boxes.
  - 8. The method of claim 7, wherein generating the display further comprises generating the display so as to indicate a change in sentiment over time.
  - 9. The method of claim 8 wherein indicating a change in sentiment over time comprises indicating a change in sentiment over time in a multidimensional display in which one dimension is time and another dimension is sentiment.
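
The data behind the per-sentiment tabs of claim 7 and the time-versus-sentiment display of claims 8 and 9 might be prepared along the following lines. This is only a sketch under stated assumptions: the `UNITS` records, the monthly bucketing and the "fraction of positive units" score are hypothetical choices, since the claims require only that one display dimension be time and another be sentiment.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (review date, sentiment label, unit text).
UNITS = [
    (date(2005, 1, 10), "positive", "Great zoom."),
    (date(2005, 1, 22), "negative", "Zoom is noisy."),
    (date(2005, 2, 3), "positive", "Zoom works well."),
]


def units_for_tab(units, sentiment):
    """Return the unit texts to show when the tab for `sentiment` is selected."""
    return [text for _, label, text in units if label == sentiment]


def sentiment_over_time(units):
    """Map each (year, month) to the fraction of positive units in that month."""
    totals = defaultdict(lambda: [0, 0])  # month -> [positive count, total count]
    for day, label, _ in units:
        bucket = totals[(day.year, day.month)]
        bucket[0] += label == "positive"
        bucket[1] += 1
    return {month: pos / total for month, (pos, total) in sorted(totals.items())}
```

Each key of the returned dictionary supplies a point on the time axis and each value a point on the sentiment axis, so the result can be passed to any plotting routine.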

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Ringger, Eric K.; Gamon, Michael; Corston-Oliver, Simon H.; Aue, Anthony
Primary Examiner(s): Hudspeth, David R
Assistant Examiner(s): Shah, Paras D
Application Number: US11/105,619
Publication Number: US 20060200341A1
Time in Patent Office: 1,965 Days
Field of Search: 704/1, 704/9, 704/10, 707/101, 707/102
US Class Current: 704/9
CPC Class Codes:
G06F 16/353 into predefined classes
G06F 40/253 Grammatical analysis; Style...