Method and apparatus for processing sentiment-bearing text

  • US 7,788,086 B2
  • Filed: 04/14/2005
  • Issued: 08/31/2010
  • Est. Priority Date: 03/01/2005
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method of processing text included in multiple product reviews of a single product, comprising:

  • utilizing a computer processor that is a component of the computer to cluster sub-document linguistic units included in a collection of relevant documents into a set of clusters based on pre-defined clustering criteria, wherein each relevant document in the collection contains text that is a review of the single product, and wherein each cluster in the set represents a different attribute of the single product, and wherein the pre-defined clustering criteria is a listing of key words defined before the computer processor clusters the sub-document linguistic units into the set of clusters, and wherein the listing of key words includes a separate group of key words for each said different attribute of the single product such that when the processor clusters the sub-document linguistic units into the set of clusters it does so by determining which of the listing of key words are included in which sub-document linguistic units;

    assigning a sentiment and a confidence measure to each sub-document linguistic unit, wherein for each sub-document linguistic unit the confidence measure is a measurement of a confidence with which the sentiment was assigned;

    generating a display including a direct indication of the sub-document linguistic units, the cluster in the set to which each sub-document linguistic unit was clustered by the computer processor, and the sentiment assigned to each sub-document linguistic unit;

    wherein generating the display further comprises generating the display so as to also include a user input mechanism that receives user-initiated selection of a minimum confidence level that the confidence measure attributed to each sub-document linguistic unit must exceed for a sub-document linguistic unit to be included by the computer processor within any of the clusters;

    excluding a particular one of the sub-document linguistic units from being included in any cluster in the set based on a determination that the confidence measure assigned to the particular sub-document linguistic unit is less than the minimum confidence level received by the user input mechanism; and

    wherein generating the display further comprises generating the display so as to also include an indication of which of the listing of key words were used by the computer processor as a basis for clustering the sub-document linguistic units.
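The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes sentences are the sub-document linguistic units, uses a toy word-count lexicon for sentiment and confidence (the claim does not name a classifier), and renders the "display" as plain text. All names here (`KEYWORDS`, `POSITIVE`, `NEGATIVE`, `Unit`, `score`, `cluster`, `render`) and the key word groups are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical key word groups, one group per product attribute,
# defined before any clustering takes place.
KEYWORDS = {
    "battery": {"battery", "charge"},
    "screen": {"screen", "display"},
}

# Toy sentiment lexicon (an assumption for illustration only).
POSITIVE = {"great", "good", "bright", "excellent"}
NEGATIVE = {"bad", "poor", "dim", "terrible"}


@dataclass
class Unit:
    text: str
    sentiment: str
    confidence: float


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_units(review):
    """Split a review into sentences, used here as the linguistic units."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]


def score(text):
    """Assign a sentiment and a confidence in [0, 1] to one unit."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos == neg:
        # No lexicon hits, or evenly mixed hits: no confident sentiment.
        return Unit(text, "neutral", 0.0)
    sentiment = "positive" if pos > neg else "negative"
    return Unit(text, sentiment, abs(pos - neg) / (pos + neg))


def cluster(reviews, min_confidence):
    """Group units by attribute according to which key words they contain.

    Units whose confidence is less than min_confidence are excluded
    from every cluster.
    """
    clusters = {attr: [] for attr in KEYWORDS}
    for review in reviews:
        for sentence in split_units(review):
            unit = score(sentence)
            if unit.confidence < min_confidence:
                continue
            tokens = set(tokenize(sentence))
            for attr, words in KEYWORDS.items():
                matched = sorted(tokens & words)
                if matched:
                    # Keep the matched key words so the display can show them.
                    clusters[attr].append((unit, matched))
    return clusters


def render(clusters):
    """Build a text display of units, clusters, sentiments, and key words."""
    lines = []
    for attr, entries in clusters.items():
        lines.append(f"[{attr}]")
        for unit, matched in entries:
            lines.append(
                f"  {unit.sentiment} ({unit.confidence:.2f}) "
                f"{unit.text!r} key words: {', '.join(matched)}"
            )
    return "\n".join(lines)
```

Calling `render(cluster(reviews, min_confidence))` with a list of review strings and a user-chosen threshold produces one section per attribute. In an interactive display, the threshold would come from the user input mechanism (for example, a slider) rather than a function argument.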
