Class description generation for clustering and categorization

  • US 7,813,919 B2
  • Filed: 12/20/2005
  • Issued: 10/12/2010
  • Est. Priority Date: 12/20/2005
  • Status: Expired due to Fees
First Claim

1. A method for characterizing a class of a probabilistic classifier or clustering system that classifies or clusters documents into classes and includes probabilistic model parameters, the method comprising:

  • for each of a plurality of candidate words or word combinations wherein the candidate words or word combinations include natural language phrases, computing a divergence element of the class from each of a plurality of other classes of the probabilistic classifier or clustering system based on one or more probabilistic model parameters profiling the candidate word or word combination; and

  • selecting one or more words or word combinations including at least one natural language phrase for characterizing the class as those candidate words or word combinations for which the class has a substantial computed divergence element from at least one of the plurality of other classes of the probabilistic classifier or clustering system that is effective for distinguishing the class from at least one of the plurality of other classes; and

  • labeling the class based on the selected one or more words or word combinations, the labeling including constructing a semantic description of the class based on the at least one selected natural language phrase;

  • wherein the computing a divergence operation and the selecting operation and the labeling operation are performed by a digital processor.
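
The three claimed operations (computing a divergence element per candidate, selecting candidates with a substantial divergence, and labeling the class) can be illustrated with a minimal sketch. The claim does not fix a particular divergence measure, so the sketch assumes the per-term contribution to a Kullback-Leibler divergence between two classes. The function names, the `threshold` and `top_k` parameters, the comma-joined label, and the toy probabilities are all hypothetical and are not taken from the patent.

```python
import math

def divergence_element(p_class, p_other, eps=1e-12):
    """Assumed measure: one term of KL(class || other) for a single candidate."""
    return p_class * math.log((p_class + eps) / (p_other + eps))

def characterize_class(target, model, threshold=0.01, top_k=3):
    """Select candidates whose divergence element from at least one other
    class exceeds `threshold` (a stand-in for "substantial").

    `model` maps class name -> {candidate word or phrase: P(candidate | class)}
    and must contain at least one class besides `target`.
    """
    scores = {}
    for term, p_target in model[target].items():
        # Largest divergence element of the target class from any other class.
        best = max(
            divergence_element(p_target, params.get(term, 0.0))
            for name, params in model.items()
            if name != target
        )
        if best > threshold:
            scores[term] = best
    # Most distinguishing candidates first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]

def label_class(terms):
    """Hypothetical labeling step: join the selected terms into a description."""
    return ", ".join(terms)

# Toy model parameters (invented for illustration only).
model = {
    "sports": {"football match": 0.30, "team": 0.25, "the": 0.40, "election": 0.05},
    "politics": {"football match": 0.02, "team": 0.08, "the": 0.40, "election": 0.50},
}

sports_terms = characterize_class("sports", model)
sports_label = label_class(sports_terms)
```

In this toy model, "the" is equally probable in both classes, so its divergence element is zero and it is not selected, while the phrase "football match" is far more probable under "sports" and ranks first.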
