Domain dictionary creation by detection of new topic words using divergence value comparison

  • US 8,386,240 B2
  • Filed: 06/10/2011
  • Issued: 02/26/2013
  • Est. Priority Date: 08/23/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

  • determining a topic divergence value, the topic divergence value proportional to a ratio of a first topic word distribution in a first collection of documents to a second topic word distribution in a second collection of documents, wherein the first collection of documents is a collection of topic documents related to a particular topic, and the second collection of documents is a collection of documents that includes other documents related to other topics;

  • determining a candidate topic word divergence value for a candidate topic word, the candidate topic word divergence value proportional to a ratio of a first distribution of the candidate topic word in the first collection of documents to a second distribution of the candidate topic word in the second collection of documents, wherein the candidate topic word is a candidate for being identified as a new topic word for the particular topic and is not a word in a topic dictionary for the particular topic; and

  • determining whether the candidate topic word is a new topic word for the particular topic based on the candidate topic word divergence value and the topic divergence value.
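
The three steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that the claim text does not state: a "distribution" is taken to be the relative token frequency in a collection, the "topic word distribution" is computed over the words already in the topic dictionary, the proportionality constant is 1, and a candidate counts as a new topic word when its divergence value exceeds the topic divergence value. The function names, the `eps` smoothing term, and the sample documents are all hypothetical.

```python
from collections import Counter


def relative_frequency(words, documents):
    """Fraction of all tokens in `documents` that belong to `words`.

    Assumption: whitespace tokenization and raw relative frequency
    stand in for the claim's "distribution".
    """
    counts = Counter(token for doc in documents for token in doc.split())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[w] for w in words) / total


def divergence(words, topic_docs, reference_docs, eps=1e-9):
    """Ratio of the words' frequency in the topic collection to their
    frequency in the reference collection.

    Assumption: `eps` only guards against division by zero; the claim
    does not describe any smoothing.
    """
    p_topic = relative_frequency(words, topic_docs)
    p_reference = relative_frequency(words, reference_docs)
    return p_topic / (p_reference + eps)


def is_new_topic_word(candidate, topic_dictionary, topic_docs, reference_docs):
    """Apply the three claimed steps to one candidate word."""
    # The claim requires that the candidate is not already in the dictionary.
    if candidate in topic_dictionary:
        return False
    # Step 1: topic divergence value, here over the existing dictionary words.
    topic_div = divergence(topic_dictionary, topic_docs, reference_docs)
    # Step 2: candidate topic word divergence value.
    candidate_div = divergence({candidate}, topic_docs, reference_docs)
    # Step 3: decide from both values (assumed rule: candidate must exceed topic).
    return candidate_div > topic_div


# Hypothetical sample data: a small football topic collection and a
# mixed reference collection that covers other topics.
topic_docs = [
    "goal striker offside goal",
    "striker offside penalty offside",
    "offside goal keeper",
]
reference_docs = [
    "market goal stocks keeper",
    "election vote market",
    "offside vote stocks rain",
    "rain weather market striker",
]
topic_dictionary = {"goal", "striker", "penalty"}

for word in ("offside", "keeper", "goal"):
    print(word, is_new_topic_word(word, topic_dictionary, topic_docs, reference_docs))
```

In this toy data, "offside" is concentrated in the topic collection more strongly than the dictionary words are on aggregate, so the assumed rule accepts it, while "keeper" appears at similar rates in both collections and is rejected. A different decision rule (for example, a scaled threshold) would fit the claim language equally well.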

  • 2 Assignments