Domain Dictionary Creation

  • US 20090055381A1
  • Filed: 08/23/2007
  • Published: 02/26/2009
  • Est. Priority Date: 08/23/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

  • determining a topic divergence value, the topic divergence value substantially proportional to a ratio of a first topic word distribution in a topic document corpus to a second topic word distribution in a document corpus, wherein the topic document corpus is a corpus of topic documents related to a topic, and the document corpus is a corpus of documents that includes the topic documents and other documents;

  • determining a candidate topic word divergence value for a candidate topic word, the candidate topic word divergence value substantially proportional to a ratio of a first distribution of the candidate topic word in the topic document corpus to a second distribution of the candidate topic word in the document corpus; and

  • determining whether the candidate topic word is a new topic word based on the candidate topic word divergence value and the topic divergence value.
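
The three steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the claim only requires values "substantially proportional to a ratio" of distributions, so the sketch assumes a "distribution" is a word's relative frequency, assumes the topic divergence value is the mean ratio over already-known topic words, and assumes a candidate is accepted when its ratio is at least that mean. The function names (`word_ratio`, `topic_divergence`, `is_new_topic_word`) and the toy corpora are hypothetical.

```python
from collections import Counter


def word_ratio(word, topic_counts, corpus_counts):
    """Ratio of a word's relative frequency in the topic corpus to its
    relative frequency in the full document corpus (assumed measure)."""
    topic_total = sum(topic_counts.values())
    corpus_total = sum(corpus_counts.values())
    topic_n = topic_counts[word]
    corpus_n = corpus_counts[word]
    if topic_n == 0 or corpus_n == 0:
        return 0.0  # assumption: a word absent from either corpus scores zero
    # (topic_n / topic_total) / (corpus_n / corpus_total), with one division
    return (topic_n * corpus_total) / (corpus_n * topic_total)


def topic_divergence(topic_words, topic_counts, corpus_counts):
    """Assumed topic divergence value: mean ratio over known topic words."""
    if not topic_words:
        raise ValueError("at least one known topic word is required")
    ratios = [word_ratio(w, topic_counts, corpus_counts) for w in topic_words]
    return sum(ratios) / len(ratios)


def is_new_topic_word(candidate, topic_words, topic_counts, corpus_counts):
    """Assumed decision rule: accept the candidate when its ratio is at
    least the topic divergence value."""
    threshold = topic_divergence(topic_words, topic_counts, corpus_counts)
    return word_ratio(candidate, topic_counts, corpus_counts) >= threshold


# Toy data: the document corpus includes the topic documents and others.
topic_docs = [["goal", "striker", "match"],
              ["goal", "referee", "match", "offside"]]
other_docs = [["market", "stock", "match"],
              ["weather", "rain", "market", "the"]]

topic_counts = Counter(w for doc in topic_docs for w in doc)
corpus_counts = Counter(w for doc in topic_docs + other_docs for w in doc)
known_topic_words = ["goal", "striker"]

for candidate in ("offside", "match"):
    verdict = is_new_topic_word(candidate, known_topic_words,
                                topic_counts, corpus_counts)
    print(candidate, verdict)
```

With this toy data, "offside" occurs only in the topic documents and is accepted, while "match" also occurs in the other documents and is rejected. A different threshold or a logarithmic divergence measure would fit the claim language equally well; the choices here are only for illustration.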
