Modeling Topics Using Statistical Distributions

  • US 20090094233A1
  • Filed: 10/01/2008
  • Published: 04/09/2009
  • Est. Priority Date: 10/05/2007
  • Status: Active Grant
First Claim

1. A method comprising:

    accessing a corpus stored in one or more tangible media, the corpus comprising a plurality of documents, a document comprising a plurality of words;

    selecting one or more words of each document as one or more keywords of the each document;

    clustering the documents according to the keywords to yield a plurality of clusters, each cluster corresponding to a topic;

    generating a statistical distribution for each cluster from a subset of the words of the documents of the each cluster to yield a plurality of statistical distributions; and

    modeling each topic using the statistical distribution generated for the cluster corresponding to the each topic.
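
The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how keywords are chosen, how documents are clustered, or which kind of statistical distribution is used, so everything specific here is an assumption made for illustration: TF-IDF ranking for keyword selection, a greedy Jaccard-overlap pass for clustering, and a normalized word-frequency distribution restricted to each cluster's keywords. The function names, the parameters `k` and `threshold`, and the sample corpus are hypothetical and do not come from the patent.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only (an assumed tokenization).
    return re.findall(r"[a-z]+", text.lower())


def select_keywords(docs_tokens, k=3):
    # Assumed criterion: the k highest TF-IDF words of each document.
    n = len(docs_tokens)
    df = Counter(w for toks in docs_tokens for w in set(toks))
    keywords = []
    for toks in docs_tokens:
        tf = Counter(toks)
        score = {w: c * math.log(n / df[w]) for w, c in tf.items()}
        # Descending score, then alphabetical order to break ties.
        ranked = sorted(score, key=lambda w: (-score[w], w))
        keywords.append(set(ranked[:k]))
    return keywords


def cluster_by_keywords(keywords, threshold=0.15):
    # Assumed clustering: one greedy pass; a document joins the cluster
    # whose keyword set it overlaps most (Jaccard), else starts a new one.
    clusters = []
    for i, kw in enumerate(keywords):
        best, best_sim = None, threshold
        for c in clusters:
            union = kw | c["keywords"]
            sim = len(kw & c["keywords"]) / len(union) if union else 0.0
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"keywords": set(kw), "docs": [i]})
        else:
            best["keywords"] |= kw
            best["docs"].append(i)
    return clusters


def topic_distribution(cluster, docs_tokens):
    # Distribution over a subset of the cluster's words: here, only the
    # cluster's keywords, with counts normalized to sum to 1.
    counts = Counter(
        w
        for i in cluster["docs"]
        for w in docs_tokens[i]
        if w in cluster["keywords"]
    )
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def model_topics(docs, k=3, threshold=0.15):
    # Returns one (document indices, word distribution) pair per topic.
    docs_tokens = [tokenize(d) for d in docs]
    clusters = cluster_by_keywords(select_keywords(docs_tokens, k), threshold)
    return [(c["docs"], topic_distribution(c, docs_tokens)) for c in clusters]


if __name__ == "__main__":
    corpus = [
        "cats purr and cats meow",
        "cats chase mice and cats purr",
        "stocks fall as markets fall",
        "markets rally and stocks rise",
    ]
    for doc_ids, dist in model_topics(corpus):
        print(doc_ids, dist)
```

In this sketch each topic is represented by the word distribution of its cluster, so a new document could be compared against the topics by scoring its words under each distribution. Any other keyword criterion, clustering method, or distribution family could be substituted without changing the overall structure.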
