Learning Topics By Simulation Of A Stochastic Cellular Automaton

  • US 20160350411A1
  • Filed: 11/04/2015
  • Published: 12/01/2016
  • Est. Priority Date: 05/29/2015
  • Status: Active Grant
First Claim

1. A method for identifying sets of correlated words comprising:

  • receiving information for a set of documents;

    wherein the set of documents comprises a plurality of words;

    wherein a particular document of the set of documents comprises a particular word of the plurality of words;

    running an inference algorithm over a Dirichlet distribution of the plurality of words in the set of documents to produce sampler result data, further comprising:

      retrieving a first counter value from a first data structure,

      based, at least in part, on the first counter value, assigning a particular topic, of a plurality of topics, to the particular word in the particular document to produce a topic assignment for the particular word,

      after assigning the particular topic to the particular word, updating a second counter value in a second data structure,

      wherein the second counter value reflects the topic assignment, and

      wherein the first data structure is distinct from the second data structure; and

    determining, from the sampler result data, one or more sets of correlated words;

    wherein the method is performed by one or more computing devices.
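
The steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation; it is one possible reading, assuming an LDA-style topic sampler in which every sampling pass reads counters from one structure (`read`) and records new topic assignments in a separate structure (`write`). All names (`empty_counts`, `sweep`, `correlated_words`, `run`), the counter layout, the sampling weights, and the hyperparameters `alpha` and `beta` are hypothetical and do not come from the claim text.

```python
import random


def empty_counts(num_docs, vocab_size, num_topics):
    """Hypothetical counter layout: doc-topic, word-topic, and per-topic totals."""
    return {
        "doc_topic": [[0] * num_topics for _ in range(num_docs)],
        "word_topic": [[0] * num_topics for _ in range(vocab_size)],
        "topic": [0] * num_topics,
    }


def sweep(docs, read, vocab_size, num_topics, alpha, beta, rng):
    """One pass: sample every topic from `read`, record results in a new `write`."""
    write = empty_counts(len(docs), vocab_size, num_topics)
    assignments = []
    for d, words in enumerate(docs):
        row = []
        for w in words:
            # Retrieve counter values from the first (read-only) structure.
            weights = [
                (read["doc_topic"][d][k] + alpha)
                * (read["word_topic"][w][k] + beta)
                / (read["topic"][k] + beta * vocab_size)
                for k in range(num_topics)
            ]
            # Assign a topic based, at least in part, on those counter values.
            k = rng.choices(range(num_topics), weights=weights)[0]
            row.append(k)
            # Update counters in the second, distinct structure.
            write["doc_topic"][d][k] += 1
            write["word_topic"][w][k] += 1
            write["topic"][k] += 1
        assignments.append(row)
    return write, assignments


def correlated_words(counts, vocab, top_n):
    """For each topic, return the top_n words with the highest counts."""
    groups = []
    for k in range(len(counts["topic"])):
        ranked = sorted(range(len(vocab)), key=lambda w: -counts["word_topic"][w][k])
        groups.append([vocab[w] for w in ranked[:top_n]])
    return groups


def run(docs, vocab, num_topics=2, iterations=50, top_n=3,
        alpha=0.1, beta=0.01, seed=0):
    """Iterate sweeps, swapping buffers each pass, then extract word groups."""
    rng = random.Random(seed)
    read = empty_counts(len(docs), len(vocab), num_topics)
    # Seed the first read buffer with uniformly random topic assignments.
    for d, words in enumerate(docs):
        for w in words:
            k = rng.randrange(num_topics)
            read["doc_topic"][d][k] += 1
            read["word_topic"][w][k] += 1
            read["topic"][k] += 1
    for _ in range(iterations):
        # The structure written in this pass becomes the one read in the next.
        read, _ = sweep(docs, read, len(vocab), num_topics, alpha, beta, rng)
    return correlated_words(read, vocab, top_n)
```

Documents are given here as lists of integer word ids that index into `vocab`, for example `run([[0, 1, 0, 1], [2, 3, 2, 3]], ["a", "b", "c", "d"])`. Because `sweep` never modifies `read`, the assignment for each word depends only on counters from the previous pass, which is what keeps the two data structures distinct in this sketch.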
