Learning topics by simulation of a stochastic cellular automaton

  • US 10,394,872 B2
  • Filed: 11/04/2015
  • Issued: 08/27/2019
  • Est. Priority Date: 05/29/2015
  • Status: Active Grant
First Claim

1. A method for identifying sets of correlated words comprising:

  • receiving information for a set of documents;

    wherein the set of documents comprises a plurality of words;

    wherein a particular document of the set of documents comprises a particular word of the plurality of words;

    running an inference algorithm over a Dirichlet distribution of the plurality of words in the set of documents to produce sampler result data, further comprising:

      retrieving a first counter value from a first data structure,

      based, at least in part, on the first counter value, assigning a particular topic, of a plurality of topics, to the particular word in the particular document to produce a topic assignment for the particular word,

      after assigning the particular topic to the particular word, updating a second counter value in a second data structure to produce an updated second counter value,

      wherein the updated second counter value reflects the topic assignment, and

      wherein the first data structure is stored and accessed independently from the second data structure; and

    determining, from the sampler result data, one or more sets of correlated words;

    wherein the method is performed by one or more computing devices.
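
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not state a sampling formula, so the LDA-style weight used here, the hyperparameters `alpha` and `beta`, and every name (`empty_counts`, `sca_pass`, `run_sampler`, `top_words`) are assumptions made for illustration. What the sketch does take from the claim is the separation of counters: topics are sampled from counter values read out of a first structure, and each assignment is tallied into a second, independently stored structure.

```python
import random
from collections import Counter


def empty_counts():
    """Hypothetical counter layout; Counter returns 0 for unseen keys."""
    return {
        "doc_topic": Counter(),   # (document index, topic) -> count
        "word_topic": Counter(),  # (word, topic) -> count
        "topic": Counter(),       # topic -> count
    }


def sca_pass(docs, num_topics, read_counts, alpha=0.1, beta=0.01, rng=None):
    """Sample a topic per word from read_counts; tally into a new structure."""
    rng = rng or random.Random(0)
    vocab_size = len({word for doc in docs for word in doc})
    write_counts = empty_counts()  # second structure, separate from the first
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            # Retrieve counter values from the first (read-only) structure.
            # The weight formula is an assumed LDA-style choice.
            weights = [
                (read_counts["doc_topic"][(d, t)] + alpha)
                * (read_counts["word_topic"][(word, t)] + beta)
                / (read_counts["topic"][t] + vocab_size * beta)
                for t in range(num_topics)
            ]
            topic = rng.choices(range(num_topics), weights=weights)[0]
            # Update counter values in the second structure only.
            write_counts["doc_topic"][(d, topic)] += 1
            write_counts["word_topic"][(word, topic)] += 1
            write_counts["topic"][topic] += 1
            doc_assignments.append(topic)
        assignments.append(doc_assignments)
    return assignments, write_counts


def run_sampler(docs, num_topics, passes=20, seed=0):
    """Repeat passes; the written structure becomes the next read structure."""
    rng = random.Random(seed)
    read_counts = empty_counts()
    assignments = []
    for _ in range(passes):
        assignments, write_counts = sca_pass(docs, num_topics, read_counts, rng=rng)
        read_counts = write_counts
    return assignments, read_counts


def top_words(counts, num_topics, k=3):
    """Group the k most frequently assigned words under each topic."""
    result = []
    for t in range(num_topics):
        ranked = sorted(
            ((n, w) for (w, topic), n in counts["word_topic"].items() if topic == t),
            reverse=True,
        )
        result.append([w for _, w in ranked[:k]])
    return result
```

Calling `run_sampler(docs, num_topics)` on a list of tokenized documents and passing the returned counts to `top_words` yields one word list per topic, which stands in for the "sets of correlated words" of the final step. Because no pass writes to the structure it reads from, the per-word sampling steps within a pass do not depend on one another.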
