Method and apparatus for selecting links to include in a probabilistic generative model for text

  • US 9,418,335 B1
  • Filed: 05/04/2012
  • Issued: 08/16/2016
  • Est. Priority Date: 08/01/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    applying training documents to links in a current generative model that each connect a respective terminal node that represents a corresponding word to a respective cluster node that represents a corresponding cluster of conceptually related words, to determine a respective expected count for each of the links;

    selecting, from the links that each connect a respective terminal node that represents a corresponding word to a respective cluster node that represents a corresponding cluster of conceptually related words, a first subset of one or more links that are each associated with more than a predetermined number of sources of the training documents;

    for each selected link of the first subset, determining (i) a significance of the link, and (ii) a link rating for the link based on the expected count for the link and the significance;

    ranking the selected links of the first subset based on the link ratings;

    selecting a second subset of the ranked links; and

    generating a new generative model using only the selected links of the second subset, without using any of the links in the current generative model that were not selected for the second subset.
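
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say how expected counts or significance are computed, so the sketch makes several assumptions: the model is a hypothetical dictionary mapping `(word, cluster)` links to weights, each occurrence of a word contributes one count split across that word's links in proportion to their weights, a link's significance is its share of the word's total expected count, and the link rating is the product of expected count and significance. The names `select_links`, `min_sources`, and `top_k` are illustrative only.

```python
from collections import defaultdict


def select_links(model, documents, min_sources, top_k):
    """Prune a toy word-to-cluster model, loosely following claim 1.

    model:     {(word, cluster): weight}  (hypothetical representation)
    documents: list of (source, [words]) pairs
    """
    # Index the clusters that each word (terminal node) is linked to.
    clusters_of = defaultdict(list)
    for (word, cluster), weight in model.items():
        clusters_of[word].append((cluster, weight))

    # Step 1: apply the training documents to obtain an expected count per
    # link. Assumption: each word occurrence contributes one count, split
    # across the word's links in proportion to their weights.
    expected = defaultdict(float)
    sources = defaultdict(set)
    for source, words in documents:
        for word in words:
            linked = clusters_of.get(word, [])
            total = sum(weight for _, weight in linked)
            if total <= 0:
                continue
            for cluster, weight in linked:
                link = (word, cluster)
                expected[link] += weight / total
                sources[link].add(source)

    # Step 2: first subset = links seen in more than `min_sources` sources.
    first_subset = [link for link in model if len(sources[link]) > min_sources]

    # Step 3: significance and rating. Assumption: significance is the
    # link's share of the word's total expected count, and the rating is
    # expected count times significance.
    word_total = defaultdict(float)
    for (word, _), count in expected.items():
        word_total[word] += count
    rating = {}
    for link in first_subset:
        significance = expected[link] / word_total[link[0]]
        rating[link] = expected[link] * significance

    # Steps 4 and 5: rank by rating and keep the top `top_k` links.
    ranked = sorted(first_subset, key=lambda link: rating[link], reverse=True)
    second_subset = ranked[:top_k]

    # Step 6: the new model uses only the links of the second subset.
    return {link: model[link] for link in second_subset}
```

Calling `select_links(model, documents, min_sources=1, top_k=2)` on a small model drops every link seen in only one source before ranking, then returns a new dictionary holding just the two highest-rated links. The real system may filter, score, and rebuild the model quite differently.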
