
KNOWLEDGE DISCOVERY FROM CITATION NETWORKS

  • US 20150186789A1
  • Filed: 12/29/2014
  • Published: 07/02/2015
  • Est. Priority Date: 12/06/2010
  • Status: Active Grant
First Claim

1. A method of modeling a set of documents, each document within the set of documents comprising content, and at least a portion of the documents being linked to other documents through citations, comprising:

  • defining a Bernoulli Process Topic model for the set of documents, representing each document as a mixture over latent topics, dependent on (1) an intrinsic content of each respective document, and (2) a content of other documents related to the respective document through a multi-level citation structure of direct and indirect linkages to the other documents;

    for each respective document, determining a first set of latent topic characteristics based on an intrinsic content of the respective document;

    for each document, determining a second set of latent topic characteristics based on a respective content of the other documents which are directly and indirectly linked, the indirectly linked documents contributing transitively to the latent topic characteristics of the respective document;

    representing a third set of latent topics for the respective document based on a joint probability distribution of at least the first and second sets of latent topic characteristics, according to the Bernoulli Process Topic model; and

    outputting at least one document in response to an input, based on at least the third set of latent topics.
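
The steps of the claim can be illustrated with a small sketch. This is not the inference procedure of the Bernoulli Process Topic model, which the claim does not spell out. It is a simplified stand-in that assumes the intrinsic topic mixtures (the first set) are already known, weights all citations of a document equally, and uses a hypothetical `stop_prob` parameter for the share of a document's mixture that comes from its own content. The function names `combined_topics` and `rank_documents` are likewise illustrative. Repeating the update `depth` times lets indirectly cited documents contribute transitively.

```python
def combined_topics(intrinsic, citations, stop_prob=0.5, depth=3):
    """Blend each document's own topic mixture with those of cited documents.

    intrinsic: dict mapping document id -> list of topic probabilities.
    citations: dict mapping document id -> list of cited document ids.
    stop_prob: assumed weight given to the document's own content.
    depth:     number of citation levels allowed to contribute.
    """
    current = {doc: list(mix) for doc, mix in intrinsic.items()}
    for _ in range(depth):
        updated = {}
        for doc, own in intrinsic.items():
            cited = [c for c in citations.get(doc, []) if c in intrinsic]
            if not cited:
                # No outgoing citations: only the intrinsic content counts.
                updated[doc] = list(own)
                continue
            # Second set: average mixture of the cited documents, which
            # already carries influence from earlier citation levels.
            inherited = [
                sum(current[c][k] for c in cited) / len(cited)
                for k in range(len(own))
            ]
            # Third set: weighted combination of own and inherited topics.
            updated[doc] = [
                stop_prob * o + (1.0 - stop_prob) * i
                for o, i in zip(own, inherited)
            ]
        current = updated
    return current


def rank_documents(query_mix, mixtures):
    """Order document ids by dot-product similarity to a query topic mixture."""
    def score(doc):
        return sum(q * t for q, t in zip(query_mix, mixtures[doc]))
    return sorted(mixtures, key=score, reverse=True)


# Toy data: A cites B, and B cites C, so C reaches A only indirectly.
intrinsic = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.5, 0.5]}
citations = {"A": ["B"], "B": ["C"]}

mixtures = combined_topics(intrinsic, citations, stop_prob=0.5, depth=2)
ranking = rank_documents([0.0, 1.0], mixtures)
```

With this toy data, document A ends with the mixture [0.625, 0.375]: half from its own content, and the rest from B, whose mixture has itself been shifted by C. Raising `depth` lets longer citation chains contribute, and each additional level is discounted by a further factor of `1 - stop_prob`.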
