SYSTEMS AND METHODS FOR CHARACTERIZING LINKED DOCUMENTS USING A LATENT TOPIC MODEL
Abstract
Systems and methods are disclosed for extracting characteristics from a corpus of linked documents by deriving a content link model that explicitly captures direct and indirect relations represented by the links, and extracting document topics and the topic distributions for all the documents in the corpus using the content-link model.
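By way of illustration only, the distinction between direct and indirect relations can be sketched in Python. This is not the content-link model of the disclosure; it is a minimal example that assumes links are given as a dictionary from each document to the documents it links to, and that an indirect relation is weakened by a hypothetical `decay` factor for every additional hop. The names `relation_weights`, `decay`, and `max_hops` are assumptions introduced for this sketch.

```python
from collections import deque


def relation_weights(links, source, decay=0.5, max_hops=3):
    """Return {doc: weight} for documents reachable from `source`.

    A directly linked document gets weight 1.0; each additional hop
    multiplies the weight by `decay` (a hypothetical damping factor,
    not taken from the disclosure).
    """
    weights = {}
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        doc, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in links.get(doc, []):
            if target not in seen:
                seen.add(target)
                # hops == 0 means `target` is directly linked from `source`
                weights[target] = decay ** hops
                queue.append((target, hops + 1))
    return weights


# d3 links to d2, and d2 links to d1, so d1 is indirectly related to d3.
links = {"d3": ["d2"], "d2": ["d1"], "d1": []}
weights = relation_weights(links, "d3")
```

Here `d2` is a direct relation of `d3` and `d1` is an indirect one reached through `d2`, so `d1` receives a smaller weight.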
20 Claims
1. A method for characterizing a corpus of documents each having one or more links, comprising:
a. forming a Bayesian network using the documents;
b. determining a Bayesian network structure using the one or more links;
c. generating a content link model; and
d. determining one or more topics in the corpus and topic distribution for each document.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
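By way of illustration only, steps a and b of claim 1 can be sketched in Python as building a directed graph over the documents from their links. The sketch assumes that each document's parents in the network are the documents it links to, and that the links are acyclic (as with citations to earlier documents); neither assumption is stated in the claim. It does not cover steps c and d. The function names `network_structure` and `ancestral_order` are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def network_structure(links):
    """Map each document to its parent documents.

    Assumption for this sketch: a document's parents are the documents
    it links to.
    """
    parents = {doc: set(targets) for doc, targets in links.items()}
    # Documents that appear only as link targets are still nodes.
    for targets in links.values():
        for target in targets:
            parents.setdefault(target, set())
    return parents


def ancestral_order(parents):
    """Return documents ordered so that parents come before children.

    Raises ValueError if the links contain a cycle, because a Bayesian
    network must be a directed acyclic graph.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as err:
        raise ValueError("links contain a cycle") from err


# d3 links to d1 and d2; d2 links to d1.
links = {"d3": ["d1", "d2"], "d2": ["d1"]}
parents = network_structure(links)
order = ancestral_order(parents)
```

An ordering of this kind is what lets per-document quantities be computed parents first; how the content link model and the topic distributions are then obtained is left to the specification.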
14. A method for extracting characteristics from a corpus of linked documents, comprising:
a. deriving a content link model that explicitly captures direct and indirect relations represented by the links, and
b. extracting document topics and the topic distributions for all the documents in the corpus using the content-link model.
Dependent claims: 15, 16, 17, 18
19. A system for extracting characteristics from a corpus of linked documents, comprising:
a. computer readable code to derive a content link model that explicitly captures direct and indirect relations represented by the links, and
b. computer readable code to extract document topics and the topic distributions for all the documents in the corpus using the content-link model.
20. A system for characterizing a corpus of documents each having one or more links, comprising:
a. means for forming a Bayesian network using the documents;
b. means for determining Bayesian network structure using the one or more links;
c. means for generating a content link model; and
d. means for applying the content link model to determine one or more topics in the corpus and topic distribution for each document.
Specification