MINING GEOGRAPHIC KNOWLEDGE USING A LOCATION AWARE TOPIC MODEL
Abstract
Mining geographic knowledge using a location aware topic model is provided. A location system estimates topics and locations associated with documents based on a location aware topic (“LAT”) model. The location system generates the model from a collection of documents that are labeled with their associated locations. The location system generates collection level parameters based on an LDA-style model. To generate the collection level parameters, the location system estimates probabilities of latent topics, locations, and words of the collection. After the model is generated, the location system uses the collection level parameters to estimate probabilities of topics and locations being associated with target documents.
20 Claims
1. A method in a computing device for identifying a location associated with a document, the method comprising:
- providing a collection of documents of words, each document associated with a location;
- generating collection level parameters for a latent Dirichlet allocation style model for the collection of documents that is based on latent topics and the location of each document, the collection level parameters indicating a probability that a document in the collection relates to each latent topic, a probability that each word of the collection relates to each latent topic, and a probability that each location of the collection relates to each latent topic; and
- estimating, using the collection level parameters, a probability that a location is associated with the document based on an aggregation of, for each topic, the conditional probability of the location given the topic and the conditional probability of the topic given the document.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
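The estimating step of claim 1 aggregates, over the latent topics, the conditional probability of the location given the topic and the conditional probability of the topic given the document. One plausible reading of that aggregation is a sum of products, P(l | d) = Σ_z P(l | z) · P(z | d). The sketch below illustrates only that reading. The function name, the dictionary layout, and every probability value are hypothetical and are not taken from the patent; the sketch also assumes the collection level parameters have already been generated.

```python
def location_given_document(p_loc_given_topic, p_topic_given_doc):
    """Aggregate P(location | topic) * P(topic | document) over all topics.

    p_loc_given_topic: dict mapping topic -> {location: P(location | topic)}
    p_topic_given_doc: dict mapping topic -> P(topic | document)
    Returns a dict mapping location -> estimated P(location | document).
    """
    scores = {}
    for topic, p_topic in p_topic_given_doc.items():
        for location, p_loc in p_loc_given_topic[topic].items():
            # Add this topic's contribution to the location's total.
            scores[location] = scores.get(location, 0.0) + p_loc * p_topic
    return scores


# Hypothetical parameters for two latent topics and two locations.
p_loc_given_topic = {
    "beach": {"Hawaii": 0.7, "Seattle": 0.3},
    "coffee": {"Hawaii": 0.1, "Seattle": 0.9},
}
# Hypothetical topic mixture for one document.
p_topic_given_doc = {"beach": 0.25, "coffee": 0.75}

result = location_given_document(p_loc_given_topic, p_topic_given_doc)
best_location = max(result, key=result.get)
```

With these made-up numbers, the document's topic mixture leans toward the "coffee" topic, so the aggregation favors the location that topic is most strongly tied to.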
10. A computer-readable medium encoded with instructions for controlling a computing device to estimate topics and locations associated with target documents, by a method comprising:
- providing a collection of documents of words, each word of a document associated with a location;
- generating collection level parameters for a latent Dirichlet allocation style model for the collection of documents based on latent topics and the location of each document, the collection level parameters relating to probabilities of latent topics, locations, and words of the collection; and
- estimating, using the collection level parameters, probabilities of topics and locations being associated with each of the target documents.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
18. A computing device for determining topics and locations associated with a target document, comprising:
- a document store having a collection of documents of words, each word of a document associated with a location;
- a component that generates collection level parameters for a latent Dirichlet allocation model for the collection of documents based on latent topics and the location of each document, the collection level parameters relating to probabilities of latent topics, locations, and words of the collection; and
- a component that estimates, using the collection level parameters, probabilities of topics and locations being associated with the target document.
Dependent claims: 19, 20
Specification