Mining topic-related aspects from user generated content
First Claim
1. A memory having computer executable instructions encoded thereon, the computer executable instructions executed by a processor to perform location-related mining operations, the operations comprising:
identifying a particular travelogue;
decomposing the particular travelogue by identifying at least two non-overlapping segments of the particular travelogue, each segment including a representation of at least one location;
representing a collection of travelogues with a term-document matrix, the collection of travelogues comprising the particular travelogue, and each word of the particular travelogue representing:
    a location, a local topic, and a term in a sequence; or
    a global topic and a term in a sequence;
using a probabilistic topic model, decomposing the term-document matrix into one or more matrices comprising:
    a term-local topic matrix;
    a local topic-location matrix; or
    a location-document matrix; and
representing a particular location by a multinomial distribution over local topics while associating a document with a multinomial distribution over global topics.
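The operations recited above can be illustrated with a minimal sketch. It is not the patented implementation: the gazetteer, the helper names (segment_by_location, term_document_matrix, matmul), and every probability value are assumptions chosen for illustration, and the three factor matrices are hand-picked rather than learned by a probabilistic topic model. The sketch shows only how a travelogue might be split into non-overlapping segments that each contain a location, how a term-document matrix is counted, and how term-local topic, local topic-location, and location-document matrices compose into a per-document term distribution.

```python
from collections import Counter

def segment_by_location(tokens, gazetteer):
    """Split tokens into non-overlapping segments.

    A new segment starts at a location mention once the current segment
    already holds a location, so every segment contains at least one
    location whenever the text mentions any.
    """
    segments, current, has_location = [], [], False
    for token in tokens:
        is_location = token in gazetteer
        if is_location and has_location:
            segments.append(current)
            current = []
        current.append(token)
        has_location = has_location or is_location
    if current:
        segments.append(current)
    return segments

def term_document_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] counts term i in document j."""
    vocab = sorted({word for doc in docs for word in doc})
    row = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(docs) for _ in vocab]
    for j, doc in enumerate(docs):
        for word, count in Counter(doc).items():
            matrix[row[word]][j] = count
    return vocab, matrix

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Hand-picked (not learned) factors; every column sums to 1.
TERMS = ["beach", "surf", "volcano", "hike"]
# rows: terms, columns: local topics (seaside, volcano) -> p(term | local topic)
TERM_LOCAL_TOPIC = [[0.5, 0.0], [0.5, 0.0], [0.0, 0.5], [0.0, 0.5]]
# rows: local topics, columns: locations (honolulu, hilo) -> p(local topic | location)
LOCAL_TOPIC_LOCATION = [[0.75, 0.25], [0.25, 0.75]]
# rows: locations, columns: documents -> p(location | document)
LOCATION_DOCUMENT = [[1.0, 0.5], [0.0, 0.5]]

# Local-topic part of p(term | document): the product of the three factors.
TERM_GIVEN_DOCUMENT = matmul(
    matmul(TERM_LOCAL_TOPIC, LOCAL_TOPIC_LOCATION), LOCATION_DOCUMENT
)
```

In this toy setup each column of LOCAL_TOPIC_LOCATION plays the role of the multinomial distribution over local topics that represents one location. Global topics, which the claim associates with documents rather than locations, are left out of the sketch.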
Abstract
Described herein is a technology that facilitates efficient automated mining of topic-related aspects of user generated content based on automated analysis of the user generated content. Locations are automatically learned based on dividing documents into document segments, and decomposing the segments into local topics and global topics. Techniques described herein include, for example, computer annotating travelogues with learned tags, performing topic learning to obtain an interest model, and performing location matching based on the interest model.
305 Citations
20 Claims
1. A memory having computer executable instructions encoded thereon, the computer executable instructions executed by a processor to perform location-related mining operations, the operations comprising:
identifying a particular travelogue;
decomposing the particular travelogue by identifying at least two non-overlapping segments of the particular travelogue, each segment including a representation of at least one location;
representing a collection of travelogues with a term-document matrix, the collection of travelogues comprising the particular travelogue, and each word of the particular travelogue representing:
    a location, a local topic, and a term in a sequence; or
    a global topic and a term in a sequence;
using a probabilistic topic model, decomposing the term-document matrix into one or more matrices comprising:
    a term-local topic matrix;
    a local topic-location matrix; or
    a location-document matrix; and
representing a particular location by a multinomial distribution over local topics while associating a document with a multinomial distribution over global topics.
Dependent claims: 2, 3, 4, 5
6. A method comprising:
identifying a travelogue for location-related mining;
decomposing the travelogue;
representing a decomposed travelogue with a term-document matrix, wherein each word from the travelogue represents one of:
    a local topic; or
    a global topic;
selecting a candidate set of travelogues based at least on the local topic;
ranking the travelogues in the candidate set of travelogues based at least on the local topic; and
returning travelogues in the candidate set of travelogues based at least on the ranking.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14
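The selecting, ranking, and returning steps of this method can be sketched as follows. The claim does not say how candidates are chosen or scored, so the threshold rule, the descending sort by topic weight, and all names here (recommend_travelogues, min_weight, top_k, the "local_topics" field) are hypothetical. The sketch assumes each travelogue already carries an inferred distribution over local topics.

```python
def recommend_travelogues(travelogues, local_topic, min_weight=0.1, top_k=3):
    """Select, rank, and return travelogues for one local topic.

    travelogues: list of dicts with an "id" and a "local_topics" mapping
    from topic name to weight (assumed to come from an earlier
    topic-learning step that is not shown here).
    """
    # Select the candidate set: travelogues that give the topic enough weight.
    candidates = [
        (t["id"], t["local_topics"].get(local_topic, 0.0)) for t in travelogues
    ]
    candidates = [c for c in candidates if c[1] >= min_weight]
    # Rank the candidates by their weight on the local topic, highest first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    # Return the identifiers of the best-ranked travelogues.
    return [travelogue_id for travelogue_id, _ in ranked[:top_k]]

EXAMPLE_TRAVELOGUES = [
    {"id": "t1", "local_topics": {"diving": 0.6, "hiking": 0.1}},
    {"id": "t2", "local_topics": {"diving": 0.05, "museums": 0.7}},
    {"id": "t3", "local_topics": {"diving": 0.8}},
    {"id": "t4", "local_topics": {"hiking": 0.9}},
]
```

With the example data, a query for "diving" keeps t1 and t3 as candidates and places t3 first because it has the larger weight on that topic.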
15. A computer-implemented method comprising:
in response to Internet browsing activities, identifying a collection of user generated content;
searching an image library for images having associated descriptive data that is similar to text in the collection of user generated content;
processing the descriptive data of the images to derive a topic for the collection of user generated content;
selecting a recommendation based at least in part on the topic derived; and
in further response to the Internet browsing activities, presenting the recommendation.
Dependent claims: 16, 17, 18, 19, 20
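One way to picture the image-library steps of this method is the sketch below. The claim names no similarity measure or topic-derivation rule, so Jaccard overlap between word sets, the most-frequent-tag rule, the threshold value, and every name here (jaccard, derive_topic, select_recommendation, the "tags" field, the recommendation table) are assumptions made for illustration only.

```python
from collections import Counter

def jaccard(a, b):
    """Jaccard similarity between two collections of words."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def derive_topic(content_text, image_library, min_similarity=0.2):
    """Derive a topic from the tags of images that resemble the text.

    image_library: list of dicts with an "id" and a list of "tags"
    (standing in for the images' descriptive data).
    """
    words = content_text.lower().split()
    tag_counts = Counter()
    for image in image_library:
        # Keep only images whose descriptive data is similar to the text.
        if jaccard(words, image["tags"]) >= min_similarity:
            tag_counts.update(image["tags"])
    # Assumed rule: the most frequent tag among matching images is the topic.
    return tag_counts.most_common(1)[0][0] if tag_counts else None

def select_recommendation(topic, recommendations):
    """Look up a recommendation for the derived topic, if one exists."""
    return recommendations.get(topic)

EXAMPLE_LIBRARY = [
    {"id": "img1", "tags": ["diving", "coral", "reef"]},
    {"id": "img2", "tags": ["diving", "snorkeling", "boat"]},
    {"id": "img3", "tags": ["skiing", "snow"]},
]
EXAMPLE_RECOMMENDATIONS = {"diving": "dive gear rental"}  # hypothetical table

EXAMPLE_TOPIC = derive_topic(
    "snorkeling and diving near the coral reef", EXAMPLE_LIBRARY
)
```

In the example, img1 and img2 pass the similarity threshold, "diving" is the only tag they share, and so it becomes the derived topic used to look up a recommendation.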
Specification