TARGET BASED INDEXING OF MICRO-BLOG CONTENT
Abstract
Target based indexing of micro-blog content may include extracting, labeling, and indexing data contained in micro-blog entries. For example, by adapting natural language processing (NLP) technologies to a micro-blog entry, data is extracted in order to create an index. In one embodiment, a search engine may access the index in order to return results of a search query. In another embodiment, a user interface may display micro-blog entries categorically, allowing the user to access micro-blog entries by event, quote, opinion, or other category.
20 Claims
1. A system, comprising:
one or more processors; and
memory, communicatively coupled to the one or more processors,
a data extraction module stored in the memory and executable by the processor to:
pre-process a micro-blog entry; and
extract data from the micro-blog entry based at least in part on one or more natural language processing technologies, the one or more natural language processing technologies including named entity recognition (NER) to locate and classify elements in the micro-blog entry into predefined categories, the NER comprising a combination of a k-nearest neighbor (KNN) classifier with a conditional random field (CRF) labeler;
a classification module stored in the memory and executable by the processor to classify the micro-blog entry into pre-defined categories; and
an index module stored in the memory and executable by the processor to:
index the extracted data and the micro-blog entry;
receive a request; and
provide the extracted data and the micro-blog entry based on the request.
Dependent claims: 2, 3, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
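One possible reading of the "combination of a KNN classifier with a CRF labeler" in claim 1 is that a word-level KNN vote produces a preliminary label, which is then passed to the sequence labeler as one more feature. The sketch below illustrates only that hand-off, under that assumption; the claim does not say how the two are combined. The feature set, the toy training words, and the names `word_vector`, `knn_label`, and `token_features` are hypothetical, and no CRF is trained here.

```python
from collections import Counter


def word_vector(word):
    """Map a word to a small tuple of 0/1 shape features (hypothetical feature set)."""
    return (
        int(word[:1].isupper()),                # starts with a capital letter
        int(any(ch.isdigit() for ch in word)),  # contains a digit
        int(word.startswith("@")),              # looks like a user mention
        int(word.startswith("#")),              # looks like a hashtag
    )


def knn_label(word, training, k=3):
    """Return the majority label among the k nearest training words (Hamming distance)."""
    target = word_vector(word)
    ranked = sorted(
        training,
        key=lambda pair: sum(a != b for a, b in zip(target, word_vector(pair[0]))),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def token_features(tokens, training, k=3):
    """Build one feature dict per token, including the KNN pre-label.

    Dicts of this shape are the kind of input a CRF sequence labeler usually takes.
    """
    features = []
    for i, tok in enumerate(tokens):
        features.append({
            "lower": tok.lower(),
            "is_title": tok[:1].isupper(),
            "prev": tokens[i - 1].lower() if i > 0 else "<s>",
            "knn_label": knn_label(tok, training, k),
        })
    return features


# Toy (word, label) pairs, invented for illustration only.
TRAINING = [
    ("Seattle", "LOCATION"), ("Paris", "LOCATION"), ("London", "LOCATION"),
    ("@alice", "PERSON"), ("@bob", "PERSON"), ("@carol", "PERSON"),
    ("the", "O"), ("in", "O"), ("rain", "O"), ("today", "O"),
]

feats = token_features(["Rain", "in", "Tokyo", "says", "@dave"], TRAINING)
```

With these toy features the capitalized "Rain" also receives a LOCATION pre-label from its shape alone, which is why the pre-label is treated here as one feature among several rather than as the final answer.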
4. (canceled)
5. (canceled)
16. A method comprising:
under control of one or more processors:
generating one or more indexes of micro-blog entries based at least in part on one or more natural language processing technologies including named entity recognition (NER), the NER comprising a combination of a k-nearest neighbor (KNN) classifier with a conditional random field (CRF) labeler;
receiving, at a processing server, a search query;
processing the search query against the one or more indexes of micro-blog entries, the indexes being configured to search the micro-blog entries based on a category associated with each micro-blog entry;
surfacing categories of micro-blogs related to the search query; and
making the categories available for access or display.
Dependent claims: 17, 18
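The query-processing steps of claim 16 can be sketched as follows, assuming a simple in-memory inverted index in which every entry already carries a category. The `CategoryIndex` class, its method names, and the sample entries are hypothetical and stand in for whatever index structure an implementation would actually use.

```python
from collections import Counter, defaultdict


class CategoryIndex:
    """In-memory index of micro-blog entries, searchable by term and by category."""

    _PUNCT = ".,!?\"'"

    def __init__(self):
        self._entries = {}                # entry_id -> (text, category)
        self._by_term = defaultdict(set)  # term -> {entry_id, ...}

    @classmethod
    def _terms(cls, text):
        """Lower-case the text and split it into punctuation-stripped terms."""
        stripped = (t.strip(cls._PUNCT).lower() for t in text.split())
        return [t for t in stripped if t]

    def add(self, entry_id, text, category):
        """Index one entry under each of its terms and remember its category."""
        self._entries[entry_id] = (text, category)
        for term in self._terms(text):
            self._by_term[term].add(entry_id)

    def search(self, query, category=None):
        """Return ids of entries containing every query term, optionally within one category."""
        terms = self._terms(query)
        if not terms:
            return []
        ids = set.intersection(*(self._by_term.get(t, set()) for t in terms))
        if category is not None:
            ids = {i for i in ids if self._entries[i][1] == category}
        return sorted(ids)

    def surface_categories(self, query):
        """Count, per category, the entries that match the query."""
        return Counter(self._entries[i][1] for i in self.search(query))


index = CategoryIndex()
index.add(1, "Concert in Seattle tonight!", "event")
index.add(2, "Seattle weather is awful", "opinion")
index.add(3, '"Stay hungry" said the speaker in Seattle', "quote")
index.add(4, "Loved the concert", "opinion")

categories = index.surface_categories("seattle")
opinions_about_concert = index.search("concert", category="opinion")
```

Here `categories` holds one count per category for the matching entries, which is the information a front end would need in order to make the categories available for access or display.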
19. One or more computer readable storage media encoded with instructions that, when executed, direct a computing device to perform operations comprising:
repeatedly downloading micro-blog entries;
filtering the micro-blog entries based on a number of terms in each entry;
applying named entity recognition to locate and classify elements in each entry into pre-defined categories, the named entity recognition comprising a combination of a k-nearest neighbor (KNN) classifier with a conditional random field (CRF) labeler;
applying semantic role labeling to identify each predicate in the micro-blog entries and an argument associated with each predicate in order to assign a label to each entry;
applying sentiment analysis to determine an opinion of a request and classify an opinion of each entry based on its relation to the opinion in the request;
indexing the pre-defined categories, the label, and the opinion associated with each entry;
receiving a search query;
in response to receiving the search query, returning search results based on the indexing, the search results including both the micro-blog entries and the pre-defined categories, the label, and the opinion associated with each entry; and
making the search results available to a web application.
Dependent claim: 20
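Two of the steps in claim 19, filtering by number of terms and classifying an entry's opinion relative to the request, are simple enough to sketch directly. The example below assumes a minimum-length filter and signed integer polarity scores supplied by some sentiment analyzer; the claim fixes neither the direction of the threshold nor the scoring scheme. `IndexRecord`, `filter_by_term_count`, `classify_opinion`, and the "agrees"/"disagrees"/"neutral" labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """The per-entry data that claim 19 indexes: categories, label, and opinion."""
    entry: str
    categories: tuple
    label: str
    opinion: str


def filter_by_term_count(entries, min_terms=3):
    """Keep entries with at least `min_terms` whitespace-separated terms.

    The threshold and its direction are assumptions; the claim only says
    that filtering is based on the number of terms in each entry.
    """
    return [e for e in entries if len(e.split()) >= min_terms]


def classify_opinion(entry_polarity, request_polarity):
    """Classify an entry's opinion by its relation to the opinion in the request.

    Both arguments are signed scores (positive, negative, or zero) assumed to
    come from a separate sentiment analysis step that is not shown here.
    """
    if entry_polarity == 0 or request_polarity == 0:
        return "neutral"
    same_sign = (entry_polarity > 0) == (request_polarity > 0)
    return "agrees" if same_sign else "disagrees"


downloaded = ["lol", "New phone launch today in Tokyo", "so tired"]
kept = filter_by_term_count(downloaded)

# The categories and label would come from NER and semantic role labeling;
# the values below are placeholders.
record = IndexRecord(
    entry=kept[0],
    categories=("LOCATION",),
    label="launch",
    opinion=classify_opinion(entry_polarity=1, request_polarity=-1),
)
```

Short entries such as "lol" are dropped before any further processing, and the surviving entry is stored together with the three fields that the claim says are indexed and returned with search results.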
Specification