Natural language processing optimized for micro content

  • US 8,938,450 B2
  • Filed: 08/30/2013
  • Issued: 01/20/2015
  • Est. Priority Date: 02/17/2012
  • Status: Active Grant
First Claim

1. A method for microcontent natural language processing comprising:

    receiving a plurality of microcontent messages from a social networking server, the plurality of microcontent messages including a microcontent message;

    breaking up the microcontent message into one or more text tokens by using a tokenizer module that is configured to process micro-syntax and punctuation in the microcontent message;

    performing a part-of-speech (POS) tagging process on the text tokens to identify a linguistic category for each of the text tokens, wherein the POS tagging process for a respective text token is performed using an error-driven transformation-based tagger and based on a definition and a context of the respective text token;

    performing a topic extraction on the microcontent message to extract topic metadata for the microcontent message based on the identified linguistic category for each of the text tokens, wherein the extraction is performed without looking up a pre-specified topic in a dictionary of known entities;

    associating a topic metadata to the microcontent message based on the extracted topic;

    identifying type metadata from the microcontent message based on an ontology of predetermined microcontent types by applying a database of annotation rules to the text tokens of the microcontent message;

    associating the identified type metadata to the microcontent message;

    analyzing co-occurrence of all available metadatas in the plurality of microcontent messages;

    producing a list of trending topics for the all available metadatas based on results from the analyzing; and

    compiling a trend database by tracking how the list of trending topics changes over time.
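
As an illustration only, the tokenizing, co-occurrence, trending and trend-tracking steps of the claim can be sketched in Python as below. This is not the patented implementation. The regular expression, every function name, and the window labels are hypothetical, and hashtags and @mentions are used as a crude stand-in for the metadata that the claim derives from POS tagging, topic extraction and annotation rules.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical micro-syntax pattern (an assumption, not taken from the patent).
TOKEN_RE = re.compile(
    r"https?://\S+"      # URLs are kept as a single token
    r"|[@#]\w+"          # @mentions and #hashtags
    r"|\w+(?:'\w+)?"     # plain words, optionally with an apostrophe part
    r"|[^\w\s]"          # any other single punctuation character
)

def tokenize(message):
    """Break a microcontent message into text tokens."""
    return TOKEN_RE.findall(message)

def extract_metadata(tokens):
    """Stand-in for the claimed extraction: keep hashtags and mentions,
    lowercased, deduplicated and sorted."""
    return sorted({t.lower() for t in tokens if len(t) > 1 and t[0] in "#@"})

def cooccurrence(messages):
    """Count how often each pair of metadata items shares a message."""
    pairs = Counter()
    for message in messages:
        pairs.update(combinations(extract_metadata(tokenize(message)), 2))
    return pairs

def trending(messages, top_n=3):
    """Return the top_n most frequent metadata items with their counts."""
    counts = Counter()
    for message in messages:
        counts.update(extract_metadata(tokenize(message)))
    return counts.most_common(top_n)

def build_trend_db(windows, top_n=3):
    """Map each time-window label to its trending list, so that changes
    between windows can be compared later."""
    return {label: trending(msgs, top_n) for label, msgs in windows.items()}

if __name__ == "__main__":
    sample = {
        "09:00": ["RT @alice: loving #NLP! http://t.co/x", "#nlp and #ai with @alice"],
        "10:00": ["#ai rocks", "#ai again"],
    }
    print(build_trend_db(sample, top_n=1))
```

Here each key of `sample` stands for one time window. Comparing the lists stored under successive keys gives a simple picture of how the trending items change over time.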
