Methods for analyzing text

  • US 9,336,192 B1
  • Filed: 11/26/2013
  • Issued: 05/10/2016
  • Est. Priority Date: 11/28/2012
  • Status: Active Grant
First Claim

1. A method for analyzing text, comprising:

    providing a processor on a first computer, wherein the processor runs a content acquisition system to obtain a text document over a computer network, wherein the text document comprises a hashtag immediately preceded by one or more words and immediately followed by one or more words;

    storing the text document obtained from the content acquisition system on a storage device;

    providing a processor on the first computer which runs a text analytics engine, wherein the text analytics engine comprises a hashtag detector, a sentiment recognizer, a named entity recognizer, and a sentiment assignor;

    using the text analytics engine to access the text document and perform an analysis to generate metadata, comprising:

    having the hashtag detector recognize a hashtag and the one or more words immediately following the hashtag;

    having the named entity recognizer identify the one or more words immediately following the hashtag;

    having the sentiment recognizer select words from the one or more words immediately preceding the hashtag which have sentiment;

    assigning the one or more words having sentiment to the one or more words immediately following the hashtag;

    providing a processor on the first computer which runs a weighting multiplier; and

    measuring a repetition of letters in the one or more words having sentiment, comprising:

    preprocessing a dictionary of terms into a tree such that each letter in a word corresponds to a node in the tree, with subsequent letters corresponding to branchings in the tree, and wherein leaves of the tree point to the term preprocessed;

    where lookup is accomplished by processing of each letter in a word discovered in a novel text being processed where processing is accomplished by following the branches in the tree corresponding to subsequent letters, except that repeated letters do not follow branches but instead increment a counter recording the number of letters repeated, where the successful arrival at a leaf returns both the term discovered and the letter repetition counter; and

    the letter repetition counter is used to calculate a weight multiplier greater or equal to 1 for the term;

    storing the metadata in a database;

    providing a processor from a second computer that runs an application which accesses the database.
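
The letter-repetition steps of the claim (the dictionary tree, the lookup that counts repeated letters, and the weight multiplier) can be illustrated with a minimal Python sketch. This is one possible reading, not the patented implementation. The names `TrieNode`, `build_trie`, `lookup`, and `weight_multiplier` are hypothetical, and so is the linear weight formula: the claim only requires a multiplier greater than or equal to 1. The sketch also assumes that a letter equal to the previous one follows a branch when one exists (so that a dictionary term such as "cool" still matches) and is counted as a repetition otherwise, with no backtracking. Terms are stored on the node where they end, which is a leaf only if no longer term shares the prefix.

```python
class TrieNode:
    """One node per letter; `term` is set where a dictionary term ends."""

    def __init__(self):
        self.children = {}
        self.term = None


def build_trie(terms):
    """Preprocess a dictionary of terms into a letter tree."""
    root = TrieNode()
    for term in terms:
        node = root
        for letter in term:
            node = node.children.setdefault(letter, TrieNode())
        node.term = term
    return root


def lookup(root, word):
    """Return (term, repeat_count) for `word`, or None if no term matches.

    Assumption: a letter with no branch that equals the previous letter
    is treated as an emphatic repetition and only increments the counter.
    """
    node = root
    repeats = 0
    previous = None
    for letter in word:
        child = node.children.get(letter)
        if child is not None:
            node = child            # follow the branch for this letter
        elif letter == previous:
            repeats += 1            # repeated letter: count it, stay put
        else:
            return None             # letter fits no branch and is not a repeat
        previous = letter
    if node.term is None:
        return None
    return node.term, repeats


def weight_multiplier(repeats, step=0.5):
    """Hypothetical linear weight; always >= 1 for non-negative repeats."""
    return 1.0 + step * repeats


trie = build_trie(["so", "cool", "good"])
print(lookup(trie, "sooo"))    # ('so', 2)
print(lookup(trie, "coool"))   # ('cool', 1)
print(weight_multiplier(2))    # 2.0
```

Because the sketch is greedy, an input whose doubled letter happens to match a branch of a different, longer term can miss a shorter term that would have matched with a repetition; a fuller version would need backtracking.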
