
Method and apparatus for providing improved HMM POS tagger for multi-word entries and factoids

  • US 7,124,074 B2
  • Filed: 06/14/2005
  • Issued: 10/17/2006
  • Est. Priority Date: 07/17/2001
  • Status: Expired due to Fees
First Claim

1. A method of calculating trigram path probabilities for an input string of text, the method comprising:

  • tokenizing the input string of text to create a plurality of parse leaf units (PLUs);

    constructing a PosColumn for each word, multi-word-entry (MWE), factoid and character in the input string of text which has a unique first (Ft) and last (Lt) token pair associated therewith;

    constructing all TrigramColumns corresponding to the input string of text, wherein each TrigramColumn defines a corresponding TrigramNode representing a trigram for three PosColumns in the TrigramColumn, each TrigramNode being identifiable by a unique set of three tokens;

    determining, for each TrigramColumn, all neighboring TrigramColumns to the immediate left and to the immediate right;

    calculating a forward trigram path probability, for each separate TrigramNode of each TrigramColumn, of all forward paths from a TrigramNode in a right neighboring TrigramColumn through the separate TrigramNode;

    calculating a backward trigram path probability, for each separate TrigramNode of each TrigramColumn, of all backward paths from a TrigramNode in a left neighboring TrigramColumn through the separate TrigramNode; and

    calculating sums of all trigram path probabilities through each PLU as a function of the calculated forward and backward trigram path probabilities.
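
The steps of the claim can be illustrated with a small forward-backward pass over a lattice of overlapping units. The following is only a sketch under stated assumptions, not the patented implementation: a PLU is modeled as a hypothetical `(ft, lt, tag)` tuple, the PosColumn and TrigramColumn groupings are collapsed into a flat list of trigram nodes, `emit` and `trans` are placeholder probability sources, two boundary units are padded on each side, and "forward" is taken to mean accumulation from the start of the string, as in the usual forward-backward convention.

```python
from collections import defaultdict

def plu_path_sums(plus, n_tokens, emit, trans):
    """Sum trigram path probabilities through each PLU (illustrative sketch).

    plus:  list of (ft, lt, tag) units over tokens 0..n_tokens-1
    emit:  dict mapping a PLU to a lexical probability (assumed; default 1.0)
    trans: function (tag1, tag2, tag3) -> trigram probability (assumed)
    """
    # Two boundary units per side, so every real PLU is the middle of a trigram.
    pads = [(-2, -2, "<s>"), (-1, -1, "<s>"),
            (n_tokens, n_tokens, "</s>"), (n_tokens + 1, n_tokens + 1, "</s>")]
    units = pads[:2] + list(plus) + pads[2:]
    starts_at = defaultdict(list)
    for u in units:
        starts_at[u[0]].append(u)

    def after(u):
        # Units whose first token immediately follows u's last token.
        return starts_at.get(u[1] + 1, [])

    # A trigram node is three contiguous units. Sorting by the third unit's
    # first token gives a left-to-right (topological) order.
    nodes = sorted(((a, b, c) for a in units for b in after(a) for c in after(b)),
                   key=lambda n: n[2][0])

    def step(node):
        a, b, c = node
        return trans(a[2], b[2], c[2]) * emit.get(c, 1.0)

    by_first_two = defaultdict(list)  # (a, b) -> nodes that begin with a, b
    by_last_two = defaultdict(list)   # (b, c) -> nodes that end with b, c
    for n in nodes:
        by_first_two[n[:2]].append(n)
        by_last_two[n[1:]].append(n)

    forward, backward = {}, {}
    for n in nodes:  # left neighbors (z, a, b) of node (a, b, c)
        left = by_last_two.get(n[:2], [])
        base = 1.0 if n[0] == pads[0] else sum(forward[m] for m in left)
        forward[n] = base * step(n)
    for n in reversed(nodes):  # right neighbors (b, c, d) of node (a, b, c)
        right = by_first_two.get(n[1:], [])
        backward[n] = (1.0 if n[2] == pads[3]
                       else sum(step(m) * backward[m] for m in right))

    # Every complete path through a PLU crosses exactly one node that has
    # that PLU in the middle position, so forward * backward can be summed.
    sums = {u: 0.0 for u in plus}
    for n in nodes:
        if n[1] in sums:
            sums[n[1]] += forward[n] * backward[n]
    return sums

# Toy input: two tokens, readable as two single words or as one MWE.
word1, word2, mwe = (0, 0, "N"), (1, 1, "N"), (0, 1, "MWE")
result = plu_path_sums([word1, word2, mwe], n_tokens=2,
                       emit={word1: 0.5, word2: 0.5, mwe: 0.5},
                       trans=lambda t1, t2, t3: 1.0)
print(result)
```

In this toy input the two-word reading scores 0.5 × 0.5 = 0.25 and the MWE reading scores 0.5, so each single-word PLU receives 0.25 and the MWE receives 0.5. Keying nodes by `(ft, lt)` spans rather than by token position is what lets units of different lengths share one lattice.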
