Method and apparatus for recognizing multiword expressions

  • US 7,346,511 B2
  • Filed: 12/13/2002
  • Issued: 03/18/2008
  • Est. Priority Date: 12/13/2002
  • Status: Expired due to Fees
First Claim

1. A method for identifying multiword expressions in an input string, comprising:

    morphologically analyzing words of the input string to replace words identified in the input string with their alternative base forms and parts of speech;

    using the analyzed words of the input string to compile the input string into a first finite-state network;

    matching the first finite-state network with a second finite-state network of multiword expressions to identify all subpaths of the first finite-state network that match one or more complete paths in the second finite-state network;

    each matching subpath of the first finite-state network and complete path of the second finite-state network identifying a multiword expression in the input string;

    wherein the analyzing is performed without disambiguating words in the input string to compile the first finite-state network with at least one path that identifies alternative base forms or parts of speech of a word in the input string; and

    wherein said matching comprises:

    i) generating a set of states comprising one state from the first finite-state network and one state from the second finite-state network;

    ii) pushing at least the set of states onto a stack, in order to record start states of potentially matching subnetworks of the first and second finite-state networks; and

    iii) recording start states of potentially matching subnetworks of the first and second finite-state networks.
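
The steps recited in the claim can be illustrated with a small sketch. It is not the patented implementation: the toy lexicon, the function names (analyze, compile_expressions, match), the tuple labels, and the trie layout of the second network are all assumptions made for illustration. The input is compiled into a chain-shaped network whose arcs carry every alternative base form and part of speech (no disambiguation), and pairs of states, one from each network, are pushed onto a stack while the start state of each candidate match is remembered.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> alternative (base form, part of speech).
LEXICON = {
    "he": [("he", "PRON")],
    "kicked": [("kick", "VERB")],
    "the": [("the", "DET")],
    "bucket": [("bucket", "NOUN"), ("bucket", "VERB")],  # left ambiguous on purpose
    "yesterday": [("yesterday", "ADV")],
}


def analyze(tokens):
    """Build the first network: state i has one arc per analysis of token i, leading to i + 1."""
    arcs = defaultdict(dict)
    for i, tok in enumerate(tokens):
        for label in LEXICON.get(tok.lower(), [(tok.lower(), "UNK")]):
            arcs[i][label] = i + 1
    return arcs, len(tokens)


def compile_expressions(expressions):
    """Build the second network as a trie; finals maps each final state to an expression name."""
    arcs, finals, next_state = defaultdict(dict), {}, 1
    for name, path in expressions.items():
        state = 0
        for label in path:
            if label not in arcs[state]:
                arcs[state][label] = next_state
                next_state += 1
            state = arcs[state][label]
        finals[state] = name
    return arcs, finals


def match(input_arcs, n_tokens, mwe_arcs, mwe_finals):
    """Return (start state, end state, expression) for each subpath matching a complete path."""
    matches = []
    for start in range(n_tokens):      # recorded start state of a candidate subpath
        stack = [(start, 0)]           # pair: (input-network state, expression-network state)
        while stack:
            s1, s2 = stack.pop()
            if s2 in mwe_finals:       # a complete path of the second network was consumed
                matches.append((start, s1, mwe_finals[s2]))
            for label, t1 in input_arcs.get(s1, {}).items():
                t2 = mwe_arcs.get(s2, {}).get(label)
                if t2 is not None:     # both networks have an arc with this label
                    stack.append((t1, t2))
    return matches


tokens = "He kicked the bucket yesterday".split()
mwes = {"kick the bucket": [("kick", "VERB"), ("the", "DET"), ("bucket", "NOUN")]}
input_arcs, n_tokens = analyze(tokens)
mwe_arcs, mwe_finals = compile_expressions(mwes)
print(match(input_arcs, n_tokens, mwe_arcs, mwe_finals))
# → [(1, 4, 'kick the bucket')]
```

In this sketch "bucket" keeps both its noun and verb readings in the first network, and the match succeeds through the noun arc only. The returned tuple gives the start and end states of the matching subpath, that is, tokens 1 through 3 of the input.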

  • 3 Assignments