Method and apparatus for mapping multiword expressions to identifiers using finite-state networks

  • US 7,552,051 B2
  • Filed: 12/13/2002
  • Issued: 06/23/2009
  • Est. Priority Date: 12/13/2002
  • Status: Expired due to Fees
First Claim

1. A method for mapping multiword expressions to identifiers using finite-state networks, comprising:

  • encoding each of a plurality of multiword expressions into a regular expression;

    each regular expression encoding a base form common to a plurality of derivative forms defined by ones of the multiword expressions;

    compiling with factorization each of the plurality of regular expressions into a set of finite-state networks;

    performing a union of the finite-state networks in the set of finite-state networks to define a multiword finite-state network and a set of subnets, each subnet comprising a distinct standard network having an associated distinct start state and an associated final state comprising an entry and exit point, respectively, comprising arcs and states distinct from arcs and states of the finite-state network;

    traversing the multiword finite-state network and the set of subnets to identify a path corresponding to one of the plurality of multiword expressions;

    wherein said traversing accounts for only transitions originating from the multiword finite-state network to ascertain a path number identifying a base form of the one of the plurality of multiword expressions; and

    wherein said factorization comprises inserting an arc with a label in the multiword finite-state network at each appearance of a repeating subnet in the set of subnets, where each label is a reference to a subnet in the set of subnets.
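
The claimed steps can be illustrated with a minimal sketch. It is not the patented implementation: the token-level networks, the `@TAKE` and `@KICK` subnet names, the sample expressions, and the helper functions `count_paths`, `subnet_ends`, and `path_number` are all assumptions made for illustration. The main network holds one path per base form, with labelled arcs that reference subnets wherever a set of derivative forms repeats. Only arcs of the main network contribute to the path number, so every derivative form of an expression maps to the same identifier.

```python
# Hypothetical subnets: each accepts the derivative forms of one base word.
SUBNETS = {
    "@TAKE": {"start": 0, "finals": {1},
              "arcs": {0: [(w, 1) for w in ("take", "takes", "took", "taken", "taking")]}},
    "@KICK": {"start": 0, "finals": {1},
              "arcs": {0: [(w, 1) for w in ("kick", "kicks", "kicked", "kicking")]}},
}

# Hypothetical main (multiword) network. Labels starting with "@" are
# references to subnets; "@TAKE" appears twice but is stored only once.
MAIN = {
    "start": 0, "finals": {3},
    "arcs": {
        0: [("@TAKE", 1), ("@KICK", 4), ("give", 6)],
        1: [("into", 2), ("place", 3)],
        2: [("account", 3)],
        4: [("the", 5)],
        5: [("bucket", 3)],
        6: [("and", 7)],
        7: [("@TAKE", 3)],
    },
}


def count_paths(net, state, memo=None):
    """Count accepting paths from state, treating a subnet arc as one arc."""
    memo = {} if memo is None else memo
    if state not in memo:
        total = 1 if state in net["finals"] else 0
        for _, target in net["arcs"].get(state, []):
            total += count_paths(net, target, memo)
        memo[state] = total
    return memo[state]


def subnet_ends(subnet, tokens, pos):
    """Return every token position at which the subnet can exit when entered at pos."""
    ends = set()
    stack = [(subnet["start"], pos)]
    while stack:
        state, i = stack.pop()
        if state in subnet["finals"]:
            ends.add(i)
        if i < len(tokens):
            for label, target in subnet["arcs"].get(state, []):
                if label == tokens[i]:
                    stack.append((target, i + 1))
    return ends


def path_number(tokens):
    """Return the main-network path number for tokens, or None if nothing matches."""
    def walk(state, i, number):
        if i == len(tokens) and state in MAIN["finals"]:
            return number
        # Passing through a final state skips the path that ends there.
        skipped = 1 if state in MAIN["finals"] else 0
        for label, target in MAIN["arcs"].get(state, []):
            if label in SUBNETS:
                # Transitions inside the subnet add nothing to the path number.
                next_positions = subnet_ends(SUBNETS[label], tokens, i)
            elif i < len(tokens) and label == tokens[i]:
                next_positions = {i + 1}
            else:
                next_positions = set()
            for j in sorted(next_positions):
                found = walk(target, j, number + skipped)
                if found is not None:
                    return found
            # Every path through this earlier sibling arc is numbered lower.
            skipped += count_paths(MAIN, target)
        return None

    return walk(MAIN["start"], 0, 0)
```

In this sketch, `path_number("took into account".split())` and `path_number("takes into account".split())` both return 0, while `path_number("kicked the bucket".split())` returns 2 and an unlisted sequence such as `"take the bucket"` returns `None`.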
