Method and apparatus for mapping multiword expressions to identifiers using finite-state networks
First Claim
1. A method for mapping multiword expressions to identifiers using finite-state networks, comprising:
- encoding each of a plurality of multiword expressions into a regular expression;
each regular expression encoding a base form common to a plurality of derivative forms defined by ones of the multiword expressions;
compiling with factorization each of the plurality of regular expressions into a set of finite-state networks;
performing a union of the finite-state networks in the set of finite-state networks to define a multiword finite-state network and a set of subnets;
traversing the multiword finite-state network and the set of subnets to identify a path corresponding to one of the plurality of multiword expressions;
wherein said traversing accounts for only transitions originating from the multiword finite-state network to ascertain a path number identifying a base form of the one of the plurality of multiword expressions.
3 Assignments
0 Petitions
Accused Products
Abstract
Multiword expressions are mapped to identifiers using finite-state networks. Each of a plurality of multiword expressions is encoded into a regular expression. Each regular expression encodes a base form common to a plurality of derivative forms defined by ones of the multiword expressions. Each of the plurality of regular expressions is compiled with factorization into a set of finite-state networks. A union of the finite-state networks in the set of finite-state networks is performed to define a multiword finite-state network and a set of subnets. The multiword finite-state network and the set of subnets are traversed to identify a path corresponding to one of the plurality of multiword expressions, wherein only transitions originating from the multiword finite-state network are accounted for to ascertain a path number identifying a base form of the one of the plurality of multiword expressions.
50 Citations
20 Claims
-
1. A method for mapping multiword expressions to identifiers using finite-state networks, comprising:
-
encoding each of a plurality of multiword expressions into a regular expression;
each regular expression encoding a base form common to a plurality of derivative forms defined by ones of the multiword expressions;
compiling with factorization each of the plurality of regular expressions into a set of finite-state networks;
performing a union of the finite-state networks in the set of finite-state networks to define a multiword finite-state network and a set of subnets;
traversing the multiword finite-state network and the set of subnets to identify a path corresponding to one of the plurality of multiword expressions;
wherein said traversing accounts for only transitions originating from the multiword finite-state network to ascertain a path number identifying a base form of the one of the plurality of multiword expressions. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. An apparatus for mapping multiword expressions to identifiers using finite-state networks, comprising:
-
means for encoding each of a plurality of multiword expressions into a regular expression;
each regular expression encoding a base form common to a plurality of derivative forms defined by ones of the multiword expressions;
means for compiling with factorization each of the plurality of regular expressions into a set of finite-state networks;
means for performing a union of the finite-state networks in the set of finite-state networks to define a multiword finite-state network and a set of subnets;
means for traversing the multiword finite-state network and the set of subnets to identify a path corresponding to one of the plurality of multiword expressions;
wherein said traversing means accounts for only transitions originating from the multiword finite-state network to ascertain a path number identifying a base form of the one of the plurality of multiword expressions. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
-
-
19. An article of manufacture for use in a machine, comprising:
-
a) a memory;
b) instructions stored in the memory for mapping multiword expressions to identifiers using finite-state networks, the instructions adapted to perform a method comprising;
encoding each of a plurality of multiword expressions into a regular expression;
each regular expression encoding a base form common to a plurality of derivative forms defined by ones of the multiword expressions;
compiling with factorization each of the plurality of regular expressions into a set of finite-state networks;
performing a union of the finite-state networks in the set of finite-state networks to define a multiword finite-state network and a set of subnets;
traversing the multiword finite-state network and the set of subnets to identify a path corresponding to one of the plurality of multiword expressions;
wherein said traversing accounts for only transitions originating from the multiword finite-state network to ascertain a path number identifying a base form of the one of the plurality of multiword expressions. - View Dependent Claims (20)
-
Specification