Method and apparatus for compiling context-dependent rewrite rules and input strings
First Claim
1. A method for compiling a context dependent rewrite rule having a right context portion, a left context portion, a replaced portion and a replacement portion into a finite-state transducer capable of rewriting an input string into an output string based on the context dependent rewrite rule, comprising:
- forming a right context marking finite-state transducer based on the right context portion, the right context marking finite-state transducer capable of inserting a right context marker into the input string before each occurrence of the right context portion in the input string;
forming a replaced portion marking finite-state transducer based on the replaced portion and the right context marker, the replaced portion marking finite-state transducer capable of inserting a pair of left context markers into the input string before an occurrence in the input string of the replaced portion when that occurrence of the replaced portion is followed by the right context marker;
forming a replace finite-state transducer based on the replaced portion, the right context marker and the left context markers, the replace finite-state transducer capable of replacing an occurrence in the input string of the replaced portion with the replacement portion when that occurrence of the replaced portion is preceded by the left context markers and is followed by the right context marker;
forming a left context presence verifying finite-state transducer based on the left context portion, the left context presence verifying finite-state transducer capable of accepting the input string containing the replacement portion in place of the replaced portion when a first left marker symbol is preceded by the left context portion in the input string; and
forming a left context absence verifying finite-state transducer based on the left context portion, the left context absence verifying finite-state transducer capable of rejecting the input string containing the replacement portion in place of the replaced portion when a second left marker symbol is not preceded by the left context portion in the input string.
1 Assignment
0 Petitions
Accused Products
Abstract
A system and method for compiling weighted context-dependent rewrite rules into weighted finite-state transducers introduces context marking symbols only when and where they are needed. In particular, the compiler and compiling method use a composition of five simple finite-state transducers generated from a weighted context-dependent rewrite rule to represent that rule. An "r" finite-state transducer is generated from the right context portion ρ of the weighted context-dependent rewrite rule. An "f" finite-state transducer is generated from the rewritten portion φ of the weighted context-dependent rewrite rule. A "Replace" finite-state transducer is generated from the rewritten and replacement portions φ and ψ of the weighted context-dependent rewrite rule. Finally, "l1 " and "l2 " finite-state transducers are generated from the left context portion λ of the weighted context-dependent rewrite rule. The "r" and "f" finite-state transducer generators of the compiler and the transducer generating steps of the compiling method introduce the context marking symbols "<1 ", "<2 " and ">" in the various finite-state transducers only when and where they are needed. The right context marker symbol ">" is added to the "r" finite-state transducer only immediately before each occurrence of ρ. The left context markers "<1 " and "<2 " are added to the "f" finite-state transducer only immediately before each occurrence of φ, neglecting any occurrences of the right marker symbol ">". The "Replace", "l1 " and "l2 " finite-state transducers then appropriately remove the right and left context markers when replacing φ with ψ, and whether λ precedes φ in the input string.
193 Citations
28 Claims
-
1. A method for compiling a context dependent rewrite rule having a right context portion, a left context portion, a replaced portion and a replacement portion into a finite-state transducer capable of rewriting an input string into an output string based on the context dependent rewrite rule, comprising:
-
forming a right context marking finite-state transducer based on the right context portion, the right context marking finite-state transducer capable of inserting a right context marker into the input string before each occurrence of the right context portion in the input string; forming a replaced portion marking finite-state transducer based on the replaced portion and the right context marker, the replaced portion marking finite-state transducer capable of inserting a pair of left context markers into the input string before an occurrence in the input string of the replaced portion when that occurrence of the replaced portion is followed by the right context marker; forming a replace finite-state transducer based on the replaced portion, the right context marker and the left context markers, the replace finite-state transducer capable of replacing an occurrence in the input string of the replaced portion with the replacement portion when that occurrence of the replaced portion is preceded by the left context markers and is followed by the right context marker; forming a left context presence verifying finite-state transducer based on the left context portion, the left context presence verifying finite-state transducer capable of accepting the input string containing the replacement portion in place of the replaced portion when a first left marker symbol is preceded by the left context portion in the input string; and forming a left context absence verifying finite-state transducer based on the left context portion, the left context absence verifying finite-state transducer capable of rejecting the input string containing the replacement portion in place of the replaced portion when a second left marker symbol is not preceded by the left context portion in the input string. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
-
-
15. A compiler that generates a set of finite-state transducers from an input context dependent rewrite rule having a right context portion, a left context portion, a replaced portion and a replacement portion, the set of finite-state transducers capable of rewriting an input string into an output string based on the context dependent rewrite rule, the compiler comprising:
-
a right context marking finite-state transducer generator that converts the right context portion into a right context marking finite-state transducer capable of inserting a right context marker into the input string before each occurrence of the right context portion in the input string; a replaced portion marking finite-state transducer generator that converts the replaced portion and the right context marker into a replaced portion marking finite-state transducer capable of inserting a pair of left context markers into the input string before an occurrence in the input string of the replaced portion when that occurrence of the replaced portion is followed by the right context marker; a replace finite-state transducer generator that converts the replaced portion, the right context marker and the left context markers into a replace finite-state transducer capable of replacing an occurrence in the input string of the replaced portion with the replacement portion when that occurrence of the replaced portion is preceded by the left context markers and is followed by the right context marker; a left context presence verifying finite-state transducer generator that converts the left context portion and a first left marker symbol into a left context presence verifying finite-state transducer capable of accepting the input string containing the replacement portion in place of the replaced portion when the first left marker symbol is preceded by the left context portion in the input string; and a left context absence verifying finite-state transducer generator that converts the left context portion and a second left marker symbol into a left context absence verifying finite-state transducer capable of rejecting the input string containing the replacement portion in place of the replaced portion when the second left marker symbol is not preceded by the left context portion in the input string. - View Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28)
-
Specification