Method and apparatus for reducing the intermediate alphabet occurring between cascaded finite state transducers
First Claim
2-1. The method of claim 1, wherein the first factor and the second factor are derived by factoring an input finite-state transducer (FST).
5 Assignments
0 Petitions
Accused Products
Abstract
A method reduces the number of diacritics and other intermediate symbols occurring between two factors that result from any factorization such as extraction of infinite ambiguity, factorization of finitely ambiguous finite-state transducer, or bimachine factorization. The method a posteriori removes all redundant intermediate symbols. The method can be used with any two finite-state transducers (FSTs) that operate in a cascade. With longer cascades, the method can be applied pair-wise to all FSTs, preferably starting from the last pair.
13 Citations
17 Claims
-
2-1. The method of claim 1, wherein the first factor and the second factor are derived by factoring an input finite-state transducer (FST).
-
4. A method for removing redundant intermediate symbols from a first finite-state transducer (FST) having an output side and a second finite-state transducer (FST) having an input side, comprising the steps of:
-
identifying a plurality of non-overlapping equivalence classes of input symbols that are diacritics on the input side of the second FST;
representing each of the plurality of non-overlapping equivalence classes with a unique symbol; and
replacing on the output side of the first FST and on the input side of the second FST each occurrence of a diacritic that appears in one of the plurality of non-overlapping equivalence classes with the unique symbol that represents the corresponding equivalence class. - View Dependent Claims (5, 6, 7, 8, 9, 10)
-
-
11. An apparatus for removing redundant intermediate symbols from a first FST having an output side and a second FST having an input side, comprising:
-
means for identifying a plurality of non-overlapping equivalence classes of input symbols that are diacritics on the input side of the second FST;
means for representing each of the plurality of non-overlapping equivalence classes with a unique symbol; and
means for replacing on the output side of the first FST and on the input side of the second FST each occurrence of a diacritic that appears in one of the plurality of non-overlapping equivalence classes with the unique symbol that represents the corresponding equivalence class. - View Dependent Claims (12, 13, 14, 15, 16, 17)
-
Specification