On-demand language processing system and method
Abstract
A language recognition methodology is provided whereby any finite-state model of context may be used in a very general class of decoding cascades, without requiring specialized decoders or full network expansion. The methodology includes two fundamental improvements: (1) a simple generalization of existing network models, namely weighted finite-state transducers, and (2) a novel on-demand execution technique for network combination. With the methodology of the invention, one or more of the networks in the cascade is formulated as a finite state transducer and is composed, for a selected portion of the network, with the next successively higher level of the network cascade to prescribe a mapped portion of that next successively higher level corresponding to the portion of the network cascade selected to be expanded.
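As a rough illustration of the on-demand idea described in the abstract, the following Python sketch searches a network whose states are produced only when the search asks for them. The names `DigitLoop` and `best_path` are hypothetical and do not come from the patent, and the sketch assumes non-negative weights and no empty (epsilon) labels.

```python
import heapq


class DigitLoop:
    """Hypothetical network with 10**6 implicit states; none are stored in advance."""
    start = 0
    size = 10 ** 6

    def final(self, state):
        # Assumption for this sketch: every state is accepting.
        return True

    def arcs(self, state):
        # Arcs are (input label, output label, weight, next state), built on request.
        return [(str(d), f"w{d}", float(d), (state * 10 + d) % self.size)
                for d in range(10)]


def best_path(machine, inputs):
    """Uniform-cost search that expands only the states it reaches.

    Returns (weight, outputs, expanded_states), or None if no path accepts.
    """
    counter = 0  # tie-breaker so heap entries never compare states
    heap = [(0.0, 0, counter, machine.start, ())]
    seen, expanded = set(), set()
    while heap:
        weight, pos, _, state, outputs = heapq.heappop(heap)
        if (pos, state) in seen:
            continue
        seen.add((pos, state))
        if pos == len(inputs):
            if machine.final(state):
                return weight, list(outputs), expanded
            continue
        expanded.add(state)  # this is the only place arcs are requested
        for ilabel, olabel, w, nxt in machine.arcs(state):
            if ilabel == inputs[pos]:
                counter += 1
                heapq.heappush(
                    heap, (weight + w, pos + 1, counter, nxt, outputs + (olabel,)))
    return None


result = best_path(DigitLoop(), ["3", "1"])
```

Here only two of the million implicit states are ever expanded, which is the practical point of avoiding full network expansion.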
21 Claims
1. In a speech processing system wherein predetermined linguistic information is represented in a hierarchal cascade of network models and a linkage between successive levels of said network models is defined by a mapping function, and wherein at least one path within each said network model is defined by connections among a plurality of predetermined states, a method for input speech evaluation by on-demand expansion of a selected portion of at least one of said network models comprising the steps of:
formulating said at least one of said network models as a finite state transducer, said transducer representing said mapping function between said at least one network model and an adjacent one of said cascaded network models;
selecting a state in a one of said at least one transducers having a correspondence to said selected portion of said at least one network model;
composing said one of said at least one transducers with a next successively higher level of said network models to prescribe a mapped portion of said next successively higher level corresponding to said selected portion of said at least one network model;
iteratively repeating said selecting step and said composing step for each successively higher level network model representing said selected portion; and
evaluating an input speech signal against a candidate speech segment provided by one of said network models.
C.start := returns A.start paired with B.start;
C.final((s1,s2)) := for state s1,s2, returns A.final(s1) ∧ B.final(s2);
C.arcs((s1,s2)) := for state s1,s2, returns Merge(A.arcs(s1), B.arcs(s2));
where merged arcs are defined as:
(label 1, label 3, x+y, (next state 1, next state 2)) ∈ Merge(A.arcs(s1), B.arcs(s2)) iff (label 1, label 2, x, next state 1) ∈ A.arcs(s1) and (label 2, label 3, y, next state 2) ∈ B.arcs(s2);
where, using M to represent any of generalized state machines C, A or B:
M.start returns, on request, the start state for machine M;
M.final(state) represents probability of accepting at a given state; and
M.arcs(state) returns, for a given state, transitions (a1, a2, . . . aN) leaving the given state, where ai=(input label, output label, weight & next state).
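The generalized state machine defined above can be sketched in Python roughly as follows. This is an illustrative reading, not the patented implementation: `TableMachine` and `LazyComposition` are hypothetical names, `final` is simplified to a boolean so that the conjunction can be written with `and`, and empty (epsilon) labels are not handled.

```python
class TableMachine:
    """Hypothetical explicit machine: a start state, a set of final states, an arc table."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self._finals = finals  # set of accepting states
        self._arcs = arcs      # state -> list of (input, output, weight, next state)

    def final(self, state):
        return state in self._finals

    def arcs(self, state):
        return self._arcs.get(state, [])


class LazyComposition:
    """C = A composed with B; nothing is computed until a state is requested."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.start = (a.start, b.start)  # A.start paired with B.start

    def final(self, state):
        s1, s2 = state
        return self.a.final(s1) and self.b.final(s2)

    def arcs(self, state):
        s1, s2 = state
        merged = []
        for label1, label2, x, next1 in self.a.arcs(s1):
            for mid_label, label3, y, next2 in self.b.arcs(s2):
                if label2 == mid_label:  # output of A must match input of B
                    merged.append((label1, label3, x + y, (next1, next2)))
        return merged


# Example: A maps "a" to "b"; B maps "b" to "c" and "x" to "y".
a = TableMachine(0, {1}, {0: [("a", "b", 1.0, 1)]})
b = TableMachine(0, {1}, {0: [("b", "c", 0.5, 1), ("x", "y", 2.0, 1)]})
c = LazyComposition(a, b)
first_arcs = c.arcs(c.start)  # only the pair state (0, 0) is expanded here
```

Because `arcs` works from a single pair of states, a decoder can walk the composed network without ever building the full product of A and B.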
5. The method for on-demand expansion of claim 1 further including a step of storing at least a part of said on-demand expansion of said selected portion of said network cascade in a storage cache.
6. The method for on-demand expansion of claim 5 wherein said storage cache is implemented as a generalized state machine of the form:
N.start := M.start;
N.final(state) := M.final(state);
N.arcs(state) := M.arcs(state);
where M represents an input machine and N represents said cache machine;
M.start returns, on request, the start state for machine M;
M.final(state) represents probability of accepting at a given state; and
M.arcs(state) returns, for a given state, transitions (a1, a2, . . . aN) leaving the given state, where ai=(input label, output label, weight & next state); and
corresponding functions of generalized state machine N are equivalently defined.
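One way to read the cache machine of claim 6 is as a memoizing wrapper that exposes the same three functions as its input machine. The sketch below assumes that reading; `CountingMachine` and `CacheMachine` are hypothetical names, and no eviction policy is modeled.

```python
class CountingMachine:
    """Hypothetical input machine M that counts how often its arcs are computed."""

    def __init__(self):
        self.start = 0
        self.calls = 0

    def final(self, state):
        return state == 2

    def arcs(self, state):
        self.calls += 1
        if state < 2:
            return [("a", "b", 1.0, state + 1)]
        return []


class CacheMachine:
    """Cache machine N: same interface as M, with results stored after first use."""

    def __init__(self, m):
        self._m = m
        self._finals = {}
        self._arcs = {}

    @property
    def start(self):
        return self._m.start  # N.start := M.start

    def final(self, state):
        if state not in self._finals:
            self._finals[state] = self._m.final(state)
        return self._finals[state]

    def arcs(self, state):
        if state not in self._arcs:
            self._arcs[state] = self._m.arcs(state)
        return self._arcs[state]


m = CountingMachine()
n = CacheMachine(m)
n.arcs(0)
n.arcs(0)  # second request is served from the cache; M is not consulted again
```

Since N has the same interface as M, it can stand in for M anywhere in a cascade, including as an operand of a further composition.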
7. The method for on-demand expansion of claim 1 wherein said selected portion of said network cascade corresponds to an unknown speech fragment of interest.
8. The method for on-demand expansion of claim 1 further including the step of inserting a filter in said composition of said one of said at least one transducers with said next successively higher level, wherein said filter operates to remove redundant paths occurring between a pair of states in said composition.
9. The method for on-demand expansion of claim 8 wherein said filter is manifested as a finite state transducer.
10. A speech processing system operable to provide on-demand expansion of a selected portion of a linguistic model, wherein predetermined linguistic information established for training said system is represented in a hierarchal cascade of network models and a linkage between successive levels of said network models is defined by a mapping function, and wherein at least one path within each said network model is defined by connections among a plurality of predetermined states, said system comprising:
decoding means for evaluating input speech signals against a candidate speech segment provided by one of said network models;
a finite state transducer operable to represent said mapping function between at least one of said network models and an adjacent one of said cascaded network models;
means for selecting a state in said transducer having a correspondence to said selected portion of said network model;
means for composing said transducer with a next successively higher level of said network model to prescribe a mapped portion of said next successively higher level corresponding to said selected portion of said network model; and
means for iteratively applying said transducer and said composing means to successively higher level network models representing said selected portion.
C.start := returns A.start paired with B.start;
C.final((s1,s2)) := for state s1,s2, returns A.final(s1) ∧ B.final(s2);
C.arcs((s1,s2)) := for state s1,s2, returns Merge(A.arcs(s1), B.arcs(s2));
where merged arcs are defined as:
(label 1, label 3, x+y, (next state 1, next state 2)) ∈ Merge(A.arcs(s1), B.arcs(s2)) iff (label 1, label 2, x, next state 1) ∈ A.arcs(s1) and (label 2, label 3, y, next state 2) ∈ B.arcs(s2);
where, using M to represent any of generalized state machines C, A or B:
M.start returns, on request, the start state for machine M;
M.final(state) represents probability of accepting at a given state; and
M.arcs(state) returns, for a given state, transitions (a1, a2, . . . aN) leaving the given state, where ai=(input label, output label, weight & next state).
14. The speech processing system of claim 10 further including a storage cache for storing at least a part of said on-demand expansion of said selected portion of said network cascade.
15. The speech processing system of claim 14 wherein said storage cache is implemented as a generalized state machine of the form:
N.start := M.start;
N.final(state) := M.final(state);
N.arcs(state) := M.arcs(state);
where M represents an input machine and N represents said cache machine;
M.start returns, on request, the start state for machine M;
M.final(state) represents probability of accepting at a given state; and
M.arcs(state) returns, for a given state, transitions (a1, a2, . . . aN) leaving the given state, where ai=(input label, output label, weight & next state); and
corresponding functions of generalized state machine N are equivalently defined.
16. The speech processing system of claim 10 further characterized in that said selected portion of said network cascade corresponds to an unknown speech fragment of interest.
17. The speech processing system of claim 10 further including a filter means operative to remove redundant paths occurring between a pair of states in a composition created by said means for composing.
18. In a system for recognition of an unknown pattern in an input signal, said pattern being indicative of underlying information content, wherein data applied to train said system are represented in a hierarchal cascade of network models and a linkage between successive levels of said network models is defined by a mapping function, and wherein at least one path within each said network model is defined by connections among a plurality of predetermined states, a method for input pattern evaluation by on-demand expansion of a selected portion of at least one of said network models comprising the steps of:
formulating said at least one of said network models as a finite state transducer, said transducer representing said mapping function between said at least one network model and an adjacent one of said cascaded network models;
selecting a state in a one of said at least one transducers having a correspondence to said selected portion of said at least one network model;
composing said one of said at least one transducers with a next successively higher level of said network models to prescribe a mapped portion of said next successively higher level corresponding to said selected portion of said at least one network model;
iteratively repeating said selecting step and said composing step for each successively higher level network model representing said selected portion; and
evaluating an input signal pattern against a candidate pattern segment provided by one of said network models.
20. In a system for processing linguistic information in an input signal, wherein predetermined linguistic information established for training said system is represented in a hierarchal cascade of network models and a linkage between successive levels of said network models is defined by a mapping function, and wherein at least one path within each said network model is defined by connections among a plurality of predetermined states, a method for evaluation of input linguistic information by on-demand expansion of a selected portion of at least one of said network models comprising the steps of:
formulating said at least one of said network models as a finite state transducer, said transducer representing said mapping function between said at least one network model and an adjacent one of said cascaded network models;
selecting a state in a one of said at least one transducers having a correspondence to said selected portion of said at least one network model;
composing said one of said at least one transducers with a next successively higher level of said network models to prescribe a mapped portion of said next successively higher level corresponding to said selected portion of said at least one network model;
iteratively repeating said selecting step and said composing step for each successively higher level network model representing said selected portion; and
evaluating an input linguistic-information signal against a candidate linguistic-information segment provided by one of said network models.
21. In a system for processing linguistic information, wherein predetermined linguistic information established for training said system is represented in a hierarchal cascade of network models and a linkage between successive levels of said network models is defined by a mapping function, and wherein at least one path within each said network model is defined by connections among a plurality of predetermined states, a sub-system for providing on-demand expansion of a selected portion of at least one of said network models, comprising:
decoding means for evaluating an input linguistic-information signal against a candidate linguistic-information segment provided by one of said network models;
means for formulating said at least one of said network models as a finite state transducer, said transducer representing said mapping function between said at least one network model and an adjacent one of said cascaded network models;
means for selecting a state in a one of said at least one transducers having a correspondence to said selected portion of said at least one network model;
means for composing said one of said at least one transducers with a next successively higher level of said network models to prescribe a mapped portion of said next successively higher level corresponding to said selected portion of said at least one network model; and
means for iteratively applying said means for selecting and said means for composing for successively higher level network models representing said selected portion.
Specification