Two-engine speech recognition
Abstract
A speech recognition system comprises exactly two automated speech recognition (ASR) engines connected to receive the same inputs. Each engine produces a recognition output, a hypothesis. The system implements one or both of two methods for combining the outputs of the two engines. In one method, a confusion matrix statistically generated for each speech recognition engine is converted into an alternatives matrix in which every column is ordered from highest to lowest probability. A program loop is set up in which the recognition outputs of the speech recognition engines are cross-compared with the alternatives matrices. If the output from the first ASR engine matches an alternative, its output is adopted as the final output. If the vectors provided by the alternatives matrices are exhausted without finding a match, the output from the first speech recognition engine is adopted as the final output. In a second method, the confusion matrix for each ASR engine is converted into a Bayesian probability matrix.
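The conversion from a confusion matrix to an alternatives matrix described in the first method can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the layout `confusion[truth][hyp]` (a count of how often ground truth `truth` was recognized as `hyp`), the function name `alternatives_matrix`, and the sample vocabulary are all hypothetical.

```python
from collections import defaultdict


def alternatives_matrix(confusion):
    """Map each hypothesis to its ground-truth alternatives, most probable first.

    Assumed layout: confusion[truth][hyp] is a count of utterances whose
    ground truth was `truth` and which the engine recognized as `hyp`.
    """
    # Gather each hypothesis column as (truth, count) pairs.
    columns = defaultdict(list)
    for truth, row in confusion.items():
        for hyp, count in row.items():
            if count > 0:  # entries never observed are not alternatives
                columns[hyp].append((truth, count))
    # Within one column the denominator is shared, so ordering by count
    # is the same as ordering by probability of the ground truth.
    return {
        hyp: [truth for truth, _ in sorted(entries, key=lambda e: -e[1])]
        for hyp, entries in columns.items()
    }


# Hypothetical confusion counts for one engine over a three-word vocabulary.
confusion_asr1 = {
    "yes": {"yes": 90, "no": 2, "guess": 8},
    "no": {"yes": 5, "no": 94, "guess": 1},
    "guess": {"yes": 10, "no": 0, "guess": 90},
}
alts1 = alternatives_matrix(confusion_asr1)
```

Each resulting vector is indexed by a hypothesis and lists the ground truths that hypothesis might actually stand for, in descending order of likelihood.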
7 Claims
1. A method for using exactly two speech recognition engines in automated speech recognition applications, the method comprising:
deriving first and second alternatives matrices from respective first and second confusion matrices that are associated with first and second automated speech recognition engines, wherein said alternatives matrices each include a set of vectors for each possible hypothesis output of said automated speech recognition engines in which the ground truth entries are uniformly ordered by probability;
cross checking each hypothesis output of said automated speech recognition engines with a first pair of said ground truth entries (alt1 and alt2) in said vectors, firstly to find a match with a hypothesis output (hyp1) of said first automated speech recognition engine, and secondly to find a match with a hypothesis output (hyp2) of said second automated speech recognition engine;
cross checking each hypothesis output of said automated speech recognition engines with said ground truth entries in said vectors first to find a match with a hypothesis output (hyp1) of said first automated speech recognition engine, then if none to find a match with a hypothesis output (hyp2) of said second automated speech recognition engine;
incrementing and cross checking each hypothesis output of said automated speech recognition engines with a next pair of said ground truth entries (alt1 and alt2) in said vectors first to find a match with a hypothesis output (hyp1) of said first automated speech recognition engine, then to find a match with a hypothesis output (hyp2) of said second automated speech recognition engine;
setting an output to equal one of said hypothesis outputs of said automated speech recognition engines if a corresponding match was found in the steps of cross checking and incrementing; and
adopting said hypothesis output of said first automated speech recognition engine if neither of the steps of cross checking and incrementing produce a match.
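The cross-checking loop of claim 1 might look roughly like the sketch below. It is one reading of the claim, not a definitive implementation: it assumes, following the cross-comparison spelled out in claim 4, that hyp1 is compared against the second engine's alternatives and hyp2 against the first engine's. The function name `combine` and the dictionary layout of the alternatives matrices are hypothetical.

```python
def combine(hyp1, hyp2, alts1, alts2):
    """Return the final output for one utterance.

    Assumed layout: alts1[hyp1] and alts2[hyp2] are the ordered alternatives
    vectors selected by each engine's own hypothesis.
    """
    vec1 = alts1.get(hyp1, [])
    vec2 = alts2.get(hyp2, [])
    # Step through successive (alt1, alt2) pairs, most probable first.
    for i in range(max(len(vec1), len(vec2))):
        alt1 = vec1[i] if i < len(vec1) else None
        alt2 = vec2[i] if i < len(vec2) else None
        # First look for a match with the first engine's hypothesis ...
        if alt2 == hyp1:
            return hyp1
        # ... then for a match with the second engine's hypothesis.
        if alt1 == hyp2:
            return hyp2
    # Vectors exhausted without a match: adopt the first engine's output.
    return hyp1


# Hypothetical alternatives matrices for two engines.
alts1 = {"yes": ["yes", "guess", "no"], "no": ["no", "yes"]}
alts2 = {"guess": ["guess", "no"], "yes": ["yes"]}

picked_second = combine("yes", "guess", alts1, alts2)
picked_first = combine("no", "guess", alts1, alts2)
fallback = combine("maybe", "guess", alts1, alts2)
```

In the first call the second pair of alternatives yields a match with hyp2, so the second engine's output is adopted. In the last call no entry matches either hypothesis, so the first engine's output is the default.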
4. A speech recognition system with independent ordering of hypothesis alternatives, comprising:
an input for receiving utterances;
a first automated speech recognition (ASR1) engine connected to receive said utterances from the input and to output a respective first hypothesis (hyp1);
a second automated speech recognition (ASR2) engine connected to receive said utterances from the input and to output a respective second hypothesis (hyp2);
a first alternative matrix and lookup connected to receive said hyp1 as a first lookup table index and for providing a first alternatives vector;
a second alternative matrix and lookup connected to receive said hyp2 as a second lookup table index and for providing a second alternatives vector; and
an alternatives search processor for comparing said hyp1 to an ordered sequence of individual values in said second alternatives vector, and for comparing said hyp2 to a matching ordered sequence of individual values in said first alternatives vector, and for expressing a preference of hyp1 or hyp2 as a correct machine interpretation of said utterances.
6. A method for using exactly two speech recognition engines in automated speech recognition applications, the method comprising:
deriving normalized first and second Bayesian probability matrices from respective first and second confusion matrices that are associated with first and second automated speech recognition (ASR) engines, wherein said normalized Bayesian probability matrices each include a set of vectors for each possible hypothesis output of said automated speech recognition engines;
lookup indexing each Bayesian matrix corresponding to each ASR to select a respective pair of column vectors;
merging said respective pair of column vectors into a single probability vector;
sorting said single probability vector according to the Bayesian probabilities to produce a sorted Bayesian probability vector;
program looping through said sorted Bayesian probability vector to find a match with either of said ASR engines; and
setting an output to equal one of said hypothesis outputs of said ASR engines if a corresponding match was found in the step program looping; and
otherwise, adopting said hypothesis output of said first automated speech recognition engine and setting said output to that if the step program looping does not produce a match.
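The steps of claim 6 can be illustrated with the following sketch. It rests on several assumptions that the claim text does not settle: the confusion matrices are laid out as `confusion[truth][hyp]` counts, normalization is done per hypothesis column so that each column approximates P(truth | hyp), and merging means concatenating the two selected columns. The names `bayesian_matrix` and `combine_bayesian` are hypothetical.

```python
def bayesian_matrix(confusion):
    """Column-normalize counts so that each hypothesis column sums to 1.

    Assumed layout: confusion[truth][hyp] is a count. The result is indexed
    as result[hyp][truth] and approximates P(truth | hyp).
    """
    totals = {}
    for row in confusion.values():
        for hyp, count in row.items():
            totals[hyp] = totals.get(hyp, 0) + count
    result = {}
    for truth, row in confusion.items():
        for hyp, count in row.items():
            if totals[hyp] > 0:
                result.setdefault(hyp, {})[truth] = count / totals[hyp]
    return result


def combine_bayesian(hyp1, hyp2, bayes1, bayes2):
    """Select, merge, sort, and loop, defaulting to the first engine."""
    # Lookup: each engine's hypothesis selects one column of its own matrix.
    column1 = bayes1.get(hyp1, {})
    column2 = bayes2.get(hyp2, {})
    # Merge both columns into a single list of (truth, probability) entries.
    merged = list(column1.items()) + list(column2.items())
    # Sort by probability, highest first.
    merged.sort(key=lambda entry: entry[1], reverse=True)
    # Loop until an entry matches either engine's hypothesis.
    for truth, _prob in merged:
        if truth == hyp1:
            return hyp1
        if truth == hyp2:
            return hyp2
    # No match found: adopt the first engine's hypothesis.
    return hyp1


# Hypothetical confusion counts for two engines over a two-word vocabulary.
confusion1 = {"yes": {"yes": 50, "guess": 0}, "guess": {"yes": 50, "guess": 100}}
confusion2 = {"yes": {"yes": 100, "guess": 10}, "guess": {"yes": 0, "guess": 90}}
bayes1 = bayesian_matrix(confusion1)
bayes2 = bayesian_matrix(confusion2)

result = combine_bayesian("yes", "guess", bayes1, bayes2)
```

Here the first engine's "yes" is right only half the time in the sample counts, while the second engine's "guess" is right nine times in ten, so the sorted vector reaches "guess" first and the second engine's hypothesis is chosen.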
7. A speech recognition system, comprising:
a first and a second automated speech recognition (ASR) engine connected in parallel to receive a speech input and each having respective independent hypotheses outputs;
a normalizing engine providing for normalized first and second Bayesian probability matrices from respective first and second confusion matrices that are associated with first and second automated speech recognition (ASR) engines, wherein said normalized Bayesian probability matrices each include a set of vectors for each possible hypothesis output of said automated speech recognition engines;
a lookup processor providing for an indexing of each Bayesian matrix corresponding to each ASR to select a respective pair of column vectors;
a combining mechanism providing for a merger of said respective pair of column vectors into a single probability vector;
a sorting mechanism for ordering said single probability vector according to Bayesian probabilities to produce a sorted Bayesian probability vector;
a computer mechanism for program looping through said sorted Bayesian probability vector to find a match with either of said ASR engine hypotheses outputs; and
an output for signaling one of said hypothesis outputs of said ASR engines as a most probable if a corresponding match was found in the step program looping; and
a default mechanism for adopting the hypothesis output of the first ASR engine and setting the output to that if said computer mechanism for program looping does not produce a match.
Specification