System and method for hybrid voice recognition
Abstract
A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.
10 Claims
1. A voice recognition system, comprising:
an acoustic processor configured to extract speech parameters from a speech segment;
a plurality of different voice recognition engines coupled to the acoustic processor, wherein each voice recognition engine is configured to produce a plurality of hypotheses and a plurality of scores, each score represents a distance from the speech segment to a corresponding hypothesis, and at least one of the voice recognition engines is a nametag engine; and
a decision logic configured to:
receive the plurality of scores from the plurality of different voice recognition engines, compute a combined score for a subset of the plurality of voice recognition engines, wherein the subset excludes the nametag engine, determine a best score for the subset and a best score for the nametag engine, compare the best score for the subset against the best score for the nametag engine.
arranging each of the plurality of scores in an array;
comparing corresponding elements in at least two arrays to create a new array with the lowest scores of the at least two arrays; and
multiplying each array, except the at least two arrays used to create the new array, by a weighting coefficient in order to generate a plurality of weighted arrays, and to combine the plurality of weighted arrays to yield a combined score.
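The score combination and comparison recited above can be illustrated with a minimal sketch. It is not the patented implementation. The function names `combine_scores` and `decide`, the use of a sum to combine the weighted arrays, the inclusion of the element-wise minimum array with unit weight, and the rule that the lower best score wins are all assumptions made for illustration; the claims only require that the scores be combined and compared. Scores are treated as distances, so lower is better.

```python
from typing import List, Sequence, Tuple


def combine_scores(
    min_pair: Tuple[Sequence[float], Sequence[float]],
    weighted: Sequence[Tuple[float, Sequence[float]]],
) -> List[float]:
    """Combine per-hypothesis score arrays from several engines.

    min_pair: two arrays merged by taking the lower score at each index.
    weighted: (weighting coefficient, score array) for every other engine.
    """
    first, second = min_pair
    # New array holding the lower (better) distance at each hypothesis index.
    combined = [min(x, y) for x, y in zip(first, second)]
    # Assumption: the weighted arrays are combined by summation and added
    # to the minimum array; the claim does not fix the combination rule.
    for weight, scores in weighted:
        combined = [c + weight * s for c, s in zip(combined, scores)]
    return combined


def decide(combined: Sequence[float], nametag_scores: Sequence[float]) -> str:
    """Compare the best combined score against the best nametag score."""
    best_subset = min(combined)
    best_nametag = min(nametag_scores)
    # Assumption: the source with the smaller distance is preferred.
    return "subset" if best_subset <= best_nametag else "nametag"


# Hypothetical scores for three hypotheses from three non-nametag engines.
combined = combine_scores(
    ([4.0, 9.0, 7.0], [6.0, 3.0, 8.0]),
    [(0.5, [2.0, 4.0, 6.0])],
)
choice = decide(combined, [6.0, 8.0])
```

With these hypothetical inputs the element-wise minimum is `[4.0, 3.0, 7.0]`, the weighted third array is `[1.0, 2.0, 3.0]`, and the combined array is `[5.0, 5.0, 10.0]`, whose best score of 5.0 is lower than the best nametag score of 6.0.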
3. The voice recognition system of claim 1, wherein the plurality of different voice recognition engines includes a speaker-independent voice recognition engine.
4. The voice recognition system of claim 3, wherein the plurality of different voice recognition engines includes a speaker-dependent voice recognition engine.
5. The voice recognition system of claim 1, wherein the plurality of different voice recognition engines includes a speaker-dependent voice recognition engine.
6. The voice recognition system of claim 1, wherein the speaker-independent voice recognition engine is a Hidden Markov Model voice recognition engine.
7. The voice recognition system of claim 1, wherein the speaker-independent voice recognition engine is a Dynamic Time Warping voice recognition engine.
8. A method of voice recognition, comprising:
extracting speech parameters from a speech segment;
producing a hypothesis and a corresponding score for each different voice recognition engine of a plurality of different voice recognition engines based on the extracted speech parameters, wherein the score represents a distance from the speech segment to the hypothesis;
computing a minimum score for each of the plurality of different voice recognition engines;
computing a combined score by weighting the minimum scores of the plurality of voice recognition engines; and
using the combined score to select an analysis method to be performed on a hypothesis corresponding to the smallest score.
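The steps of the method claim can be sketched as follows. This is an illustrative reading only, not the patented implementation. The function name `select_analysis`, the engine labels, the weights, the weighted sum, the threshold, and the two analysis-method labels `"accept"` and `"verify"` are hypothetical; the claim itself does not say which analysis methods exist or how the combined score selects between them.

```python
from typing import Dict, List, Tuple

Hypothesis = Tuple[str, float]  # (recognized text, distance score)


def select_analysis(
    engine_scores: Dict[str, List[Hypothesis]],
    weights: Dict[str, float],
    threshold: float,
) -> Tuple[str, str, float]:
    """Return (analysis method, best hypothesis, combined score)."""
    # Minimum (best) distance score, with its hypothesis, for each engine.
    best = {
        name: min(hyps, key=lambda h: h[1])
        for name, hyps in engine_scores.items()
    }
    # Combined score: weighted sum of the per-engine minimum scores.
    combined = sum(weights[name] * score for name, (_, score) in best.items())
    # Hypothesis corresponding to the smallest score across all engines.
    hypothesis, _ = min(best.values(), key=lambda h: h[1])
    # Assumption: a small combined distance means the result can be
    # accepted directly; otherwise a further verification step is chosen.
    method = "accept" if combined < threshold else "verify"
    return method, hypothesis, combined


# Hypothetical output of two engines for one speech segment.
result = select_analysis(
    {
        "hmm": [("call", 2.0), ("dial", 5.0)],
        "dtw": [("call", 4.0), ("redial", 3.0)],
    },
    weights={"hmm": 0.5, "dtw": 0.5},
    threshold=3.0,
)
```

Here the per-engine minimum scores are 2.0 and 3.0, the combined score is 2.5, and the hypothesis with the smallest score is `"call"`, so the sketch selects the `"accept"` path under the assumed threshold of 3.0.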
9. An apparatus to be used for voice recognition, comprising:
means for extracting speech parameters from a speech segment;
means for producing a hypothesis and a corresponding score for each different voice recognition engine of a plurality of different voice recognition engines based on the extracted speech parameters, wherein the score represents a distance from the speech segment to the hypothesis;
means for computing a minimum score for each of the plurality of different voice recognition engines;
means for computing a combined score by weighting the minimum scores of the plurality of voice recognition engine hypothesis; and
means for using the combined score to select an analysis method to be performed on a hypothesis corresponding to the smallest score.
10. A computer readable media embodying a method of voice recognition, the method comprising:
extracting speech parameters from a speech segment;
producing a hypothesis and a corresponding score for each different voice recognition engine of a plurality of different voice recognition engines based on the extracted speech parameters, wherein the score represents a distance from the speech segment to the hypothesis;
computing a minimum score for each of the plurality of different voice recognition engines;
computing a combined score by weighting the minimum scores of the plurality of voice recognition engines; and
using the combined score to select an analysis method to be performed on a hypothesis corresponding to the smallest score.
Specification