Speech recognition apparatus and methods
Abstract
Speech recognition is carried out by performing a first analysis of a speech signal using a Hidden Semi Markov Model and an asymmetric time warping algorithm. A second analysis is also performed using Multi-Layer Perceptron techniques in conjunction with a neural net. The second analysis uses the first analysis to identify word boundaries. Where the first analysis provides an indication of the word spoken above a certain level of confidence, an output representative of the word spoken may be generated solely in response to the first analysis; the second analysis is utilized when the level of confidence falls below that level. The output controls a function of an aircraft and provides feedback to the speaker of the words spoken.
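The confidence-gated hand-off described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `(word, confidence)` return shape, and the 0.8 threshold are all assumptions made for illustration, since the abstract only speaks of "a certain level of confidence".

```python
from typing import Callable, Sequence, Tuple


def recognize_word(
    segment: Sequence[float],
    first_analysis: Callable[[Sequence[float]], Tuple[str, float]],
    second_analysis: Callable[[Sequence[float]], str],
    threshold: float = 0.8,  # hypothetical value; the abstract gives no number
) -> Tuple[str, str]:
    """Return (word, source), where source names the analysis that decided."""
    word, confidence = first_analysis(segment)
    if confidence >= threshold:
        # Confident enough: the output comes from the first analysis alone.
        return word, "first"
    # Otherwise defer to the second (neural net) analysis.
    return second_analysis(segment), "second"


# Stand-in recognizers, purely for demonstration.
def toy_first(segment: Sequence[float]) -> Tuple[str, float]:
    # Pretend that longer segments are recognized with more confidence.
    return "climb", min(1.0, len(segment) / 10)


def toy_second(segment: Sequence[float]) -> str:
    return "descend"
```

With these stand-ins, a ten-sample segment is decided by the first analysis and a three-sample segment is passed on to the second.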
9 Claims
1. A method of word recognition in continuous speech comprising the steps of:
deriving a speech signal;
initially performing a first analysis of the speech signal by a Markov or other technique not involving neural net techniques to identify boundaries between different words and to separate the entire speech signal into discrete words;
providing a first signal in accordance with the first analysis;
comparing the first signal from the first analysis with a stored vocabulary of a multiplicity of words to provide a second signal that is a first indication of the words spoken;
supplying the entire first signal provided by the first analysis to means for performing a second analysis different from the first analysis and utilizing neural net techniques on the entire words without any prior restriction of word candidates by the first analysis to produce a third signal representative of the words spoken; and
providing an output signal representative of the words spoken from at least the third signal produced by the second analysis.
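As a rough illustration of the steps in claim 1, the sketch below takes word boundaries as given (standing in for the first analysis) and has a tiny multi-layer perceptron score every vocabulary entry for each whole word, with no pruning of candidates. The weights, the two-word vocabulary, the nearest-index time normalization, and all function names are hypothetical; the claim itself specifies none of them. Segments are assumed to be non-empty.

```python
import math
from typing import List, Sequence, Tuple


def resample(segment: Sequence[float], size: int) -> List[float]:
    """Nearest-index time normalization so every word yields `size` inputs."""
    n = len(segment)
    return [segment[(k * n) // size] for k in range(size)]


def mlp_scores(
    x: Sequence[float],
    w_hidden: Sequence[Sequence[float]],
    w_out: Sequence[Sequence[float]],
) -> List[float]:
    """One hidden tanh layer followed by a linear output layer."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


def second_analysis(
    signal: Sequence[float],
    boundaries: Sequence[Tuple[int, int]],  # (start, end) pairs from the first analysis
    vocabulary: Sequence[str],
    w_hidden: Sequence[Sequence[float]],
    w_out: Sequence[Sequence[float]],
    size: int = 4,
) -> List[str]:
    words = []
    for start, end in boundaries:
        scores = mlp_scores(resample(signal[start:end], size), w_hidden, w_out)
        # Every vocabulary entry is scored; no candidate is excluded beforehand.
        best = max(range(len(vocabulary)), key=scores.__getitem__)
        words.append(vocabulary[best])
    return words


# Toy data: one hidden unit that sums its inputs, two output words.
VOCAB = ["up", "down"]
W_HIDDEN = [[1.0, 1.0, 1.0, 1.0]]
W_OUT = [[1.0], [-1.0]]
SIGNAL = [0.5, 0.6, 0.7, 0.5, -0.4, -0.6, -0.5]
BOUNDS = [(0, 4), (4, 7)]
```

Calling `second_analysis(SIGNAL, BOUNDS, VOCAB, W_HIDDEN, W_OUT)` labels the positive-valued segment "up" and the negative-valued one "down".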
3. A method according to claim 2, wherein the first analysis is performed using an asymmetric dynamic time warping algorithm.
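The claim does not spell out the warping algorithm. As a hedged illustration, the sketch below uses one common textbook asymmetric step pattern: each input frame is consumed exactly once, while the template index may stay put, advance by one, or skip ahead by two. The function name, the absolute-difference local distance, and the normalization by input length are assumptions, not details from the patent.

```python
import math
from typing import Sequence


def asymmetric_dtw(query: Sequence[float], template: Sequence[float]) -> float:
    """Accumulated distance under an asymmetric step pattern, divided by len(query)."""
    n, m = len(query), len(template)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j]: best cost of aligning the query frames seen so far, ending at template[j].
    prev = [math.inf] * m
    prev[0] = abs(query[0] - template[0])  # the path must start at the first frames

    for i in range(1, n):
        cur = [math.inf] * m
        for j in range(m):
            # Predecessors (i-1, j), (i-1, j-1), (i-1, j-2): the query always
            # advances by one frame, the template by zero, one, or two.
            best = prev[j]
            if j >= 1:
                best = min(best, prev[j - 1])
            if j >= 2:
                best = min(best, prev[j - 2])
            if best < math.inf:
                cur[j] = best + abs(query[i] - template[j])
        prev = cur

    # The path must end at the last template frame.
    return prev[m - 1] / n
```

The asymmetry shows in the results: a query that is a stretched copy of the template aligns at zero cost, but a template more than roughly twice as long as the query cannot be traversed at all, and the function returns infinity.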
7. Speech recognition apparatus for recognizing words in continuous speech comprising:
store means containing speech information about a vocabulary of words that can be recognized;
means for deriving a speech signal;
first analysis means for performing a first analysis of the entire speech signal by a Markov or other technique not involving neural net techniques, said first analysis identifying boundaries between all the different words in said continuous speech and providing a first signal in accordance therewith;
means for comparing the first signal provided by the first analysis with the stored vocabulary to provide a second signal that is a first indication of the words spoken;
second analysis means operative subsequent to the performance of said first analysis for performing a second analysis of the speech signal;
means for supplying the entire first signal provided by said first analysis means to said second analysis means, said second analysis means utilizing neural net techniques and word boundary identification from said first analysis on the entire words without any prior restriction of word candidates by the first analysis;
means for providing from the second analysis a second indication of the words spoken; and
means for providing an output signal representative of the words spoken in response to at least the second indication.
Specification