Speech recognition system and method using a hidden Markov model adapted to recognize a number of words and trained to recognize a greater number of phonetically dissimilar words.
Abstract
A speech recognition system for discrete words uses a single Hidden Markov Model (HMM), which is nominally adapted to recognise N different isolated words, but which is trained to recognise M different words, where M>N. This is achieved by providing M sets of audio recordings, each set comprising multiple recordings of a respective one of said M words being spoken. Only N different labels are assigned to the M sets of audio recordings, so that at least one of the N labels has two or more sets of audio recordings assigned thereto. These two or more sets of audio recordings correspond to phonetically dissimilar words. The HMM is then trained by inputting each set of audio recordings and its assigned label. The HMM can effectively compensate for the phonetic variations between the different words assigned the same label, thereby avoiding the need to utilise a larger model (i.e., to use M labels).
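The labelling step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_training_pairs`, the word list, the file names, and the choice of which words share a label are all hypothetical and do not come from the patent. The caller is assumed to have already chosen phonetically dissimilar words for any shared label.

```python
from collections import defaultdict


def build_training_pairs(recordings_by_word, label_of_word):
    """Attach one of N labels to every recording of M words (M > N)."""
    if set(recordings_by_word) != set(label_of_word):
        raise ValueError("every word needs exactly one label")
    m = len(recordings_by_word)            # M: number of distinct words
    n = len(set(label_of_word.values()))   # N: number of distinct labels
    if m <= n:
        raise ValueError("expected more words than labels (M > N)")

    # Each recording is paired with its word's label, not with the word itself.
    pairs = [(rec, label_of_word[word])
             for word, recs in recordings_by_word.items()
             for rec in recs]

    # Record which words ended up sharing a label.
    words_per_label = defaultdict(list)
    for word, label in label_of_word.items():
        words_per_label[label].append(word)
    return pairs, dict(words_per_label)


# Hypothetical data: M = 4 words, N = 3 labels; "help" and "stop" share label 2.
recordings = {
    "yes": ["yes_01.wav", "yes_02.wav"],
    "no": ["no_01.wav", "no_02.wav"],
    "help": ["help_01.wav", "help_02.wav"],
    "stop": ["stop_01.wav", "stop_02.wav"],
}
labels = {"yes": 0, "no": 1, "help": 2, "stop": 2}

pairs, shared = build_training_pairs(recordings, labels)
print(shared[2])  # ['help', 'stop']
```

Because M > N, at least one label necessarily covers two or more words, which is the situation the abstract describes. The resulting `(recording, label)` pairs are what would be fed to the model during training.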
13 Claims
1. A speech recognition system for discrete words, comprising:
- interface means for receiving incoming voice signals;
- processing means, operatively coupled to said interface means, for processing said incoming voice signals;
- program means, responsive to said processed voice signals from said processing means, for performing speech recognition on said processed voice signals, said program means using a single Hidden Markov Model (HMM), said HMM nominally being adapted to recognise N different words, characterised in that said HMM is trained to recognise M different words, where M>N, said M words being phonetically dissimilar from one another.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer implemented method for training a speech recognition system for discrete words, comprising the steps of:
- providing a single Hidden Markov Model (HMM);
- providing M sets of audio recordings, each set including multiple recordings of a respective one of said M words being spoken;
- assigning N labels to the M sets of audio recordings, such that at least one of the N labels has two or more sets of audio recordings assigned thereto, said two or more sets of audio recordings corresponding to phonetically dissimilar words;
- inputting each of said sets of audio recordings and assigned N labels into said speech recognition system;
- determining a training path through said HMM of said system for each of said inputted audio recordings; and
- storing said determined training path for each of said audio recordings together with said N label assigned to each of said audio recordings, whereby speech recognition of a word can be performed by determining said training path most likely to output said word to be recognised and then equating said word to be recognised with said N label associated with said most likely determined path.
Dependent claims: 12, 13
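One possible reading of the recognition step in claim 11 is sketched below: each stored training path is scored against the incoming observation sequence, and the label stored with the best-scoring path is returned. This is a toy under stated assumptions, not the patented implementation: it uses a small discrete-observation HMM with made-up probabilities, scores fixed stored paths directly rather than running Viterbi decoding, and assumes paths and observations have equal length. All names (`recognise`, `path_log_likelihood`, `STORED_PATHS`, the label strings) are hypothetical.

```python
import math


def _log(p):
    """Log of a probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def path_log_likelihood(path, obs, start, trans, emit):
    """Log joint probability of one fixed state path and the observations."""
    if not path or len(path) != len(obs):
        return float("-inf")  # assumption: lengths must match to be comparable
    logp = _log(start[path[0]]) + _log(emit[path[0]][obs[0]])
    for prev, cur, symbol in zip(path, path[1:], obs[1:]):
        logp += _log(trans[prev][cur]) + _log(emit[cur][symbol])
    return logp


def recognise(obs, stored_paths, start, trans, emit):
    """Return the label stored with the path most likely to emit obs."""
    best_path, best_label = max(
        stored_paths,
        key=lambda item: path_log_likelihood(item[0], obs, start, trans, emit),
    )
    return best_label


# Made-up two-state, two-symbol model.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.8, 0.2], [0.2, 0.8]]

# Stored (path, label) pairs; two different paths share "label_0",
# mirroring two dissimilar words trained under one label.
STORED_PATHS = [
    ([0, 0, 0], "label_0"),
    ([0, 1, 1], "label_0"),
    ([1, 1, 1], "label_1"),
]

print(recognise([1, 1, 1], STORED_PATHS, START, TRANS, EMIT))  # label_1
print(recognise([0, 0, 0], STORED_PATHS, START, TRANS, EMIT))  # label_0
```

The point of the sketch is the final lookup: recognition returns a label rather than a word, so two phonetically dissimilar words trained under the same label both resolve to that label.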
Specification