Hybrid lexicon for speech recognition
Abstract
Methods and apparatus for speech recognition based on a hidden Markov model are disclosed. A disclosed method of speech recognition is based on a hidden Markov model in which words to be recognized are modeled as chains of states and trained using predefined speech data material. A known vocabulary is divided into first and second partial vocabularies, where the first partial vocabulary is trained and transcribed using a whole word model and the second partial vocabulary is trained and transcribed using a phoneme-based model, in order to obtain a mixed hidden Markov model. The transcriptions from the two models are stored in a single pronunciation lexicon, and the mixed hidden Markov model is stored in a single search space. Apparatus that also employ a hidden Markov model are disclosed.
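The single pronunciation lexicon described in the abstract can be illustrated with a minimal sketch. The function names, the three-sections-per-word default, the upper-case word identifiers, and the sample phonetic transcriptions are all assumptions made for illustration; the patent text does not prescribe a data layout.

```python
def section_units(word_id, n_sections):
    """Label the sections of a whole-word model with a word identifier and an index."""
    return [f"{word_id}_{i}" for i in range(1, n_sections + 1)]


def build_mixed_lexicon(vocabulary, whole_words, phonetic, n_sections=3):
    """Return one lexicon holding both kinds of transcription.

    Words in `whole_words` (first partial vocabulary) are transcribed as
    word-specific sections; all other words (second partial vocabulary)
    keep their phonetic transcription.
    """
    lexicon = {}
    for word in vocabulary:
        if word in whole_words:
            lexicon[word] = section_units(word.upper(), n_sections)
        else:
            lexicon[word] = list(phonetic[word])
    return lexicon


# Hypothetical example data: a numeral and a control word get whole-word models.
vocabulary = ["two", "stop", "hello", "table"]
whole_words = {"two", "stop"}
phonetic = {
    "hello": ["h", "@", "l", "oU"],
    "table": ["t", "eI", "b", "@", "l"],
}

lexicon = build_mixed_lexicon(vocabulary, whole_words, phonetic)
for word, units in lexicon.items():
    print(word, " ".join(units))
```

Because both kinds of entry are plain sequences of unit names, a decoder can read them from one lexicon without distinguishing between the two partial vocabularies.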
8 Claims
1. A method of speech recognition based on a hidden Markov model in which a word to be recognized is modeled as a chain of states and trained using predefined speech data material, the method comprising:
dividing a known vocabulary into a first partial vocabulary of words and a second partial vocabulary of other words, wherein for the first partial vocabulary, only at least one of easily interchangeable words and important words are identified and assigned to the first partial vocabulary and wherein the other words of said known vocabulary are only assigned to the second partial vocabulary and trained using a phoneme-based model;
training and transcribing the words of the first partial vocabulary using a whole word model wherein each word of the first partial vocabulary is modeled by a chain of states by dividing each word into a plurality of sections which only apply to the respective word;
transcribing the sections of each word of the first partial vocabulary with a word identifier and an index and transcribing the second partial vocabulary using the phoneme-based model, wherein the words are modeled by means of states that correspond to phonemes or parts of phonemes, in order to obtain a corresponding mixed hidden Markov model by storing the first partial vocabulary in the form of word identifiers with indices and the second partial vocabulary in the form of phonetic transcriptions in a single pronunciation lexicon; and
storing the mixed hidden Markov model in a single search space, wherein the states of the phoneme-based model correspond to phonemes or parts of phonemes and are used in a plurality of words and the states of the whole word model only apply to the respective word.
Dependent claims: 2, 3, 4
5. A speech recognizer for carrying out speech recognition based on a hidden Markov model in which a word to be recognized is modeled as a chain of states and trained using predefined speech data material, the speech recognizer comprising:
a model memory for storing the hidden Markov model;
a vocabulary memory having a first and second memory area, wherein both memory areas respectively store a first and second partial vocabulary, wherein the first partial vocabulary only comprises numerals and/or control instruction words which have been identified from a known vocabulary and wherein remaining words from said known vocabulary are only assigned to the second partial vocabulary and trained using the phoneme-based model;
a training processing unit comprising a first and second sub-processing units, wherein the training processing unit has an input connected to the vocabulary memory and is configured to implement the hidden Markov model; and
a lexicon memory containing a pronunciation lexicon that is connected to an output of the training processing unit,
wherein the first sub-processing unit communicates with the first memory area of the vocabulary memory to train and transcribe the words of the first partial vocabulary stored therein by implementing a whole word model where each word of the first partial vocabulary is modeled by a chain of states by dividing each word into a plurality of sections which only apply to the respective word and the sections of each word of the first partial vocabulary are transcribed with a word identifier and an index,
and wherein the second sub-processing unit communicates with the second memory area of the vocabulary memory to transcribe the second partial vocabulary by implementing a phoneme-based model, where the words are modeled by means of states that correspond to phonemes or parts of phonemes, to obtain a corresponding mixed hidden Markov model,
and wherein the lexicon memory stores the first partial vocabulary in the form of word identifiers with indices and the second partial vocabulary in the form of phonetic transcriptions in a single pronunciation lexicon, and stores the mixed hidden Markov model in a single search space, wherein the states of the phoneme-based model correspond to phonemes or parts of phonemes and are used in a plurality of words and the states of the whole word model only apply to the respective word.
6. A method of speech recognition based on a hidden Markov model in which a compound word to be recognized is partially modeled as a chain of states and trained using predefined speech data material, the method comprising:
dividing a known vocabulary into a first partial vocabulary of words and a second partial vocabulary of other words, wherein both the first and second partial vocabulary are related to the compound word, wherein from the known vocabulary only numerals and/or control instruction words are identified and assigned to the first partial vocabulary and wherein the other words of said known vocabulary are only assigned to the second partial vocabulary and trained using the phoneme-based model;
training and transcribing the first partial vocabulary using a whole word model for one part of the compound word where each word of the first partial vocabulary is modeled by a chain of states by dividing each word into a plurality of sections which only apply to the respective word;
transcribing the sections of each word of the first partial vocabulary with a word identifier and an index and transcribing the second partial vocabulary using phoneme-based modeling for another part of the compound word, where the words are modeled by means of states that correspond to phonemes or parts of phonemes, in order to obtain a corresponding mixed hidden Markov model by storing the first partial vocabulary in the form of word identifiers with indices and the second partial vocabulary in the form of phonetic transcriptions that are acquired from the two models in a single pronunciation lexicon;
storing the mixed hidden Markov model in a single search space where the states of the phoneme-based model correspond to phonemes or parts of phonemes and are used in a plurality of words and the states of the whole word model only apply to the respective word.
Dependent claims: 7, 8
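The state-sharing property recited in the claims above (phoneme states used in a plurality of words, whole-word states applying only to their own word) can be sketched as follows. This is an illustrative reading, not the patented implementation: the three-states-per-phoneme choice, the state naming scheme, the sample lexicon, and the "twofold" entry (one possible interpretation of a compound word with a whole-word part and a phoneme-based part) are all assumptions.

```python
STATES_PER_PHONEME = 3  # assumption: each phoneme is modeled by three HMM states

# Hypothetical mixed lexicon: section labels (word identifier + index) and phonemes.
lexicon = {
    "stop": ["STOP_1", "STOP_2", "STOP_3"],
    "hello": ["h", "@", "l", "oU"],
    "yellow": ["j", "E", "l", "oU"],
    # Compound entry: whole-word sections for one part, phonemes for the other.
    "twofold": ["TWO_1", "TWO_2", "TWO_3", "f", "oU", "l", "d"],
}
whole_word_sections = {"STOP_1", "STOP_2", "STOP_3", "TWO_1", "TWO_2", "TWO_3"}


def expand_to_states(lexicon, whole_word_sections):
    """Turn every lexicon entry into a chain of state names in one shared inventory."""
    chains = {}
    for word, units in lexicon.items():
        chain = []
        for unit in units:
            if unit in whole_word_sections:
                chain.append(unit)  # one word-specific state per section
            else:
                chain.extend(f"{unit}.{s}" for s in range(STATES_PER_PHONEME))
        chains[word] = chain
    return chains


def state_usage(chains):
    """Map each state name to the set of words whose chain contains it."""
    usage = {}
    for word, chain in chains.items():
        for state in chain:
            usage.setdefault(state, set()).add(word)
    return usage


chains = expand_to_states(lexicon, whole_word_sections)
usage = state_usage(chains)
shared = sorted(state for state, words in usage.items() if len(words) > 1)
print("states shared across words:", shared)
```

In this sketch every shared state comes from a phoneme, while each whole-word section state appears in exactly one entry, so both kinds of chain can sit side by side in the same search space.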
Specification