Hybrid lexicon for speech recognition

US 7,945,445 B1
Filed: 07/04/2001
Issued: 05/17/2011
Est. Priority Date: 07/14/2000
Status: Active Grant

Abstract

Methods and apparatus for speech recognition based on a hidden Markov model are disclosed. A disclosed method of speech recognition is based on a hidden Markov model in which words to be recognized are modeled as chains of states and trained using predefined speech data material. A known vocabulary is divided into first and second partial vocabularies, where the first partial vocabulary is trained and transcribed using a whole word model and the second partial vocabulary is trained and transcribed using a phoneme-based model, in order to obtain a mixed hidden Markov model. The transcriptions from the two models are stored in a single pronunciation lexicon, and the mixed hidden Markov model is stored in a single search space. Apparatus are disclosed that also employ a hidden Markov model.
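As a rough illustration of the single pronunciation lexicon described in the abstract, the sketch below stores whole-word entries as word identifiers with section indices next to ordinary phonetic transcriptions. The example words, the `SEVEN_1` naming scheme, the section counts, and the phoneme symbols are all assumptions made for illustration; the patent text shown here does not specify them.

```python
from typing import Dict, List


def whole_word_entry(word_id: str, n_sections: int) -> List[str]:
    """Transcribe a whole-word entry as word identifier plus section index.

    The "<ID>_<index>" format is a hypothetical choice for this sketch.
    """
    return [f"{word_id}_{i}" for i in range(1, n_sections + 1)]


# One lexicon holds both kinds of transcription side by side.
lexicon: Dict[str, List[str]] = {
    # first partial vocabulary: word-specific sections
    "seven": whole_word_entry("SEVEN", 4),
    "stop": whole_word_entry("STOP", 3),
    # second partial vocabulary: phonetic transcriptions (symbols assumed)
    "hotel": ["h", "o", "t", "E", "l"],
    "total": ["t", "o", "t", "a", "l"],
}

for word, units in lexicon.items():
    print(f"{word:6s} {' '.join(units)}")
```

Because both entry types are plain sequences of unit names, a decoder could in principle treat them uniformly when building its search space; that uniformity is the point of the sketch, not a claim about the patented implementation.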

8 Claims

1. A method of speech recognition based on a hidden Markov model in which a word to be recognized is modeled as a chain of states and trained using predefined speech data material, the method comprising:
- dividing a known vocabulary into a first partial vocabulary of words and a second partial vocabulary of other words, wherein for the first partial vocabulary, only at least one of easily interchangeable words and important words are identified and assigned to the first partial vocabulary and wherein the other words of said known vocabulary are only assigned to the second partial vocabulary and trained using a phoneme-based model;
  
  training and transcribing the words of the first partial vocabulary using a whole word model wherein each word of the first partial vocabulary is modeled by a chain of states by dividing each word into a plurality of sections which only apply to the respective word;
  
  transcribing the sections of each word of the first partial vocabulary with a word identifier and an index and transcribing the second partial vocabulary using the phoneme-based model, wherein the words are modeled by means of states that correspond to phonemes or parts of phonemes, in order to obtain a corresponding mixed hidden Markov model by storing the first partial vocabulary in the form of word identifiers with indices and the second partial vocabulary in the form of phonetic transcriptions in a single pronunciation lexicon; and
  
  storing the mixed hidden Markov model in a single search space, wherein the states of the phoneme-based model correspond to phonemes or parts of phonemes and are used in a plurality of words and the states of the whole word model only apply to the respective word.
- Dependent claims (2, 3, 4):
  - 2. The method as defined in claim 1, wherein the first partial vocabulary comprises numerals and/or control instruction words.
  - 3. The method according to claim 1, wherein the step of dividing is performed during a speech recognition training.
  - 4. The method according to claim 3, wherein said second partial vocabulary can be expanded to include new words not contained in said speech recognition training.
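The steps of claim 1 can be sketched roughly as follows: split the vocabulary, transcribe each part in its own way, and check which units end up shared across words. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the fixed three sections per whole word, the example vocabulary, and the phoneme symbols are hypothetical, and no acoustic training is performed.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def divide_vocabulary(vocabulary: List[str],
                      whole_word_candidates: Set[str]) -> Tuple[List[str], List[str]]:
    """Split into a first (whole-word) and second (phoneme-based) partial vocabulary."""
    first = [w for w in vocabulary if w in whole_word_candidates]
    second = [w for w in vocabulary if w not in whole_word_candidates]
    return first, second


def build_lexicon(first: List[str], second: List[str],
                  phonetic: Dict[str, List[str]],
                  sections_per_word: int = 3) -> Dict[str, List[str]]:
    """Store both partial vocabularies in one pronunciation lexicon."""
    lexicon: Dict[str, List[str]] = {}
    for word in first:
        # Word identifier plus index; each section belongs to this word only.
        lexicon[word] = [f"{word.upper()}_{i}" for i in range(1, sections_per_word + 1)]
    for word in second:
        lexicon[word] = list(phonetic[word])
    return lexicon


def unit_usage(lexicon: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map every unit (state group) to the set of words that use it."""
    usage: Dict[str, Set[str]] = defaultdict(set)
    for word, units in lexicon.items():
        for unit in units:
            usage[unit].add(word)
    return dict(usage)


# Hypothetical data: a numeral and a control word go to the first partial vocabulary.
vocabulary = ["two", "stop", "hotel", "total"]
phonetic = {"hotel": ["h", "o", "t", "E", "l"], "total": ["t", "o", "t", "a", "l"]}

first, second = divide_vocabulary(vocabulary, {"two", "stop"})
lexicon = build_lexicon(first, second, phonetic)
usage = unit_usage(lexicon)

print(sorted(usage["t"]))      # phoneme unit used by several words
print(sorted(usage["TWO_1"]))  # whole-word section used by exactly one word
```

The usage map makes the contrast in the final claim step visible: phoneme units are reused across words, while every whole-word section appears under a single word.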

5. A speech recognizer for carrying out speech recognition based on a hidden Markov model in which a word to be recognized is modeled as a chain of states and trained using predefined speech data material, the speech recognizer comprising:
- a model memory for storing the hidden Markov model;
  
  a vocabulary memory having a first and second memory area, wherein both memory areas respectively store a first and second partial vocabulary, wherein the first partial vocabulary only comprises numerals and/or control instruction words which have been identified from a known vocabulary and wherein remaining words from said known vocabulary are only assigned to the second partial vocabulary and trained using the phoneme-based model;
  
  a training processing unit comprising a first and second sub-processing units, wherein the training processing unit has an input connected to the vocabulary memory and is configured to implement the hidden Markov model; and
  
  a lexicon memory containing a pronunciation lexicon that is connected to an output of the training processing unit,
  
  wherein the first sub-processing unit communicates with the first memory area of the vocabulary memory to train and transcribe the words of the first partial vocabulary stored therein by implementing a whole word model where each word of the first partial vocabulary is modeled by a chain of states by dividing each word into a plurality of sections which only apply to the respective word and the sections of each word of the first partial vocabulary are transcribed with a word identifier and an index,
  
  and wherein the second sub-processing unit communicates with the second memory area of the vocabulary memory to transcribe the second partial vocabulary by implementing a phoneme-based model, where the words are modeled by means of states that correspond to phonemes or parts of phonemes, to obtain a corresponding mixed hidden Markov model,
  
  and wherein the lexicon memory stores the first partial vocabulary in the form of word identifiers with indices and the second partial vocabulary in the form of phonetic transcriptions in a single pronunciation lexicon, and stores the mixed hidden Markov model in a single search space, wherein the states of the phoneme-based model correspond to phonemes or parts of phonemes and are used in a plurality of words and the states of the whole word model only apply to the respective word.

6. A method of speech recognition based on a hidden Markov model in which a compound word to be recognized is partially modeled as a chain of states and trained using predefined speech data material, the method comprising:
- dividing a known vocabulary into a first partial vocabulary of words and a second partial vocabulary of other words, wherein both the first and second partial vocabulary are related to the compound word, wherein from the known vocabulary only numerals and/or control instruction words are identified and assigned to the first partial vocabulary and wherein the other words of said known vocabulary are only assigned to the second partial vocabulary and trained using the phoneme-based model;
  
  training and transcribing the first partial vocabulary using a whole word model for one part of the compound word where each word of the first partial vocabulary is modeled by a chain of states by dividing each word into a plurality of sections which only apply to the respective word;
  
  transcribing the sections of each word of the first partial vocabulary with a word identifier and an index and transcribing the second partial vocabulary using phoneme-based modeling for another part of the compound word, where the words are modeled by means of states that correspond to phonemes or parts of phonemes, in order to obtain a corresponding mixed hidden Markov model by storing the first partial vocabulary in the form of word identifiers with indices and the second partial vocabulary in the form of phonetic transcriptions that are acquired from the two models in a single pronunciation lexicon;
  
  storing the mixed hidden Markov model in a single search space where the states of the phoneme-based model correspond to phonemes or parts of phonemes and are used in a plurality of words and the states of the whole word model only apply to the respective word.
- Dependent claims (7, 8):
  - 7. The method according to claim 6, wherein the step of dividing is performed during a speech recognition training.
  - 8. The method according to claim 7, wherein said second partial vocabulary can be expanded to include new words not contained in said speech recognition training.
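Claims 6 to 8 can be pictured with a compound word whose parts are transcribed differently, plus a later extension of the phoneme-based vocabulary. The sketch below is illustrative only: the compound "sevenfold", its split into parts, the section count, the phoneme symbols, and the helper names `transcribe_compound` and `add_phoneme_word` are all assumptions, and the check against a set of trained phonemes is one plausible reading of why new words need no retraining.

```python
from typing import Dict, List, Set


def transcribe_compound(parts: List[str],
                        whole_word_sections: Dict[str, int],
                        phonetic: Dict[str, List[str]]) -> List[str]:
    """Transcribe a compound word part by part into one mixed unit sequence."""
    units: List[str] = []
    for part in parts:
        if part in whole_word_sections:
            # Whole-word part: word identifier plus section index.
            n = whole_word_sections[part]
            units.extend(f"{part.upper()}_{i}" for i in range(1, n + 1))
        else:
            # Remaining part: phonetic transcription.
            units.extend(phonetic[part])
    return units


def add_phoneme_word(lexicon: Dict[str, List[str]], word: str,
                     phonemes: List[str], trained_phonemes: Set[str]) -> None:
    """Add a new word to the second partial vocabulary without retraining.

    Only works if every phoneme already has a trained model (assumption).
    """
    unknown = [p for p in phonemes if p not in trained_phonemes]
    if unknown:
        raise ValueError(f"no trained model for phonemes: {unknown}")
    lexicon[word] = list(phonemes)


whole_word_sections = {"seven": 4}            # hypothetical numeral model
phonetic = {"fold": ["f", "o", "l", "d"]}     # hypothetical transcription
trained_phonemes = {"f", "o", "l", "d"}

lexicon = {"sevenfold": transcribe_compound(["seven", "fold"],
                                            whole_word_sections, phonetic)}
add_phoneme_word(lexicon, "old", ["o", "l", "d"], trained_phonemes)
print(lexicon)
```

New entries only reuse phoneme units that already exist, which is why the second partial vocabulary can grow after training while the whole-word entries stay fixed.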

Current Assignee: Cerence Operating Company (Cerence Inc.)
Original Assignee: SVOX AG (Microsoft Corporation)
Inventors: Niemoeller, Meinrad; Wilhelm, Ralph; Marschall, Erwin
Primary Examiner(s): Yen; Eric
Application Number: US10/333,114
Time in Patent Office: 3,604 Days
Field of Search: 704/254, 704/251, 704/255
US Class Current: 704/251
CPC Class Codes: G10L 15/063 (Training); G10L 15/144 (Training of HMMs)
