SPEECH RECOGNITION BASED ON PRONUNCIATION MODELING
Abstract
A system and method for performing speech recognition are disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations, and presenting a recognition result for the utterance. Recognition accuracy improves when the pronunciation model is moved from the dictionary into the language model.
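As an illustration only, the idea of moving pronunciation probabilities out of the dictionary and into the language model can be sketched with a toy unigram model. Everything named here is a hypothetical choice made for the example and does not come from the patent: the function `build_pronunciation_unigrams`, the `word#N` label scheme, and the reading of "special status" as "the most frequent pronunciation keeps the bare word as its label". Each distinct pronunciation of a word gets its own label, and the label probability equals P(word) times P(pronunciation | word), estimated from counts over phonemically transcribed speech.

```python
from collections import Counter, defaultdict


def build_pronunciation_unigrams(observations):
    """Build a toy unigram model over pronunciation-specific word labels.

    observations: iterable of (word, pronunciation) pairs taken from
    phonemically transcribed speech (hypothetical input format).
    Returns (labels, probs), where labels maps (word, pronunciation) to a
    unique label and probs maps each label to its unigram probability.
    """
    pair_counts = Counter(observations)
    total = sum(pair_counts.values())

    # Group the observed pronunciations under their word.
    by_word = defaultdict(list)
    for (word, pron), count in pair_counts.items():
        by_word[word].append((pron, count))

    labels, probs = {}, {}
    for word, variants in by_word.items():
        # Most frequent pronunciation first; ties broken alphabetically.
        variants.sort(key=lambda item: (-item[1], item[0]))
        for rank, (pron, count) in enumerate(variants):
            # Assumption: the most frequent variant keeps the bare word
            # as its label; other variants get a numbered suffix.
            label = word if rank == 0 else f"{word}#{rank + 1}"
            labels[(word, pron)] = label
            # count / total equals P(word) * P(pron | word).
            probs[label] = count / total
    return labels, probs
```

With such labels, a recognizer's dictionary would only need one pronunciation per label, because the choice between variants is scored by the language model. A real system would use higher-order n-grams and smoothing, which this sketch omits.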
20 Claims
1. A method comprising:
approximating transcribed speech using a phonemic transcription dataset associated with a speaker, to yield a language model, where the phonemic transcription dataset is based on a pronunciation model of the speaker;
incorporating, into the language model, pronunciation probabilities associated with respective unique labels for each different pronunciation of a word, wherein the respective unique label for a most frequent word indicates a special status in the language model; and
after incorporating the pronunciation probabilities into the language model, recognizing an utterance using the language model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system comprising:
a processor; and
a computer-readable storage medium storing instructions which, when executed on the processor, cause the processor to perform a method comprising:
  approximating transcribed speech using a phonemic transcription dataset associated with a speaker, to yield a language model, where the phonemic transcription dataset is based on a pronunciation model of the speaker;
  incorporating, into the language model, pronunciation probabilities associated with respective unique labels for each different pronunciation of a word, wherein the respective unique label for a most frequent word indicates a special status in the language model; and
  after incorporating the pronunciation probabilities into the language model, recognizing an utterance using the language model.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer-readable storage medium storing instructions which, when executed on a processor, cause the processor to perform a method comprising:
approximating transcribed speech using a phonemic transcription dataset associated with a speaker, to yield a language model, where the phonemic transcription dataset is based on a pronunciation model of the speaker;
incorporating, into the language model, pronunciation probabilities associated with respective unique labels for each different pronunciation of a word, wherein the respective unique label for a most frequent word indicates a special status in the language model; and
after incorporating the pronunciation probabilities into the language model, recognizing an utterance using the language model.
Dependent claims: 18, 19, 20
Specification