SPEECH DIALECT CLASSIFICATION FOR AUTOMATIC SPEECH RECOGNITION
Abstract
Automatic speech recognition including receiving speech via a microphone, pre-processing the received speech to generate acoustic feature vectors, classifying dialect of the received speech, selecting at least one of an acoustic model or a lexicon specific to the classified dialect, decoding the acoustic feature vectors using a processor and at least one of the selected dialect-specific acoustic model or selected lexicon to produce a plurality of hypotheses for the received speech, and post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
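The "pre-processing" the abstract refers to conventionally converts the waveform into frame-level acoustic feature vectors. A minimal sketch of such a front end (pre-emphasis, framing, windowing, log power spectrum); the function name and parameters are illustrative, not from the patent, and production systems would continue on to mel filterbanks or MFCCs:

```python
import numpy as np

def preprocess(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Turn a 1-D waveform into a matrix of per-frame feature vectors."""
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    frame_len = int(sample_rate * frame_ms / 1000)   # samples per frame
    hop = int(sample_rate * hop_ms / 1000)           # samples per hop
    n_frames = 1 + max(0, (len(emphasized) - frame_len) // hop)

    window = np.hamming(frame_len)
    feats = []
    for i in range(n_frames):
        frame = emphasized[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n=512)) ** 2
        feats.append(np.log(power + 1e-10))          # log compression
    return np.array(feats)

# One second of synthetic audio -> 98 feature vectors of dimension 257.
rng = np.random.default_rng(0)
features = preprocess(rng.standard_normal(16000))
```

Each row of the result is one acoustic feature vector of the kind the decoding steps later consume.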
79 Citations
18 Claims (the three independent claims, 1, 4, and 12, are reproduced below)
1. A method of automatic speech recognition, comprising:
(a) receiving speech via a microphone;
(b) pre-processing the received speech to generate acoustic feature vectors;
(c) classifying dialect of the received speech;
(d) selecting at least one of an acoustic model or a lexicon specific to the dialect classified in step (c);
(e) decoding the acoustic feature vectors generated in step (b) using a processor and at least one of the dialect-specific acoustic model or lexicon selected in step (d) to produce a plurality of hypotheses for the received speech; and
(f) post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
(Dependent claims: 2, 3)
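Steps (a) through (f) describe a model-switching recognizer: classify the dialect first, then decode with dialect-specific resources. A minimal structural sketch of that control flow, where every component (the dialect names, the stub classifier, and the stub decoder) is a hypothetical placeholder, not anything taken from the patent:

```python
# Hypothetical sketch of the claimed pipeline: dialect classification
# (step c) gates which acoustic model / lexicon the decoder uses (steps d-e).

MODELS = {  # dialect -> (acoustic model, lexicon); stand-ins for real resources
    "en-US": ("am_us", {"tomato": "t ah m ey t ow"}),
    "en-GB": ("am_gb", {"tomato": "t ah m aa t ow"}),
}

def classify_dialect(feature_vectors):
    # Placeholder: a real system would score per-dialect GMMs or a
    # neural classifier over the feature vectors.
    return "en-GB"

def decode(feature_vectors, acoustic_model, lexicon):
    # Placeholder decoder: returns an N-best list of (hypothesis, score).
    return [("tomato", -12.3), ("to mar toe", -15.1)]

def recognize(feature_vectors):
    dialect = classify_dialect(feature_vectors)                    # step (c)
    acoustic_model, lexicon = MODELS[dialect]                      # step (d)
    hypotheses = decode(feature_vectors, acoustic_model, lexicon)  # step (e)
    # Step (f): post-process the N-best list down to a single result.
    best, _score = max(hypotheses, key=lambda h: h[1])
    return dialect, best

dialect, text = recognize(feature_vectors=[])
```

The point of the structure is that steps (d) and (e) never see a dialect-agnostic model: the classifier's output selects the resources before decoding begins.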
4. A method of automatic speech recognition, comprising:
(a) receiving speech via a microphone;
(b) pre-processing the received speech to generate acoustic feature vectors;
(c) classifying dialect of the received speech using Gaussian mixture models trained on text independent speech data from a plurality of different speakers of a plurality of different dialects;
(d) selecting at least one of an acoustic model or a lexicon specific to the dialect classified in step (c);
(e) decoding the acoustic feature vectors generated in step (b) using a processor and at least one of the dialect-specific acoustic model or lexicon selected in step (d) to produce a plurality of hypotheses for the received speech; and
(f) post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
(Dependent claims: 5, 6, 7, 8, 9, 10, 11)
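Claim 4 pins down the classifier of step (c): one Gaussian mixture model per dialect, trained on text-independent speech pooled across many speakers, with classification by maximum likelihood. A small sketch of that scheme using scikit-learn's `GaussianMixture` on synthetic feature vectors (the dialect names and training data are fabricated for illustration; a real system would train on pooled MFCCs from real speakers):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic "text-independent" training data: 13-dim feature vectors
# pooled across speakers of each dialect, with shifted dialect means.
train = {
    "dialect_a": rng.standard_normal((500, 13)) + 0.0,
    "dialect_b": rng.standard_normal((500, 13)) + 2.0,
}

# One GMM per dialect, as claim 4 describes.
gmms = {
    d: GaussianMixture(n_components=4, covariance_type="diag",
                       random_state=0).fit(X)
    for d, X in train.items()
}

def classify_dialect(feature_vectors):
    """Pick the dialect whose GMM gives the highest average log-likelihood."""
    return max(gmms, key=lambda d: gmms[d].score(feature_vectors))

# An utterance drawn near dialect_b's distribution should score higher there.
utterance = rng.standard_normal((50, 13)) + 2.0
predicted = classify_dialect(utterance)
```

Because the GMMs are trained on pooled, text-independent data, the classifier needs no transcript of the utterance; it scores the raw feature vectors directly.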
12. A method of automatic speech recognition, comprising:
(a) receiving speech via a microphone;
(b) pre-processing the received speech to generate acoustic feature vectors;
(c) classifying dialect of the received speech by:
(i) accessing an expected lexicon including a plurality of words having pronunciations corresponding to different dialects;
(ii) decoding the acoustic feature vectors generated in step (b) using the expected lexicon and a universal acoustic model to produce a plurality of hypotheses for the received speech; and
(iii) post-processing the plurality of hypotheses to identify a hypothesis of the plurality of hypotheses as the received speech, wherein the dialect of the identified hypothesis is the classified dialect;
(d) selecting at least one of an acoustic model or a lexicon specific to the dialect classified in step (c);
(e) receiving additional speech;
(f) pre-processing the received additional speech to generate additional acoustic feature vectors; and
(g) decoding the acoustic feature vectors generated in step (f) using at least one of the dialect-specific acoustic model or lexicon selected in step (d).
(Dependent claims: 13, 14, 15, 16, 17, 18)
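Claim 12 replaces the GMM classifier with a lexicon-based one: the "expected lexicon" lists dialect-tagged pronunciations for each word, a universal acoustic model scores them, and the dialect tag of the winning pronunciation becomes the classified dialect (step (c)(iii)). A toy sketch of that idea, in which the lexicon entries, pronunciations, and the canned scoring function are all illustrative stand-ins for a real decoder:

```python
# Sketch of claim 12's classifier: each word carries per-dialect
# pronunciations; decoding with a universal acoustic model picks one,
# and that pronunciation's dialect tag is the classification result.

EXPECTED_LEXICON = {
    "tomato": [("t ah m ey t ow", "en-US"),
               ("t ah m aa t ow", "en-GB")],
}

def universal_acoustic_score(feature_vectors, pronunciation):
    # Placeholder for scoring a pronunciation against the features with
    # a universal acoustic model; a fixed score table stands in here.
    return {"t ah m ey t ow": -20.0, "t ah m aa t ow": -14.5}[pronunciation]

def classify_by_lexicon(feature_vectors, word="tomato"):
    hypotheses = [
        (universal_acoustic_score(feature_vectors, pron), pron, dialect)
        for pron, dialect in EXPECTED_LEXICON[word]
    ]
    # Step (c)(iii): the best-scoring hypothesis determines the dialect.
    _score, _pron, dialect = max(hypotheses)
    return dialect

dialect = classify_by_lexicon(feature_vectors=[])
```

The design trade-off against claim 4's GMMs: this classifier needs no separate dialect model, but it only discriminates on words whose pronunciation actually differs across the dialects in the lexicon.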
Specification