SPEECH DIALECT CLASSIFICATION FOR AUTOMATIC SPEECH RECOGNITION
Abstract
Automatic speech recognition including receiving speech via a microphone, pre-processing the received speech to generate acoustic feature vectors, classifying dialect of the received speech, selecting at least one of an acoustic model or a lexicon specific to the classified dialect, decoding the acoustic feature vectors using a processor and at least one of the selected dialect-specific acoustic model or selected lexicon to produce a plurality of hypotheses for the received speech, and post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
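The "pre-processing" the abstract refers to conventionally converts the waveform into frame-level acoustic feature vectors. A minimal sketch of such a front end (pre-emphasis, framing, windowing, log power spectrum); the function name and parameters are illustrative, not from the patent, and production systems would continue on to mel filterbanks or MFCCs:

```python
import numpy as np

def preprocess(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Turn a 1-D waveform into a matrix of per-frame feature vectors."""
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    frame_len = int(sample_rate * frame_ms / 1000)   # samples per frame
    hop = int(sample_rate * hop_ms / 1000)           # samples per hop
    n_frames = 1 + max(0, (len(emphasized) - frame_len) // hop)

    window = np.hamming(frame_len)
    feats = []
    for i in range(n_frames):
        frame = emphasized[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n=512)) ** 2
        feats.append(np.log(power + 1e-10))          # log compression
    return np.array(feats)

# One second of synthetic audio -> 98 feature vectors of dimension 257.
rng = np.random.default_rng(0)
features = preprocess(rng.standard_normal(16000))
```

Each row of the result is one acoustic feature vector of the kind the decoding steps later consume.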
79 Citations
18 Claims (the three independent claims, 1, 4, and 12, are reproduced below)
1. A method of automatic speech recognition, comprising:
(a) receiving speech via a microphone;
(b) pre-processing the received speech to generate acoustic feature vectors;
(c) classifying dialect of the received speech;
(d) selecting at least one of an acoustic model or a lexicon specific to the dialect classified in step (c);
(e) decoding the acoustic feature vectors generated in step (b) using a processor and at least one of the dialect-specific acoustic model or lexicon selected in step (d) to produce a plurality of hypotheses for the received speech; and
(f) post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
(Dependent claims: 2, 3)
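Steps (a) through (f) describe a model-switching recognizer: classify the dialect first, then decode with dialect-specific resources. A minimal structural sketch of that control flow, where every component (the dialect names, the stub classifier, and the stub decoder) is a hypothetical placeholder, not anything taken from the patent:

```python
# Hypothetical sketch of the claimed pipeline: dialect classification
# (step c) gates which acoustic model / lexicon the decoder uses (steps d-e).

MODELS = {  # dialect -> (acoustic model, lexicon); stand-ins for real resources
    "en-US": ("am_us", {"tomato": "t ah m ey t ow"}),
    "en-GB": ("am_gb", {"tomato": "t ah m aa t ow"}),
}

def classify_dialect(feature_vectors):
    # Placeholder: a real system would score per-dialect GMMs or a
    # neural classifier over the feature vectors.
    return "en-GB"

def decode(feature_vectors, acoustic_model, lexicon):
    # Placeholder decoder: returns an N-best list of (hypothesis, score).
    return [("tomato", -12.3), ("to mar toe", -15.1)]

def recognize(feature_vectors):
    dialect = classify_dialect(feature_vectors)                    # step (c)
    acoustic_model, lexicon = MODELS[dialect]                      # step (d)
    hypotheses = decode(feature_vectors, acoustic_model, lexicon)  # step (e)
    # Step (f): post-process the N-best list down to a single result.
    best, _score = max(hypotheses, key=lambda h: h[1])
    return dialect, best

dialect, text = recognize(feature_vectors=[])
```

The point of the structure is that steps (d) and (e) never see a dialect-agnostic model: the classifier's output selects the resources before decoding begins.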
4. A method of automatic speech recognition, comprising:
(a) receiving speech via a microphone;
(b) pre-processing the received speech to generate acoustic feature vectors;
(c) classifying dialect of the received speech using Gaussian mixture models trained on text independent speech data from a plurality of different speakers of a plurality of different dialects;
(d) selecting at least one of an acoustic model or a lexicon specific to the dialect classified in step (c);
(e) decoding the acoustic feature vectors generated in step (b) using a processor and at least one of the dialect-specific acoustic model or lexicon selected in step (d) to produce a plurality of hypotheses for the received speech; and
(f) post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
(Dependent claims: 5, 6, 7, 8, 9, 10, 11)
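Claim 4 pins down the classifier of step (c): one Gaussian mixture model per dialect, trained on text-independent speech pooled across many speakers, with classification by maximum likelihood. A small sketch of that scheme using scikit-learn's `GaussianMixture` on synthetic feature vectors (the dialect names and training data are fabricated for illustration; a real system would train on pooled MFCCs from real speakers):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic "text-independent" training data: 13-dim feature vectors
# pooled across speakers of each dialect, with shifted dialect means.
train = {
    "dialect_a": rng.standard_normal((500, 13)) + 0.0,
    "dialect_b": rng.standard_normal((500, 13)) + 2.0,
}

# One GMM per dialect, as claim 4 describes.
gmms = {
    d: GaussianMixture(n_components=4, covariance_type="diag",
                       random_state=0).fit(X)
    for d, X in train.items()
}

def classify_dialect(feature_vectors):
    """Pick the dialect whose GMM gives the highest average log-likelihood."""
    return max(gmms, key=lambda d: gmms[d].score(feature_vectors))

# An utterance drawn near dialect_b's distribution should score higher there.
utterance = rng.standard_normal((50, 13)) + 2.0
predicted = classify_dialect(utterance)
```

Because the GMMs are trained on pooled, text-independent data, the classifier needs no transcript of the utterance; it scores the raw feature vectors directly.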
12. A method of automatic speech recognition, comprising:
(a) receiving speech via a microphone;
(b) pre-processing the received speech to generate acoustic feature vectors;
(c) classifying dialect of the received speech by:
(i) accessing an expected lexicon including a plurality of words having pronunciations corresponding to different dialects;
(ii) decoding the acoustic feature vectors generated in step (b) using the expected lexicon and a universal acoustic model to produce a plurality of hypotheses for the received speech; and
(iii) post-processing the plurality of hypotheses to identify a hypothesis of the plurality of hypotheses as the received speech, wherein the dialect of the identified hypothesis is the classified dialect;
(d) selecting at least one of an acoustic model or a lexicon specific to the dialect classified in step (c);
(e) receiving additional speech;
(f) pre-processing the received additional speech to generate additional acoustic feature vectors; and
(g) decoding the acoustic feature vectors generated in step (f) using at least one of the dialect-specific acoustic model or lexicon selected in step (d).
(Dependent claims: 13, 14, 15, 16, 17, 18)
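Claim 12 replaces the GMM classifier with a lexicon-based one: the "expected lexicon" lists dialect-tagged pronunciations for each word, a universal acoustic model scores them, and the dialect tag of the winning pronunciation becomes the classified dialect (step (c)(iii)). A toy sketch of that idea, in which the lexicon entries, pronunciations, and the canned scoring function are all illustrative stand-ins for a real decoder:

```python
# Sketch of claim 12's classifier: each word carries per-dialect
# pronunciations; decoding with a universal acoustic model picks one,
# and that pronunciation's dialect tag is the classification result.

EXPECTED_LEXICON = {
    "tomato": [("t ah m ey t ow", "en-US"),
               ("t ah m aa t ow", "en-GB")],
}

def universal_acoustic_score(feature_vectors, pronunciation):
    # Placeholder for scoring a pronunciation against the features with
    # a universal acoustic model; a fixed score table stands in here.
    return {"t ah m ey t ow": -20.0, "t ah m aa t ow": -14.5}[pronunciation]

def classify_by_lexicon(feature_vectors, word="tomato"):
    hypotheses = [
        (universal_acoustic_score(feature_vectors, pron), pron, dialect)
        for pron, dialect in EXPECTED_LEXICON[word]
    ]
    # Step (c)(iii): the best-scoring hypothesis determines the dialect.
    _score, _pron, dialect = max(hypotheses)
    return dialect

dialect = classify_by_lexicon(feature_vectors=[])
```

The design trade-off against claim 4's GMMs: this classifier needs no separate dialect model, but it only discriminates on words whose pronunciation actually differs across the dialects in the lexicon.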
Specification