Male acoustic model adaptation based on language-independent female speech data

US 8,756,062 B2
Filed: 12/10/2010
Issued: 06/17/2014
Est. Priority Date: 12/10/2010
Status: Active Grant

Abstract

A method of generating proxy acoustic models for use in automatic speech recognition includes training acoustic models from speech received via microphone from male speakers of a first language, and adapting the acoustic models in response to language-independent speech data from female speakers of a second language, to generate proxy acoustic models for use during runtime of speech recognition of an utterance from a female speaker of the first language.

15 Claims

1. A method of generating proxy acoustic models for use in automatic speech recognition, comprising the steps of:
- (a) training acoustic models from speech received via microphone from male speakers of a first language using an automatic speech recognition (ASR) system comprising the microphone, memory, and a processor; and
- (b) adapting the acoustic models trained in step (a) using the ASR system in response to language-independent speech data from female speakers of a second language, to generate proxy acoustic models for use during runtime of speech recognition of an utterance from a female speaker of the first language.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, wherein the adapting step (b) is carried out before speech recognition runtime.
  - 3. The method of claim 1, wherein the adapting step (b) is carried out on the utterance from the female speaker of the first language during speech recognition runtime.
  - 4. The method of claim 3, wherein the adapting step (b) is carried out in response to an identification of at least one of a plurality of formant frequency bands in the speech data from the female speakers of the second language that corresponds to at least one formant frequency determined in the utterance from the female speaker of the first language.
  - 5. The method of claim 4, wherein the adapting step (b) is carried out by frequency warping the acoustic models trained in step (a) in response to the identification of the at least one of the plurality of formant frequency bands in the speech data from the female speakers of the second language.
  - 6. The method of claim 4, wherein the at least one formant frequency determined in the utterance from the female speaker of the first language is an average of a plurality of formant frequencies in the received utterance.
  - 7. The method of claim 6, wherein the plurality of formant frequencies in the received utterance are from at least one of a first formant, a second formant, or a third formant.
  - 8. The method of claim 6, wherein the at least one formant frequency of the determining step (c) includes a first formant, a second formant, and a third formant.
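
Claims 4, 6 and 7 describe matching an averaged formant frequency from the utterance to one of several formant frequency bands. The claims do not give band edges or a data layout, so the following is only a minimal sketch under stated assumptions: the `BANDS` values and the function names `average_formant` and `identify_band` are hypothetical and chosen for illustration.

```python
from statistics import mean

# Hypothetical formant frequency bands in Hz, standing in for bands found in
# speech data from female speakers of the second language. Illustrative only.
BANDS = [(200.0, 900.0), (900.0, 1800.0), (1800.0, 3500.0)]


def average_formant(formants_hz):
    """Average a set of formant frequencies (for example F1, F2, F3)."""
    if not formants_hz:
        raise ValueError("at least one formant frequency is required")
    return mean(formants_hz)


def identify_band(freq_hz, bands=BANDS):
    """Return the index of the band containing freq_hz, or None if no band matches."""
    for index, (low, high) in enumerate(bands):
        if low <= freq_hz < high:
            return index
    return None


# Usage: average three assumed formant values, then look up the matching band.
avg = average_formant([500.0, 1500.0, 2500.0])
band_index = identify_band(avg)
```

The half-open interval test is one possible convention; the claims only require that a band "corresponds to" the determined formant frequency.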

9. A method of automatic speech recognition, comprising the steps of:
- (a) receiving an utterance via a microphone from a female speaker of a first language;
- (b) pre-processing the utterance with an automatic speech recognition pre-processor to generate acoustic feature vectors;
- (c) determining at least one formant frequency of the received utterance;
- (d) identifying at least one of a plurality of formant frequency bands in speech data from female speakers of a second language that corresponds to the at least one formant frequency determined in step (c); and
- (e) adapting acoustic models trained from speech from male speakers of the first language in response to the identifying step (d), to result in proxy acoustic models for the female speaker of the first language, wherein the method is carried out using an automatic speech recognition (ASR) system comprising the microphone, memory, and a processor.
- Dependent claims (10, 11, 12, 13, 14, 15):
  - 10. The method of claim 9, further comprising the step of (f) decoding the acoustic feature vectors generated in step (b) using a processor and the acoustic models adapted in step (e) to produce a plurality of hypotheses for the received utterance.
  - 11. The method of claim 10, further comprising the step of (g) post-processing the plurality of hypotheses to recognize one of the plurality of hypotheses as the received speech.
  - 12. The method of claim 9 wherein the at least one formant frequency of the determining step (c) is an average of a plurality of formant frequencies of the received utterance.
  - 13. The method of claim 12, wherein the plurality of formant frequencies of the received utterance includes at least one of a first formant, a second formant, or a third formant.
  - 14. The method of claim 9, wherein the at least one formant frequency of the determining step (c) includes a first formant, a second formant, and a third formant.
  - 15. The method of claim 9, wherein the adapting step (e) includes frequency warping the acoustic models to result in the proxy acoustic models.
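
Claim 15 recites frequency warping of the acoustic models but does not specify a warping function. As an illustration only, the sketch below uses a piecewise-linear warp of the kind commonly used in vocal tract length normalization; this is a swapped-in technique, not the patent's stated method. The names `warp_factor`, `warp_frequency`, the `knee` parameter, and the use of a list of frequency positions as a stand-in for model parameters are all assumptions.

```python
def warp_factor(band_center_hz, male_reference_hz):
    """Hypothetical warp factor: ratio of the identified band center to a male reference formant."""
    if male_reference_hz <= 0:
        raise ValueError("male_reference_hz must be positive")
    return band_center_hz / male_reference_hz


def warp_frequency(f_hz, alpha, f_max_hz, knee=0.8):
    """Piecewise-linear warp: scale by alpha below a breakpoint, then map f_max_hz to itself."""
    # Breakpoint chosen so that alpha * f0 never exceeds knee * f_max_hz.
    f0 = knee * f_max_hz / max(alpha, 1.0)
    if f_hz <= f0:
        return alpha * f_hz
    # Second segment runs from (f0, alpha * f0) to (f_max_hz, f_max_hz).
    slope = (f_max_hz - alpha * f0) / (f_max_hz - f0)
    return alpha * f0 + slope * (f_hz - f0)


def warp_positions(positions_hz, alpha, f_max_hz):
    """Warp a list of frequency positions (an assumed stand-in for model parameters)."""
    return [warp_frequency(f, alpha, f_max_hz) for f in positions_hz]


# Usage: assumed band center of 1200 Hz against an assumed male reference of 1000 Hz.
alpha = warp_factor(1200.0, 1000.0)
warped = warp_positions([500.0, 1000.0, 8000.0], alpha, 8000.0)
```

The second linear segment keeps the warped axis within the original bandwidth, which is a design choice of this sketch rather than something the claims require.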

Current Assignee: General Motors LLC (General Motors Company)
Original Assignee: General Motors LLC (General Motors Company)
Inventors: Talwar, Gaurav; Chengalvarayan, Rathinavelu
Primary Examiner(s): YEN, ERIC L
Application Number: US12/965,508
Publication Number: US 20120150541A1
Time in Patent Office: 1,285 Days
Field of Search: 704/243, 704/256.2
US Class Current: 704/256.2
CPC Class Codes: G10L 15/065 Adaptation
