AUTOMATED SPEECH RECOGNITION USING NORMALIZED IN-VEHICLE SPEECH

US 20080004875A1
Filed: 06/29/2006
Published: 01/03/2008
Est. Priority Date: 06/29/2006
Status: Active Grant

First Claim


1. A method of speech recognition comprising the steps of:

(a) receiving speech in a vehicle;

(b) extracting acoustic data from the received speech; and

(c) applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.


14 Assignments


0 Petitions

Accused Products

Abstract

A speech recognition method includes the steps of receiving speech in a vehicle, extracting acoustic data from the received speech, and applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data. The speech recognition method may also include one or more of the following steps: pre-processing the normalized acoustic data to extract acoustic feature vectors; decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models built according to a plurality of Lombard levels of a Lombard speech corpus covering a plurality of vehicles; calculating the Lombard level of vehicle noise; and/or selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step.
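
The normalization step in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the acoustic data is a list of time-domain samples, that the vehicle's impulse response is a short causal FIR sequence `h`, and that the inverse is obtained by the standard power-series recursion truncated to `n_taps` coefficients (well behaved only when `h` is minimum phase). The function names are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Full linear convolution of two finite sequences."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def invert_impulse_response(h: Sequence[float], n_taps: int) -> List[float]:
    """Truncated causal inverse g of h (n_taps >= 1).

    Chosen so that (h * g)[0] == 1 and (h * g)[n] == 0 for 0 < n < n_taps.
    Assumption: h[0] != 0 and h is minimum phase, otherwise g grows unbounded.
    """
    if not h or h[0] == 0.0:
        raise ValueError("h[0] must be non-zero for a causal inverse")
    g = [1.0 / h[0]]
    for n in range(1, n_taps):
        acc = sum(h[k] * g[n - k] for k in range(1, min(n, len(h) - 1) + 1))
        g.append(-acc / h[0])
    return g


def normalize_acoustic_data(acoustic: Sequence[float],
                            inverse_ir: Sequence[float]) -> List[float]:
    """Apply the inverse impulse response as FIR taps, trimmed to input length."""
    return convolve(acoustic, inverse_ir)[: len(acoustic)]


# Hypothetical cabin response: direct path plus one reflection at half amplitude.
h = [1.0, 0.5]
g = invert_impulse_response(h, 8)

clean = [1.0, 0.0, -1.0, 0.5]                 # speech at the talker's mouth
received = convolve(clean, h)[: len(clean)]   # what the cabin microphone picks up
normalized = normalize_acoustic_data(received, g)
```

In this toy case `normalized` matches `clean`, because the truncated inverse cancels the cabin response over the first eight samples. A production system might instead work on spectral features or use a regularized frequency-domain inverse; the text above does not say which.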

58 Citations


20 Claims

1. A method of speech recognition comprising the steps of:
- (a) receiving speech in a vehicle;
  
  (b) extracting acoustic data from the received speech; and
  
  (c) applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.
- - 2. The method of claim 1, further comprising the steps of:
    - (d) pre-processing the normalized acoustic data to extract normalized acoustic feature vectors; and
      
      (e) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level.
  - 3. The method of claim 2, further comprising the steps of:
    - (f) calculating the Lombard level of vehicle noise; and
      
      (g) selecting the global acoustic model of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (e).
  - 4. The method of claim 2, wherein the Lombard speech corpus is developed in a sound-controlled environment using a plurality of speakers, a plurality of different levels of noises, and a plurality of different utterances.
  - 5. The method of claim 4, wherein the global acoustic models are built after extracting acoustic features of speech from the Lombard speech corpus without noise reduction.
  - 6. The method of claim 4, wherein the global acoustic models are built by selecting a Lombard level, gathering speech data corresponding to the selected Lombard level from the Lombard speech corpus, selecting one or more vehicles that exhibit noise at the selected Lombard level, mixing the gathered speech data with acoustic data from the selected one or more vehicles, and extracting acoustic features of speech from the Lombard speech corpus with noise reduction.
  - 7. The method of claim 2, wherein the decoding step is performed in-vehicle.
  - 8. The method of claim 2, wherein the decoding step is performed in a remote server.
  - 9. The method of claim 1, wherein the inverse impulse response function is determined by first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function.
  - 10. The method of claim 1, wherein the inverse impulse response function is determined by determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel, wherein the IVM is designated as an input and the MRP microphone is designated as an output.
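
The correlation-based route in claim 10 can be sketched as a least-squares FIR fit. The claim names only "correlation and/or covariance" between the two channels, so the specific estimator here (normal equations built from IVM autocorrelation and IVM-to-MRP cross-correlation, solved by Gaussian elimination) is an assumption, and every function name is hypothetical. The IVM channel is treated as the input and the MRP channel as the desired output, as the claim designates.

```python
from typing import List, Sequence


def lagged_products(x: Sequence[float], y: Sequence[float],
                    lag_x: int, lag_y: int) -> float:
    """Sum over n of x[n - lag_x] * y[n - lag_y]; samples before n = 0 count as zero."""
    start = max(lag_x, lag_y)
    stop = min(len(x), len(y))
    return sum(x[n - lag_x] * y[n - lag_y] for n in range(start, stop))


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_inverse_ir(ivm: Sequence[float], mrp: Sequence[float],
                        n_taps: int) -> List[float]:
    """FIR taps g minimizing sum_n (mrp[n] - sum_k g[k] * ivm[n - k]) ** 2."""
    autocorr = [[lagged_products(ivm, ivm, j, k) for k in range(n_taps)]
                for j in range(n_taps)]
    crosscorr = [lagged_products(ivm, mrp, j, 0) for j in range(n_taps)]
    return solve_linear(autocorr, crosscorr)


# Synthetic two-channel recording, built so that an exact 3-tap inverse exists.
ivm = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25, 2.0, -0.5]
g_true = [1.0, -0.5, 0.25]
mrp = [sum(g_true[k] * ivm[n - k] for k in range(len(g_true)) if n - k >= 0)
       for n in range(len(ivm))]

g_est = estimate_inverse_ir(ivm, mrp, n_taps=3)
```

With real recordings the fit would not be exact, and the tap count and any regularization would need tuning for each vehicle.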

11. A method of speech recognition for a plurality of vehicles, comprising the steps of:
- (a) developing a corpus of Lombard speech data;
  
  (b) building a plurality of global acoustic models based on the corpus of Lombard speech data;
  
  (c) receiving speech in a vehicle using an integrated vehicle microphone;
  
  (d) generating an inverse impulse response function for each of the plurality of vehicles;
  
  (e) extracting acoustic data from the received speech; and
  
  (f) applying the vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.
- - 12. The method of claim 11, wherein the corpus of Lombard speech data includes a plurality of Lombard levels, and wherein each model of the plurality of global acoustic models is distinguished from the other models based on a Lombard level of the plurality of Lombard levels.
  - 13. The method of claim 11, wherein the inverse impulse response function is determined by first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function.
  - 14. The method of claim 11, wherein the inverse impulse response function is determined by determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel, wherein the IVM is designated as an input and the MRP microphone is designated as an output.
  - 15. The method of claim 11, wherein the Lombard speech corpus is developed in a sound-controlled environment using a plurality of speakers, a plurality of different levels of noises, and a plurality of different utterances.
  - 16. The method of claim 15, wherein the global acoustic models are built after extracting acoustic features of speech from the Lombard speech corpus without noise reduction.
  - 17. The method of claim 16, wherein the global acoustic models are built by selecting a Lombard level, gathering speech data corresponding to the selected Lombard level from the Lombard speech corpus, selecting one or more vehicles that exhibit noise at the selected Lombard level, mixing the gathered speech data with acoustic data from the selected one or more vehicles, and extracting acoustic features of speech from the Lombard speech corpus with noise reduction.
  - 18. The method of claim 11, further comprising the steps of:
    - (g) pre-processing the normalized acoustic data to extract acoustic feature vectors;
      
      (h) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level of a Lombard speech corpus covering a plurality of vehicles;
      
      (i) calculating the Lombard level of vehicle noise; and
      
      (j) selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (h).
  - 19. The method of claim 18 wherein the decoding step is performed in a remote server.
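
The data-preparation steps recited in claim 17 (select a Lombard level, gather matching speech, select vehicles with noise at that level, mix) can be sketched as below. The record types, the integer encoding of Lombard levels, and additive sample-wise mixing are all assumptions made for illustration. Feature extraction and acoustic-model training are deliberately left out.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Utterance:
    """One corpus recording, labeled with the Lombard level it was spoken at."""
    lombard_level: int
    samples: Sequence[float]


@dataclass(frozen=True)
class VehicleNoise:
    """Cabin noise from one vehicle, labeled with the Lombard level it induces."""
    vehicle_id: str
    lombard_level: int
    samples: Sequence[float]


def mix(speech: Sequence[float], noise: Sequence[float]) -> List[float]:
    """Add noise to speech sample by sample, looping the noise if it is shorter."""
    return [s + n for s, n in zip(speech, cycle(noise))]


def training_sets_by_level(corpus: Sequence[Utterance],
                           noises: Sequence[VehicleNoise]) -> Dict[int, List[List[float]]]:
    """Group mixed speech-plus-noise signals by Lombard level, one set per model."""
    sets: Dict[int, List[List[float]]] = {}
    for level in sorted({u.lombard_level for u in corpus}):
        speech = [u for u in corpus if u.lombard_level == level]
        vehicles = [v for v in noises if v.lombard_level == level]
        sets[level] = [mix(u.samples, v.samples) for u in speech for v in vehicles]
    return sets


corpus = [Utterance(0, [1.0, 2.0, 3.0]), Utterance(1, [4.0, 5.0])]
noises = [
    VehicleNoise("sedan", 0, [0.5]),
    VehicleNoise("truck", 1, [1.0, -1.0]),
    VehicleNoise("van", 1, [0.0]),
]
sets = training_sets_by_level(corpus, noises)
```

Each entry of `sets` would then feed the training of one global acoustic model, so that every model corresponds to a single Lombard level across several vehicles.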

20. A method of speech recognition for a plurality of vehicles, comprising the steps of:
- (a) developing a corpus of Lombard speech including a plurality of Lombard levels;
  
  (b) building a plurality of global acoustic models based on the corpus of Lombard speech data, wherein each model of the plurality of global acoustic models is distinguished from the other models based on a Lombard level of the plurality of Lombard levels;
  
  (c) receiving speech in a vehicle using an integrated vehicle microphone;
  
  (d) generating an inverse impulse response function for each of the plurality of vehicles, wherein the inverse impulse response function is determined by at least one of first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function, or determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel wherein the IVM is designated as an input and the MRP microphone is designated as an output;
  
  (e) extracting acoustic data from the received speech;
  
  (f) applying the vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data;
  
  (g) pre-processing the normalized acoustic data to extract acoustic feature vectors;
  
  (h) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level of a Lombard speech corpus covering a plurality of vehicles;
  
  (i) calculating the Lombard level of vehicle noise; and
  
  (j) selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (h).
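
Steps (i) and (j) can be sketched as a noise-level measurement followed by a lookup. The claims do not say how the Lombard level is calculated, so this sketch assumes it is an integer bucket derived from the RMS level of a noise-only segment, using hypothetical dB thresholds. The model identifiers are placeholders.

```python
import math
from bisect import bisect_right
from typing import Dict, Sequence, TypeVar

Model = TypeVar("Model")


def noise_level_db(samples: Sequence[float], reference: float = 1.0) -> float:
    """RMS level of a noise-only segment in dB relative to `reference`."""
    if not samples:
        raise ValueError("need at least one sample")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12) / reference)


def lombard_level(level_db: float, thresholds_db: Sequence[float]) -> int:
    """Map a noise level to a bucket index; thresholds must be sorted ascending."""
    return bisect_right(thresholds_db, level_db)


def select_model(models: Dict[int, Model], level: int) -> Model:
    """Return the model whose Lombard level is closest to the calculated one."""
    closest = min(models, key=lambda known: abs(known - level))
    return models[closest]


# Hypothetical thresholds (dB) separating Lombard levels 0, 1, 2 and 3.
THRESHOLDS_DB = [-40.0, -25.0, -10.0]
MODELS = {0: "model_L0", 1: "model_L1", 2: "model_L2"}

cabin_noise = [0.1, -0.1, 0.1, -0.1]
level = lombard_level(noise_level_db(cabin_noise), THRESHOLDS_DB)
chosen = select_model(MODELS, level)
```

Falling back to the nearest available level when no model exists for the calculated one is a design choice of this sketch, not something the claims require.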


Current Assignee
General Motors LLC (General Motors Company)
Original Assignee
General Motors Corporation (Motors Liquidation Co. GUC Trust)
Inventors
Chengalvarayan, Rathinavelu; Pennock, Scott M.

Granted Patent

US 7,676,363 B2
US Class Current

704/234
CPC Class Codes

G10L 15/20 Speech recognition techniqu...
