AUTOMATED SPEECH RECOGNITION USING NORMALIZED IN-VEHICLE SPEECH
Abstract
A speech recognition method includes the steps of receiving speech in a vehicle, extracting acoustic data from the received speech, and applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data. The speech recognition method may also include one or more of the following steps: pre-processing the normalized acoustic data to extract acoustic feature vectors; decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models built according to a plurality of Lombard levels of a Lombard speech corpus covering a plurality of vehicles; calculating the Lombard level of vehicle noise; and/or selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step.
20 Claims
1. A method of speech recognition comprising the steps of:
(a) receiving speech in a vehicle;
(b) extracting acoustic data from the received speech; and
(c) applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
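As an illustration of step (c), the following Python sketch applies an inverse impulse response to extracted acoustic data by recursive deconvolution. It assumes the vehicle impulse response is a short, minimum-phase FIR sequence with a nonzero first tap, which the claim does not require; measured cabin responses are typically not minimum phase and would likely need a delayed or regularized inverse instead. The names `VEHICLE_IMPULSE_RESPONSES`, `simulate_cabin`, and `apply_inverse_impulse_response`, and all coefficient values, are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical per-vehicle impulse responses (illustrative values only).
VEHICLE_IMPULSE_RESPONSES: Dict[str, List[float]] = {
    "vehicle_a": [1.0, 0.5, 0.25],
    "vehicle_b": [1.0, -0.3, 0.1],
}


def simulate_cabin(speech: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Model the cabin as a linear convolution, truncated to the input length."""
    received = []
    for n in range(len(speech)):
        acc = 0.0
        for k, h_k in enumerate(impulse_response):
            if n - k >= 0:
                acc += h_k * speech[n - k]
        received.append(acc)
    return received


def apply_inverse_impulse_response(
    acoustic_data: Sequence[float], impulse_response: Sequence[float]
) -> List[float]:
    """Undo the convolution with a recursive (IIR) inverse filter.

    Assumption: impulse_response is minimum phase and impulse_response[0] != 0,
    so that the recursion below is stable.
    """
    h0 = impulse_response[0]
    if h0 == 0:
        raise ValueError("first tap of the impulse response must be nonzero")
    normalized: List[float] = []
    for n, x_n in enumerate(acoustic_data):
        acc = x_n
        # Subtract the contribution of earlier, already-normalized samples.
        for k in range(1, len(impulse_response)):
            if n - k >= 0:
                acc -= impulse_response[k] * normalized[n - k]
        normalized.append(acc / h0)
    return normalized
```

Under these assumptions, passing the output of `simulate_cabin(speech, h)` to `apply_inverse_impulse_response(..., h)` with the same `h` recovers `speech` up to floating-point error.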
11. A method of speech recognition for a plurality of vehicles, comprising the steps of:
(a) developing a corpus of Lombard speech data;
(b) building a plurality of global acoustic models based on the corpus of Lombard speech data;
(c) receiving speech in a vehicle using an integrated vehicle microphone;
(d) generating an inverse impulse response function for each of the plurality of vehicles;
(e) extracting acoustic data from the received speech; and
(f) applying the vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
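Step (d) does not state how the inverse impulse response function is generated. Claim 20 names a correlation-based option in which the integrated vehicle microphone (IVM) signal is the input and the mouth reference position (MRP) signal is the output. The Python sketch below shows one possible reading of that option: a least-squares FIR filter obtained from the IVM autocorrelation and the MRP/IVM cross-correlation (the Wiener-Hopf normal equations). The least-squares formulation, the tap count, and the function names are assumptions and are not taken from the claims.

```python
from typing import List, Sequence


def correlate(a: Sequence[float], b: Sequence[float], max_lag: int) -> List[float]:
    """Return c[k] = sum_n a[n] * b[n - k] for k = 0 .. max_lag - 1."""
    return [sum(a[n] * b[n - k] for n in range(k, len(a))) for k in range(max_lag)]


def solve_linear_system(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def estimate_inverse_impulse_response(
    ivm: Sequence[float], mrp: Sequence[float], taps: int
) -> List[float]:
    """Estimate FIR taps g such that (g convolved with ivm) approximates mrp.

    Assumes ivm and mrp are time-aligned recordings of equal length.
    """
    autocorr = correlate(ivm, ivm, taps)    # IVM autocorrelation r[k]
    crosscorr = correlate(mrp, ivm, taps)   # MRP/IVM cross-correlation p[k]
    toeplitz = [[autocorr[abs(i - j)] for j in range(taps)] for i in range(taps)]
    return solve_linear_system(toeplitz, crosscorr)


def fir_filter(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Apply an FIR filter, truncated to the input length."""
    return [
        sum(t * signal[n - k] for k, t in enumerate(taps) if n - k >= 0)
        for n in range(len(signal))
    ]
```

In this sketch the estimated taps would be stored per vehicle and later applied to the IVM signal with `fir_filter`. A production system would more likely use an optimized Toeplitz solver and a frequency-domain implementation.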
20. A method of speech recognition for a plurality of vehicles, comprising the steps of:
(a) developing a corpus of Lombard speech including a plurality of Lombard levels;
(b) building a plurality of global acoustic models based on the corpus of Lombard speech data, wherein each model of the plurality of global acoustic models is distinguished from the other models based on a Lombard level of the plurality of Lombard levels;
(c) receiving speech in a vehicle using an integrated vehicle microphone;
(d) generating an inverse impulse response function for each of the plurality of vehicles, wherein the inverse impulse response function is determined by at least one of first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function, or determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel wherein the IVM is designated as an input and the MRP microphone is designated as an output;
(e) extracting acoustic data from the received speech;
(f) applying the vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data;
(g) pre-processing the normalized acoustic data to extract acoustic feature vectors;
(h) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level of a Lombard speech corpus covering a plurality of vehicles;
(i) calculating the Lombard level of vehicle noise; and
(j) selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (h).
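Steps (i) and (j) can be illustrated with a short Python sketch that maps a measured noise level to a Lombard level and then looks up the matching global acoustic model. The claim does not define how the Lombard level is calculated, so the RMS-in-decibels measure, the threshold values, and the model identifiers below are all hypothetical placeholders.

```python
import bisect
import math
from typing import Dict, List, Sequence

# Hypothetical boundaries between Lombard levels, in dB relative to full scale.
LOMBARD_THRESHOLDS_DB: List[float] = [-40.0, -25.0]

# Hypothetical identifiers for global acoustic models, one per Lombard level.
GLOBAL_ACOUSTIC_MODELS: Dict[int, str] = {
    0: "global_model_lombard_0",
    1: "global_model_lombard_1",
    2: "global_model_lombard_2",
}


def noise_level_db(noise_samples: Sequence[float]) -> float:
    """Mean-square level of the noise samples in dB, floored to avoid log(0)."""
    mean_square = sum(s * s for s in noise_samples) / len(noise_samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def calculate_lombard_level(noise_samples: Sequence[float]) -> int:
    """Step (i): map the noise level onto a discrete Lombard level."""
    return bisect.bisect_right(LOMBARD_THRESHOLDS_DB, noise_level_db(noise_samples))


def select_global_acoustic_model(noise_samples: Sequence[float]) -> str:
    """Step (j): pick the model built for the calculated Lombard level."""
    return GLOBAL_ACOUSTIC_MODELS[calculate_lombard_level(noise_samples)]
```

With these placeholder thresholds, louder cabin noise selects a model built from speech recorded at a higher Lombard level, and the selected model would then be passed to the decoder in step (h).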
Specification