System and method for performing speech recognition in cyclostationary noise environments
Abstract
A system and method for performing speech recognition in cyclostationary noise environments includes a characterization module that may access original cyclostationary noise from an intended operating environment of a speech recognition device. The characterization module may then convert the original cyclostationary noise into target stationary noise which retains characteristics of the original cyclostationary noise. A conversion module may then generate a modified training database by utilizing the target stationary noise to modify an original training database that was prepared for training a recognizer in the speech recognition device. A training module may then train the recognizer with the modified training database to thereby optimize speech recognition procedures in cyclostationary noise environments.
43 Claims
-
1. A system for performing a cyclostationary noise equalization procedure in a speech recognition device, comprising:
a characterization module configured to convert original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data by performing a cyclostationary noise characterization process; and
a conversion module coupled to said characterization module for converting an original training database into a modified training database by incorporating said target stationary noise data into said original training database, said modified training database then being utilized to train a recognizer from said speech recognition device.
where said “k” represents a frequency, said “t” represents a time frame, said “N” represents a total number of time frames, said CS Power is a cyclostationary noise power value from said cyclostationary noise frequency-power distribution, and said Average CS Power is an average cyclostationary power value from said average cyclostationary noise frequency-power distribution.
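The formula that these definitions belong to is not reproduced in this text. Reading the definitions as an arithmetic mean of CS Power over the N time frames at each frequency k (an assumption, not the claim's wording), the averaging step might look like the sketch below. The function name and the `frames[t][k]` layout are illustrative only.

```python
def average_cs_power(cs_power_frames):
    """Average power per frequency bin over all time frames.

    cs_power_frames[t][k] is assumed to hold the cyclostationary noise
    power at time frame t and frequency k (hypothetical layout).
    """
    n_frames = len(cs_power_frames)  # N, the total number of time frames
    if n_frames == 0:
        raise ValueError("need at least one time frame")
    n_bins = len(cs_power_frames[0])
    return [
        sum(frame[k] for frame in cs_power_frames) / n_frames
        for k in range(n_bins)
    ]


# Three time frames (N = 3), two frequency bins.
frames = [[1.0, 4.0], [3.0, 4.0], [2.0, 10.0]]
print(average_cs_power(frames))  # [2.0, 6.0]
```

Averaging across frames removes the periodic time variation and keeps only the long-run spectral shape of the noise.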
-
9. The system of claim 6 wherein said characterization module accesses white noise data that has a uniform power distribution across a given frequency range.
-
10. The system of claim 9 wherein said Fast Fourier Transform of said characterization module converts said white noise data from said time domain to said frequency domain to produce a white noise frequency-power distribution.
-
11. The system of claim 10 wherein said white noise frequency-power distribution includes a series of white noise power values that each correspond to a particular frequency.
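Claims 9 through 11 can be illustrated with a small sketch that generates white noise and converts it into one power value per frequency. A naive discrete Fourier transform stands in for the Fast Fourier Transform to keep the example dependency-free; the Gaussian noise source, the frame length of 64, and all names are assumptions.

```python
import cmath
import random


def dft(samples):
    """Naive discrete Fourier transform, O(n^2); stands in for an FFT."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def power_distribution(samples):
    """One power value per frequency bin: the squared magnitude |X[k]|^2."""
    return [abs(x) ** 2 for x in dft(samples)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
white_noise = [rng.gauss(0.0, 1.0) for _ in range(64)]  # assumed noise source
white_power = power_distribution(white_noise)
print(len(white_power))  # one power value per frequency bin
```

A single random realization is only flat on average, so the individual power values fluctuate from bin to bin. A unit impulse, by contrast, has exactly uniform power across all bins.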
-
12. The system of claim 10 wherein a modulation module of said characterization module utilizes said white noise frequency-power distribution and said average cyclostationary noise frequency-power distribution to generate a target stationary noise frequency-power distribution.
-
13. The system of claim 12 wherein said modulation module modulates said white noise power values of said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution to thereby generate said target stationary noise frequency-power distribution.
-
14. The system of claim 12 wherein said modulation module generates individual target stationary power values of said target stationary noise frequency-power distribution by multiplying individual ones of said white noise power values from said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution on a frequency-by-frequency basis.
-
15. The system of claim 12 wherein said modulation module modulates said white noise frequency-power distribution with said average cyclostationary noise frequency-power distribution in accordance with a following formula:
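The formula referenced in claim 15 is not reproduced in this text. Claim 14 describes the operation as a frequency-by-frequency multiplication of white noise power values with average cyclostationary power values, which could be sketched as follows. The function and argument names are illustrative, not taken from the patent.

```python
def modulate(white_power, avg_cs_power):
    """Multiply the two frequency-power distributions bin by bin."""
    if len(white_power) != len(avg_cs_power):
        raise ValueError("distributions must cover the same frequency bins")
    return [w * c for w, c in zip(white_power, avg_cs_power)]


white_power = [1.0, 2.0, 0.5]   # white noise power per frequency
avg_cs_power = [4.0, 3.0, 2.0]  # average cyclostationary power per frequency
print(modulate(white_power, avg_cs_power))  # [4.0, 6.0, 1.0]
```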
-
16. The system of claim 12 wherein an Inverse Fast Fourier Transform accesses said target stationary noise frequency-power distribution to generate target stationary noise data by converting said target stationary noise frequency-power distribution from said frequency domain to said time domain.
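A power distribution alone carries no phase, so returning to the time domain needs some source of phase information. The sketch below assumes the phases come from the white noise spectrum: each complex white noise bin is scaled by the square root of the average cyclostationary power, so that its power is multiplied by that value, and the result is inverse-transformed. This is one possible reading of claim 16, not a confirmed implementation, and a naive transform pair again stands in for the FFT and Inverse FFT.

```python
import cmath
import math


def dft(samples):
    """Naive forward transform, O(n^2)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse transform, O(n^2)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def shape_noise(white_noise, avg_cs_power):
    """Return time-domain noise whose power per bin is white power * avg_cs_power.

    avg_cs_power must be symmetric (p[k] == p[n - k]), as the power
    spectrum of a real signal is, so that the output stays real-valued.
    """
    spectrum = dft(white_noise)
    shaped = [x * math.sqrt(p) for x, p in zip(spectrum, avg_cs_power)]
    return [value.real for value in idft(shaped)]


noise = [0.5, -1.0, 0.25, 2.0, -0.75, 1.5, 0.0, -0.5]
profile = [1.0, 4.0, 9.0, 0.0, 2.0, 0.0, 9.0, 4.0]  # symmetric example profile
target_noise = shape_noise(noise, profile)
print(len(target_noise))
```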
-
17. The system of claim 16 wherein a conversion module accesses an original training database that was recorded for training said recognizer based upon an intended speech recognition vocabulary of said speech recognition system, said conversion module responsively generating a modified training database by utilizing said target stationary noise data to modify said original training database.
-
18. The system of claim 17 wherein said conversion module adds said target stationary noise data to said original training database to produce said modified training database that then incorporates characteristics of said original cyclostationary noise data to thereby improve performance characteristics of said speech recognition device.
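The addition step in claims 17 and 18 could be sketched as sample-wise mixing of the target stationary noise into every recorded utterance. The dictionary layout of the database, the looping of the noise to match utterance length, and the `gain` parameter are all assumptions; the claims do not specify a mixing level.

```python
from itertools import cycle, islice


def add_noise(utterance, noise, gain=1.0):
    """Add noise to one utterance, repeating the noise if it is shorter."""
    if not noise:
        raise ValueError("noise must contain at least one sample")
    looped = islice(cycle(noise), len(utterance))
    return [s + gain * n for s, n in zip(utterance, looped)]


def modify_training_database(database, noise, gain=1.0):
    """Return a new database with the noise mixed into every utterance."""
    return {
        name: add_noise(samples, noise, gain)
        for name, samples in database.items()
    }


original = {"utterance_1": [1.0, 1.0, 1.0, 1.0, 1.0]}  # hypothetical recording
modified = modify_training_database(original, [0.5, -0.5])
print(modified["utterance_1"])  # [1.5, 0.5, 1.5, 0.5, 1.5]
```

The original database is left untouched, so the same recordings can be remixed with noise characterized from a different operating environment.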
-
19. The system of claim 17 wherein a training module accesses said modified training database to perform a speech recognition training procedure to train said recognizer.
-
20. The system of claim 19 wherein said speech recognition device utilizes said recognizer after said speech recognition training procedure with said modified training database has been completed to thereby optimally perform various speech recognition functions.
-
21. A method for performing a cyclostationary noise equalization procedure in a speech recognition device, comprising the steps of:
converting original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data with a characterization module by performing a cyclostationary noise characterization process;
converting an original training database into a modified training database with a conversion module by incorporating said target stationary noise data into said original training database; and
training a recognizer from said speech recognition device by utilizing said modified training database.
where said “k” represents a frequency, said “t” represents a time frame, said “N” represents a total number of time frames, said CS Power is a cyclostationary noise power value from said cyclostationary noise frequency-power distribution, and said Average CS Power is an average cyclostationary power value from said average cyclostationary noise frequency-power distribution.
-
29. The method of claim 26 wherein said characterization module accesses white noise data that has a uniform power distribution across a given frequency range.
-
30. The method of claim 29 wherein said Fast Fourier Transform of said characterization module converts said white noise data from said time domain to said frequency domain to produce a white noise frequency-power distribution.
-
31. The method of claim 30 wherein said white noise frequency-power distribution includes a series of white noise power values that each correspond to a particular frequency.
-
32. The method of claim 30 wherein a modulation module of said characterization module utilizes said white noise frequency-power distribution and said average cyclostationary noise frequency-power distribution to generate a target stationary noise frequency-power distribution.
-
33. The method of claim 32 wherein said modulation module modulates said white noise power values of said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution to thereby generate said target stationary noise frequency-power distribution.
-
34. The method of claim 32 wherein said modulation module generates individual target stationary power values of said target stationary noise frequency-power distribution by multiplying individual ones of said white noise power values from said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution on a frequency-by-frequency basis.
-
35. The method of claim 32 wherein said modulation module modulates said white noise frequency-power distribution with said average cyclostationary noise frequency-power distribution in accordance with a following formula:
-
36. The method of claim 32 wherein an Inverse Fast Fourier Transform accesses said target stationary noise frequency-power distribution to generate target stationary noise data by converting said target stationary noise frequency-power distribution from said frequency domain to said time domain.
-
37. The method of claim 36 wherein a conversion module accesses an original training database that was recorded for training said recognizer based upon an intended speech recognition vocabulary of said speech recognition system, said conversion module responsively generating a modified training database by utilizing said target stationary noise data to modify said original training database.
-
38. The method of claim 37 wherein said conversion module adds said target stationary noise data to said original training database to produce said modified training database that then incorporates characteristics of said original cyclostationary noise data to thereby improve performance characteristics of said speech recognition device.
-
39. The method of claim 37 wherein a training module accesses said modified training database to perform a speech recognition training procedure to train said recognizer.
-
40. The method of claim 39 wherein said speech recognition device utilizes said recognizer after said speech recognition training procedure with said modified training database has been completed to thereby optimally perform various speech recognition functions.
-
41. An apparatus for performing a cyclostationary noise equalization procedure in a speech recognition device, comprising:
means for converting original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data by performing a cyclostationary noise characterization process;
means for converting an original training database into a modified training database by incorporating said target stationary noise data into said original training database; and
means for training a recognizer from said speech recognition device by utilizing said modified training database.
-
42. A computer-readable medium comprising program instructions for performing a cyclostationary noise equalization procedure in a speech recognition device by performing the steps of:
converting original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data with a characterization module by performing a cyclostationary noise characterization process;
converting an original training database into a modified training database with a conversion module by incorporating said target stationary noise data into said original training database; and
training a recognizer from said speech recognition device by utilizing said modified training database.
-
43. A system for performing a noise equalization procedure in a speech recognition device, comprising:
a characterization module configured to convert original noise data from an operating environment of said speech recognition device into target noise data; and
a conversion module coupled to said characterization module for converting an original training database into a modified training database by incorporating said target noise data into said original training database, said modified training database then being utilized to train a recognizer from said speech recognition device.
-
Specification