System and method for performing speech recognition in cyclostationary noise environments

US 6,785,648 B2
Filed: 05/31/2001
Issued: 08/31/2004
Est. Priority Date: 05/31/2001
Status: Expired due to Fees

Abstract

A system and method for performing speech recognition in cyclostationary noise environments includes a characterization module that may access original cyclostationary noise from an intended operating environment of a speech recognition device. The characterization module may then convert the original cyclostationary noise into target stationary noise which retains characteristics of the original cyclostationary noise. A conversion module may then generate a modified training database by utilizing the target stationary noise to modify an original training database that was prepared for training a recognizer in the speech recognition device. A training module may then train the recognizer with the modified training database to thereby optimize speech recognition procedures in cyclostationary noise environments.
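
One way to read "target stationary noise which retains characteristics of the original cyclostationary noise" is sketched below. It assumes the frame-level power in each frequency bin $k$ repeats exactly every $T$ frames and that the average covers $m$ whole periods. The symbols $T$ and $m$ are introduced here for illustration and do not appear in the patent. Under those assumptions the time average no longer depends on the frame index, while the per-frequency power profile is preserved.

```latex
% Assumption: the power in frequency bin k repeats every T frames.
\mathrm{CSPower}_k(t + T) = \mathrm{CSPower}_k(t)

% Averaging over N = mT frames (m whole periods) removes the time dependence:
\frac{1}{N}\sum_{t=1}^{N} \mathrm{CSPower}_k(t)
  = \frac{1}{mT}\cdot m \sum_{t=1}^{T} \mathrm{CSPower}_k(t)
  = \frac{1}{T}\sum_{t=1}^{T} \mathrm{CSPower}_k(t)
```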

43 Claims

1. A system for performing a cyclostationary noise equalization procedure in a speech recognition device, comprising:
- a characterization module configured to convert original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data by performing a cyclostationary noise characterization process; and
  
  a conversion module coupled to said characterization module for converting an original training database into a modified training database by incorporating said target stationary noise data into said original training database, said modified training database then being utilized to train a recognizer from said speech recognition device.

2. The system of claim 1 wherein said speech recognition device is implemented as part of a robotic device to compensate for cyclostationary noise in said operating environment of said robotic device.
3. The system of claim 1 wherein said original cyclostationary noise data is recorded, digitized, and stored in a memory device for access by said characterization module.
4. The system of claim 1 wherein a Fast Fourier Transform of said characterization module converts said original cyclostationary noise data from a time domain to a frequency domain to produce a cyclostationary noise frequency-power distribution.
5. The system of claim 4 wherein said cyclostationary noise frequency-power distribution includes an array file with groupings of power values that each correspond to a different cyclostationary frequency, and wherein said groupings each correspond to a different time frame.
6. The system of claim 4 wherein an averaging filter accesses said cyclostationary noise frequency-power distribution, and responsively generates an average cyclostationary noise frequency-power distribution.
7. The system of claim 6 wherein said averaging filter calculates an average cyclostationary power value for each frequency of said cyclostationary noise frequency-power distribution across different time frames to thereby produce said average cyclostationary noise frequency-power distribution which characterizes stationary noise characteristics of said original cyclostationary noise data.
8. The system of claim 6 wherein said averaging filter performs an averaging operation according to a following formula:
   $\text{Average CS Power}_{k} = \frac{1}{N} \sum_{t=1}^{N} \text{CS Power}_{k}(t)$
9. The system of claim 6 wherein said characterization module accesses white noise data that has a uniform power distribution across a given frequency range.
10. The system of claim 9 wherein said Fast Fourier Transform of said characterization module converts said white noise data from said time domain to said frequency domain to produce a white noise frequency-power distribution.
11. The system of claim 10 wherein said white noise frequency-power distribution includes a series of white noise power values that each correspond to a particular frequency.
12. The system of claim 10 wherein a modulation module of said characterization module utilizes said white noise frequency-power distribution and said average cyclostationary noise frequency-power distribution to generate a target stationary noise frequency-power distribution.
13. The system of claim 12 wherein said modulation module modulates said white noise power values of said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution to thereby generate said target stationary noise frequency-power distribution.
14. The system of claim 12 wherein said modulation module generates individual target stationary power values of said target stationary noise frequency-power distribution by multiplying individual ones of said white noise power values from said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution on a frequency-by-frequency basis.
15. The system of claim 12 wherein said modulation module modulates said white noise frequency-power distribution with said average cyclostationary noise frequency-power distribution in accordance with a following formula:
16. The system of claim 12 wherein an Inverse Fast Fourier Transform accesses said target stationary noise frequency-power distribution to generate target stationary noise data by converting said target stationary noise frequency-power distribution from said frequency domain to said time domain.
17. The system of claim 16 wherein a conversion module accesses an original training database that was recorded for training said recognizer based upon an intended speech recognition vocabulary of said speech recognition system, said conversion module responsively generating a modified training database by utilizing said target stationary noise data to modify said original training database.
18. The system of claim 17 wherein said conversion module adds said target stationary noise data to said original training database to produce said modified training database that then incorporates characteristics of said original cyclostationary noise data to thereby improve performance characteristics of said speech recognition device.
19. The system of claim 17 wherein a training module accesses said modified training database to perform a speech recognition training procedure to train said recognizer.
20. The system of claim 19 wherein said speech recognition device utilizes said recognizer after said speech recognition training procedure with said modified training database has been completed to thereby optimally perform various speech recognition functions.
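
The characterization steps recited in claims 4 through 16 (FFT, per-frequency averaging across time frames, modulation of a white noise spectrum, inverse FFT) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the patented implementation. The function names, the non-overlapping framing, the frame length of 256, and the choice to keep the white noise phase while scaling its amplitude by the square root of the averaged power are all assumptions made here so that the sketch runs end to end.

```python
import numpy as np


def frame_power(signal, frame_len):
    """Split a 1-D signal into non-overlapping frames (an assumption) and
    return each frame's power spectrum, shape (num_frames, frame_len // 2 + 1)."""
    signal = np.asarray(signal, dtype=float)
    num_frames = len(signal) // frame_len
    frames = signal[: num_frames * frame_len].reshape(num_frames, frame_len)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def average_cs_power(cs_noise, frame_len):
    """Average the power at each frequency bin k across all N time frames."""
    return frame_power(cs_noise, frame_len).mean(axis=0)


def target_stationary_noise(cs_noise, num_samples, frame_len=256, seed=0):
    """Shape white noise with the averaged cyclostationary power spectrum."""
    rng = np.random.default_rng(seed)
    avg_power = average_cs_power(cs_noise, frame_len)
    num_frames = -(-num_samples // frame_len)  # ceiling division
    white = rng.standard_normal((num_frames, frame_len))
    white_spec = np.fft.rfft(white, axis=1)
    # Multiplying power by avg_power is the same as multiplying amplitude
    # by sqrt(avg_power); the white noise phase is kept (assumption).
    target_spec = white_spec * np.sqrt(avg_power)
    target = np.fft.irfft(target_spec, n=frame_len, axis=1)
    return target.reshape(-1)[:num_samples]
```

For example, `target_stationary_noise(recorded_noise, num_samples=16000)` would return 16000 samples of shaped noise, where `recorded_noise` is a hypothetical array holding the digitized environment recording. Because frames are processed independently here, the output may show small discontinuities at frame boundaries; an overlap-add scheme would be one way to smooth them.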

21. A method for performing a cyclostationary noise equalization procedure in a speech recognition device, comprising the steps of:
- converting original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data with a characterization module by performing a cyclostationary noise characterization process;
  
  converting an original training database into a modified training database with a conversion module by incorporating said target stationary noise data into said original training database; and
  
  training a recognizer from said speech recognition device by utilizing said modified training database.

22. The method of claim 21 wherein said speech recognition device is implemented as part of a robotic device to compensate for cyclostationary noise in said operating environment of said robotic device.
23. The method of claim 21 wherein said original cyclostationary noise data is recorded, digitized, and stored in a memory device for access by said characterization module.
24. The method of claim 21 wherein a Fast Fourier Transform of said characterization module converts said original cyclostationary noise data from a time domain to a frequency domain to produce a cyclostationary noise frequency-power distribution.
25. The method of claim 24 wherein said cyclostationary noise frequency-power distribution includes an array file with groupings of power values that each correspond to a different cyclostationary frequency, and wherein said groupings each correspond to a different time frame.
26. The method of claim 24 wherein an averaging filter accesses said cyclostationary noise frequency-power distribution, and responsively generates an average cyclostationary noise frequency-power distribution.
27. The method of claim 26 wherein said averaging filter calculates an average cyclostationary power value for each frequency of said cyclostationary noise frequency-power distribution across different time frames to thereby produce said average cyclostationary noise frequency-power distribution which characterizes stationary noise characteristics of said original cyclostationary noise data.
28. The method of claim 26 wherein said averaging filter performs an averaging operation according to a following formula:
   $\text{Average CS Power}_{k} = \frac{1}{N} \sum_{t=1}^{N} \text{CS Power}_{k}(t)$
29. The method of claim 26 wherein said characterization module accesses white noise data that has a uniform power distribution across a given frequency range.
30. The method of claim 29 wherein said Fast Fourier Transform of said characterization module converts said white noise data from said time domain to said frequency domain to produce a white noise frequency-power distribution.
31. The method of claim 30 wherein said white noise frequency-power distribution includes a series of white noise power values that each correspond to a particular frequency.
32. The method of claim 30 wherein a modulation module of said characterization module utilizes said white noise frequency-power distribution and said average cyclostationary noise frequency-power distribution to generate a target stationary noise frequency-power distribution.
33. The method of claim 32 wherein said modulation module modulates said white noise power values of said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution to thereby generate said target stationary noise frequency-power distribution.
34. The method of claim 32 wherein said modulation module generates individual target stationary power values of said target stationary noise frequency-power distribution by multiplying individual ones of said white noise power values from said white noise frequency-power distribution with corresponding ones of said cyclostationary power values from said average cyclostationary noise frequency-power distribution on a frequency-by-frequency basis.
35. The method of claim 32 wherein said modulation module modulates said white noise frequency-power distribution with said average cyclostationary noise frequency-power distribution in accordance with a following formula:
36. The method of claim 32 wherein an Inverse Fast Fourier Transform accesses said target stationary noise frequency-power distribution to generate target stationary noise data by converting said target stationary noise frequency-power distribution from said frequency domain to said time domain.
37. The method of claim 36 wherein a conversion module accesses an original training database that was recorded for training said recognizer based upon an intended speech recognition vocabulary of said speech recognition system, said conversion module responsively generating a modified training database by utilizing said target stationary noise data to modify said original training database.
38. The method of claim 37 wherein said conversion module adds said target stationary noise data to said original training database to produce said modified training database that then incorporates characteristics of said original cyclostationary noise data to thereby improve performance characteristics of said speech recognition device.
39. The method of claim 37 wherein a training module accesses said modified training database to perform a speech recognition training procedure to train said recognizer.
40. The method of claim 39 wherein said speech recognition device utilizes said recognizer after said speech recognition training procedure with said modified training database has been completed to thereby optimally perform various speech recognition functions.
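
The conversion step of claims 37 and 38, in which the target stationary noise is added to the original training database, might look roughly like the sketch below. The database layout (a dictionary mapping each vocabulary label to a list of waveforms), the function names, the `gain` parameter, and the tiling or cropping of the noise to match each utterance length are hypothetical choices made for illustration; the patent text above does not specify them.

```python
import numpy as np


def add_noise_to_utterance(utterance, noise, gain=1.0):
    """Add noise to one utterance, tiling or cropping the noise to fit.

    Assumes `noise` is non-empty; `gain` is a hypothetical mixing level.
    """
    utterance = np.asarray(utterance, dtype=float)
    noise = np.asarray(noise, dtype=float)
    repeats = -(-len(utterance) // len(noise))  # ceiling division
    fitted = np.tile(noise, repeats)[: len(utterance)]
    return utterance + gain * fitted


def convert_training_database(database, noise, gain=1.0):
    """Return a modified copy of the database; labels are left unchanged."""
    return {
        label: [add_noise_to_utterance(u, noise, gain) for u in utterances]
        for label, utterances in database.items()
    }
```

The original database is not modified in place, so a recognizer could in principle be trained on either version and the two compared.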

41. An apparatus for performing a cyclostationary noise equalization procedure in a speech recognition device, comprising:
- means for converting original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data by performing a cyclostationary noise characterization process;
  
  means for converting an original training database into a modified training database by incorporating said target stationary noise data into said original training database; and
  
  means for training a recognizer from said speech recognition device by utilizing said modified training database.

42. A computer-readable medium comprising program instructions for performing a cyclostationary noise equalization procedure in a speech recognition device by performing the steps of:
- converting original cyclostationary noise data from an operating environment of said speech recognition device into target stationary noise data with a characterization module by performing a cyclostationary noise characterization process;
  
  converting an original training database into a modified training database with a conversion module by incorporating said target stationary noise data into said original training database; and
  
  training a recognizer from said speech recognition device by utilizing said modified training database.

43. A system for performing a noise equalization procedure in a speech recognition device, comprising:
- a characterization module configured to convert original noise data from an operating environment of said speech recognition device into target noise data; and
  
  a conversion module coupled to said characterization module for converting an original training database into a modified training database by incorporating said target noise data into said original training database, said modified training database then being utilized to train a recognizer from said speech recognition device.

Current Assignee: Sony Corporation (Sony Group Corp.), Sony Electronics Inc. (Sony Group Corp.)
Original Assignee: Sony Corporation (Sony Group Corp.), Sony Electronics Inc. (Sony Group Corp.)
Inventors: Abrego, Gustavo Hernandez; Menendez-Pidal, Xavier
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US09/872,196
Publication Number: US 20020188444A1
Time in Patent Office: 1,188 Days
Field of Search: 704/233, 704/234, 704/226, 704/237, 704/224, 704/270, 704/275
US Class Current: 704/233
CPC Class Codes: G10L 15/20 Speech recognition techniqu...
