Noise reduction systems and methods for voice applications

US 7,519,186 B2
Filed: 04/25/2003
Issued: 04/14/2009
Est. Priority Date: 04/25/2003
Status: Expired due to Fees

Abstract

Various embodiments reduce noise within a particular environment, while isolating and capturing speech in a manner that allows operation within an otherwise noisy environment. In one embodiment, an array of one or more microphones is used to selectively eliminate noise emanating from known, generally fixed locations, and pass signals from a pre-specified region or regions with reduced distortion.

110 Claims

1. A method comprising:
- providing a computing device having a housing and an array of microphones comprising two or more microphones, wherein at least one of the microphones is mounted inside the housing and at least one of the microphones is mounted outside the housing; and
  
  using the microphone array, training the device to recognize noise from known locations by equipping the device with a filter system that can filter noise from the known locations, wherein said training is accomplished using multiple training phases that are initiated by a user, including a speech-capturing phase in which the user speaks from one or more of the known locations and in which said speech is captured by said two or more microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the computing device and said noise is captured by said two or more microphones, wherein the training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method of claim 1, wherein the device comprises a keyboard.
  - 3. The method of claim 1, wherein the device comprises a game controller.
  - 4. The method of claim 1, wherein the device comprises a laptop computer.
  - 5. The method of claim 1, wherein at least some of the known locations are fixed relative to the microphone array.
  - 6. The method of claim 1, wherein at least some of the known locations are located on the device itself.
  - 7. The method of claim 1, wherein at least some of the known locations are not located on the device itself.
  - 8. The method of claim 1, wherein:
    - at least some of the known locations are located on the device itself; and
      
      at least some of the known locations are not located on the device itself.
  - 9. The method of claim 1, wherein the microphone array does not comprise a headset-mounted microphone.
  - 10. The method of claim 1, wherein the microphone array comprises one or more headset-mounted microphones.
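
Claim 1 recites two user-initiated training phases that yield a desired speech profile and a desired noise profile. The claim does not say how a profile is represented. As a rough sketch only, and assuming each profile is a spatial correlation matrix averaged over the frames captured by the microphones, the two phases might look like the following. The function name `correlation_profile` and the sample captures are hypothetical and do not come from the patent.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def correlation_profile(frames: Sequence[Sequence[float]]) -> Matrix:
    """Average the outer product x * x^T over all captured frames.

    Each frame holds one sample per microphone (an assumption for this sketch).
    """
    if not frames:
        raise ValueError("at least one captured frame is required")
    n = len(frames[0])
    acc = [[0.0] * n for _ in range(n)]
    for x in frames:
        if len(x) != n:
            raise ValueError("every frame needs one sample per microphone")
        for i in range(n):
            for j in range(n):
                acc[i][j] += x[i] * x[j]
    return [[value / len(frames) for value in row] for row in acc]


# Hypothetical two-microphone captures.
# Speech-capturing phase: the talker reaches both microphones in phase.
speech_frames = [[1.0, 1.0], [-1.0, -1.0]]
# Noise-capturing phase: a button click reaches the microphones out of phase.
noise_frames = [[1.0, -1.0], [-1.0, 1.0]]

speech_profile = correlation_profile(speech_frames)
noise_profile = correlation_profile(noise_frames)
```

In this toy setup the off-diagonal terms of the two profiles have opposite signs, which is the kind of spatial difference a filter could exploit to separate speech from button clicks.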

11. A method comprising:
- providing a computing device having a housing and an array of microphones comprising two or more microphones, wherein at least one of the microphones is mounted inside the housing and at least one of the microphones is mounted outside the housing; and
  
  using the microphone array, training the device to recognize noise from particular known locations and sources by equipping the device with a filter system that can filter noise from the particular known locations and sources, wherein said training is accomplished using multiple training phases that are initiated by a user, including a speech-capturing phase in which the user speaks from one or more of the known locations and in which said speech is captured by said array of microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the computing device and said noise is captured by said array of microphones, wherein the training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25):
  - 12. The method of claim 11, wherein the device comprises a keyboard.
  - 13. The method of claim 11, wherein the device comprises a keyboard, and at least some of the sources comprise keys on the keyboard.
  - 14. The method of claim 11, wherein the device comprises a game controller.
  - 15. The method of claim 11, wherein the device comprises a game controller, and at least some of the sources comprise buttons on the controller.
  - 16. The method of claim 11, wherein the device comprises a laptop computer.
  - 17. The method of claim 11, wherein the device comprises a laptop computer, and at least some of the sources comprise keys on the laptop computer.
  - 18. The method of claim 11, wherein at least some of the known sources are fixed relative to the microphone array.
  - 19. The method of claim 11, wherein at least some of the known sources are fixed relative to the microphone array, and at least one source comprises a button.
  - 20. The method of claim 11, wherein at least some of the known sources are located on the device itself.
  - 21. The method of claim 11, wherein at least some of the known sources are not located on the device itself.
  - 22. The method of claim 11, wherein:
    - at least some of the known sources are located on the device itself; and
      
      at least some of the known sources are not located on the device itself.
  - 23. The method of claim 11, wherein the microphone array does not comprise a headset-mounted microphone.
  - 24. The method of claim 11, wherein the microphone array comprises one or more headset-mounted microphones.
  - 25. The method of claim 11, wherein said training comprises equipping the device with filters associated with individual sources of noise.

26. A method comprising:
- providing a game controller having an array of microphones comprising one or more microphones;
  
  using the microphone array, training the game controller to recognize audio signals from particular known locations and sources by equipping the game controller with a filter system that can (a) filter noise from particular known locations and sources, and (b) pass signals associated with desired speech from particular locations, wherein said training is accomplished using multiple training phases that are initiated by a user, including a speech-capturing phase in which the user speaks from one or more of the known locations and in which said speech is captured by said array of microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the game controller and said noise is captured by said array of microphones, wherein the training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (27, 28, 29, 30, 31, 32, 33, 34, 35):
  - 27. The method of claim 26, wherein at least some of the known locations are fixed relative to the microphone array.
  - 28. The method of claim 26, wherein at least some of the known locations are located on the game controller itself.
  - 29. The method of claim 26, wherein at least some of the known locations are not located on the game controller itself.
  - 30. The method of claim 26, wherein:
    - at least some of the known locations are located on the game controller itself; and
      
      at least some of the known locations are not located on the game controller itself.
  - 31. The method of claim 26, wherein the noise that the filter system is designed to filter comprises noise associated with button clicks on the game controller.
  - 32. The method of claim 26, wherein the noise that the filter system is designed to filter comprises undesired speech that emanates from particular locations relative to the game controller.
  - 33. The method of claim 26, wherein said training comprises equipping the game controller with at least some filters that are associated with individual sources of noise.
  - 34. The method of claim 26, wherein the microphone array does not comprise a headset-mounted microphone.
  - 35. The method of claim 26, wherein the microphone array comprises one or more headset-mounted microphones.

36. A method comprising:
- providing a user-engagable input device comprising a housing that supports an array of microphones, at least one of the microphones being mounted inside of the housing, wherein the user-engagable input device comprises a game controller;
  
  using the microphone array, training the device to recognize noise from known locations, wherein said training is accomplished using multiple training phases that are initiated by a user, including a noise capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the user-engagable input device and audio signals associated with the noise are captured by the array of microphones, and a speech-capturing phase in which the user speaks from one or more of the known locations and audio signals associated with the speech are captured by the array of microphones;
  
  correlation processing the audio signals associated with the noise and constructing one or more filter components as a function of the processed audio signals;
  
  correlation processing the audio signals associated with the speech and constructing one or more filter components as a function of the processed audio speech signals; and
  
  incorporating a filter system comprising the filter components into one or more user-engagable input devices.
- Dependent claims (37, 38, 39):
  - 37. The method of claim 36, wherein said filter system comprises one or more spatial filters computed as generalized Wiener filters having the form:
    - $w_{opt} = (R_{ss} + \beta R_{nn})^{-1} E\{ds\}$, where $R_{ss}$ is the correlation matrix for a desired speech signal, $R_{nn}$ is the correlation matrix for the noise component, $\beta$ is a weighting parameter for the noise component, and $E\{ds\}$ is the expected value of the product of the desired signal $d$ and the actual signal $s$ that is received by a microphone.
  - 38. The method of claim 36, wherein at least some sources and locations of noise are known in advance.
  - 39. The method of claim 36, wherein at least some locations of the speech are known in advance.
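
The generalized Wiener filter of claim 37 amounts to solving the linear system (Rss + β Rnn) w = E{ds} for the weight vector w. The following is a minimal sketch of that computation, assuming small real-valued matrices and a plain Gaussian elimination solver. The helper names `solve` and `wiener_weights` and the numeric values are illustrative assumptions, not part of the patent.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def wiener_weights(r_ss: Matrix, r_nn: Matrix, beta: float, p: Vector) -> Vector:
    """Return w = (R_ss + beta * R_nn)^-1 * p, where p stands for E{ds}."""
    n = len(p)
    a = [[r_ss[i][j] + beta * r_nn[i][j] for j in range(n)] for i in range(n)]
    return solve(a, p)


# Hypothetical two-microphone example: speech is in phase at both
# microphones, button noise is out of phase.
r_ss = [[1.0, 1.0], [1.0, 1.0]]
r_nn = [[1.0, -1.0], [-1.0, 1.0]]
p = [1.0, 1.0]  # assumed cross-correlation between desired signal and inputs
w = wiener_weights(r_ss, r_nn, 1.0, p)

speech_out = w[0] * 1.0 + w[1] * 1.0     # in-phase speech sample
noise_out = w[0] * 1.0 + w[1] * (-1.0)   # out-of-phase noise sample
```

With these assumed inputs the weights average the two microphones, so the in-phase speech passes while the out-of-phase noise cancels. Raising β puts more emphasis on suppressing the noise component.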

40. A method comprising:
- providing a computing device having a housing and an array of microphones comprising two or more microphones, the computing device comprising a trained filter system configured to recognize noise from particular known locations relative to the computing device, wherein at least one of the microphones is mounted inside of the housing and at least one of the microphones is mounted outside of the housing, wherein the trained filter system is trained using multiple training phases that are initiated by a user including a speech-capturing phase in which the user speaks from one or more of the particular known locations and in which said speech is captured by said two or more microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the computing device, wherein the training enables the filter system to create a desired speech profile and a desired noise profile;
  
  capturing audio signals using the microphone array;
  
  filtering noise from the captured audio signals using the trained filter system.
- Dependent claims (41, 42, 43, 44, 45, 46, 47, 48, 49, 50):
  - 41. The method of claim 40, wherein the device comprises a keyboard.
  - 42. The method of claim 40, wherein the device comprises a game controller.
  - 43. The method of claim 40, wherein the device comprises a laptop computer.
  - 44. The method of claim 40, wherein at least some of the known locations are fixed relative to the microphone array.
  - 45. The method of claim 40, wherein at least some of the known locations are located on the device itself.
  - 46. The method of claim 40, wherein at least some of the known locations are not located on the device itself.
  - 47. The method of claim 40, wherein:
    - at least some of the known locations are located on the device itself; and
      
      at least some of the known locations are not located on the device itself.
  - 48. The method of claim 40 further comprising after said filtering, attempting to remove noise from a filtered signal as a function of a ratio of a signal energy before filtering to a signal energy after filtering.
  - 49. The method of claim 40, wherein the microphone array does not comprise a headset-mounted microphone.
  - 50. The method of claim 40, wherein the microphone array comprises one or more headset-mounted microphones.
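
Claim 48 adds a step after filtering that depends on the ratio of signal energy before filtering to signal energy after filtering, but it does not give the function. One possible reading, offered only as an assumption, is that a frame which lost most of its energy in the filter was mostly noise and can be attenuated further. The sketch below uses the inverse ratio (after over before) as a gain, clamped between a floor and 1. The names `energy_ratio_gain` and `post_filter` and the `floor` parameter are hypothetical.

```python
from typing import List, Sequence


def energy(samples: Sequence[float]) -> float:
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


def energy_ratio_gain(before: Sequence[float], after: Sequence[float],
                      floor: float = 0.1) -> float:
    """Gain derived from the energy after filtering relative to before.

    Assumption of this sketch: the less energy survives the filter, the
    more the frame is attenuated, but never below `floor`.
    """
    e_before = energy(before)
    if e_before == 0.0:
        return 1.0  # silent input, nothing to attenuate
    return max(floor, min(1.0, energy(after) / e_before))


def post_filter(before: Sequence[float], after: Sequence[float],
                floor: float = 0.1) -> List[float]:
    """Scale the filtered frame by the energy-ratio gain."""
    gain = energy_ratio_gain(before, after, floor)
    return [gain * x for x in after]


# Hypothetical frame: the filter removed three quarters of the energy.
unfiltered = [2.0, -2.0]
filtered = [1.0, -1.0]
cleaned = post_filter(unfiltered, filtered)
```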

51. A method comprising:
- providing a computing device having an array of microphones comprising one or more microphones, the computing device comprising a trained filter system configured to recognize noise from particular known locations and sources, wherein the trained filter system is trained using multiple training phases initiated by a user including a speech-capturing phase in which the user speaks from one or more of the particular known locations and in which said speech is captured by said one or more microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the computing device and said noise is captured by said one or more microphones, wherein the training enables the filter system to create a desired speech profile and a desired noise profile;
  
  coupling the computing device in communication with another computing device via a network;
  
  capturing audio signals using the microphone array; and
  
  filtering noise from the captured audio signals using the trained filter system such that the filtered noise is not transmitted to the other computing device.
- Dependent claims (52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65):
  - 52. The method of claim 51, wherein the device comprises a keyboard.
  - 53. The method of claim 51, wherein the device comprises a keyboard, and at least some of the sources comprise keys on the keyboard.
  - 54. The method of claim 51, wherein the device comprises a game controller.
  - 55. The method of claim 51, wherein the device comprises a game controller, and at least some of the sources comprise buttons on the controller.
  - 56. The method of claim 51, wherein the device comprises a laptop computer.
  - 57. The method of claim 51, wherein the device comprises a laptop computer, and at least some of the sources comprise keys on the laptop computer.
  - 58. The method of claim 51, wherein at least some of the known sources are fixed relative to the microphone array.
  - 59. The method of claim 51, wherein at least some of the known sources are fixed relative to the microphone array, and at least one source comprises a button.
  - 60. The method of claim 51, wherein at least some of the known sources are located on the device itself.
  - 61. The method of claim 51, wherein at least some of the known sources are not located on the device itself.
  - 62. The method of claim 51, wherein:
    - at least some of the known sources are located on the device itself; and
      
      at least some of the known sources are not located on the device itself.
  - 63. The method of claim 51 further comprising after said filtering, attempting to remove noise from a filtered signal as a function of a ratio of a signal energy before filtering to a signal energy after filtering.
  - 64. The method of claim 51, wherein the microphone array does not comprise a headset-mounted microphone.
  - 65. The method of claim 51, wherein said filter system comprises one or more filters that are associated with individual sources of noise, and wherein said filtering comprises detecting whether an individual noise source has been engaged by a user and responsively selecting a filter associated with the engaged noise source to filter noise produced by the engaged noise source.
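
Claim 65 describes filters tied to individual noise sources, with the filter chosen when the corresponding source is engaged by the user. A minimal sketch of that selection step follows, assuming the device reports which button (if any) is currently pressed and that each filter is a list of per-microphone weights. The names `select_filter` and `apply_filter`, the button identifiers, and the weight values are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def select_filter(filters: Dict[str, List[float]], default: List[float],
                  engaged: Optional[str]) -> List[float]:
    """Pick the filter for the engaged noise source, else the default."""
    if engaged is None:
        return default
    return filters.get(engaged, default)


def apply_filter(weights: Sequence[float], frame: Sequence[float]) -> float:
    """Weighted sum of one sample per microphone."""
    return sum(w * x for w, x in zip(weights, frame))


# Hypothetical filter bank for a two-microphone controller.
filters = {
    "button_a": [0.5, 0.5],  # averages the mics, cancelling an out-of-phase click
}
default_filter = [1.0, 0.0]  # no button pressed: pass the first microphone

# Speech sample 1.0 at both microphones plus a click of +0.5 and -0.5.
frame = [1.5, 0.5]

with_click_filter = apply_filter(
    select_filter(filters, default_filter, "button_a"), frame)
without_selection = apply_filter(
    select_filter(filters, default_filter, None), frame)
```

In this toy case the button-specific filter recovers the speech sample, while the default filter lets the click through.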

66. A method comprising:
- providing a game controller having an array of microphones comprising one or more microphones, the game controller comprising a trained filter system configured to recognize audio signals from particular known locations and sources, wherein the filter system has been trained using multiple training phases that are initiated by a user, including a speech-capturing phase in which the user speaks from one or more of the particular known locations and in which said speech is captured by said array of microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the game controller and said noise is captured by said array of microphones, wherein the training enables the filter system to create a desired speech profile and a desired noise profile;
  
  coupling the game controller in communication with another game controller via a network;
  
  capturing audio signals using the microphone array;
  
  filtering the captured signals using the trained filter system effective to (a) filter noise from particular locations and sources, and (b) pass signals associated with desired speech from particular locations, wherein the filtered noise is not communicated to the other game controller.
- Dependent claims (67, 68, 69, 70, 71, 72, 73, 74):
  - 67. The method of claim 66, wherein at least some of the known locations are fixed relative to the microphone array.
  - 68. The method of claim 66, wherein at least some of the known locations are located on the game controller itself.
  - 69. The method of claim 66, wherein at least some of the known locations are not located on the game controller itself.
  - 70. The method of claim 66, wherein:
    - at least some of the known locations are located on the game controller itself; and
      
      at least some of the known locations are not located on the game controller itself.
  - 71. The method of claim 66, wherein the noise that the filter system is designed to filter comprises noise associated with button clicks on the game controller.
  - 72. The method of claim 66, wherein the noise that the filter system is designed to filter comprises undesired speech that emanates from particular locations relative to the game controller.
  - 73. The method of claim 66 further comprising after said filtering, attempting to remove noise from a filtered signal as a function of a ratio of a signal energy before filtering to a signal energy after filtering.
  - 74. The method of claim 66, wherein the microphone array does not comprise a headset-mounted microphone.

75. A method comprising:
- providing a user-engagable input device comprising a housing that supports an array of microphones, at least one of the microphones being mounted inside of the housing, and wherein at least one of the microphones is mounted outside of the housing;
  
  capturing audio signals associated with the environment in which the user-engagable input device is used, wherein the audio signals can comprise both noise and desired speech;
  
  filtering the captured audio signals using a trained filter system that is configured to recognize noise and desired speech, wherein the filter system is trained using multiple training phases that are initiated by a user, including a speech-capturing phase in which the user speaks from one or more known locations and in which said speech is captured by said array of microphones, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the user-engagable input device and said noise is captured by said array of microphones, wherein the training enables the filter system to create a desired speech profile and a desired noise profile, the filter system comprising multiple filters computed as generalized Wiener filters having the form;
  
  $w_{opt} = (R_{ss} + \beta R_{nn})^{-1} E\{ds\}$, where $R_{ss}$ is the correlation matrix for a desired speech signal, $R_{nn}$ is the correlation matrix for the noise component, $\beta$ is a weighting parameter for the noise component, and $E\{ds\}$ is the expected value of the product of the desired signal $d$ and the actual signal $s$ that is received by a microphone.
- Dependent claims (76, 77, 78, 79, 80):
  - 76. The method of claim 75, wherein the user-engagable input device comprises a game controller.
  - 77. The method of claim 75, wherein at least some sources and locations of noise are known in advance.
  - 78. The method of claim 75, wherein at least some locations of the desired speech are known in advance.
  - 79. The method of claim 75, wherein the filter system is configured to adaptively filter audio signals.
  - 80. The method of claim 75 further comprising after said filtering, attempting to remove noise from a filtered signal as a function of a ratio of a signal energy before filtering to a signal energy after filtering.

81. A system comprising:
- a housing;
  
  one or more user input mechanisms supported by the housing;
  
  a processor;
  
  a computer-readable media;
  
  a microphone array at least some of which supported by the housing and comprising two or more microphones, wherein at least one of the microphones is mounted inside the housing and at least one of the microphones is mounted outside the housing;
  
  a noise reduction component comprising a filter system embodied on the computer-readable media, the filter system being trained to recognize noise from particular known locations; and
  
  the noise reduction component being configured to cause the processor to use the trained filter system to filter noise, from said known locations, from audio signals captured by the microphone array, wherein the trained filter system is trained using multiple training phases that are initiated by a user, including a speech-capturing phase in which the user speaks from one or more of said known locations and in which said speech is captured by the microphone array, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the housing and said noise is captured by the microphone array, wherein the training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (82, 83, 84, 85, 86, 87, 88):
  - 82. The system of claim 81, wherein the filter system is trained to recognize noise from locations that are fixed relative to the microphone array.
  - 83. The system of claim 81, wherein the filter system is trained to recognize noise from locations that are fixed on the housing.
  - 84. The system of claim 81, wherein the filter system is trained to recognize noise from locations that are not fixed relative to the microphone array.
  - 85. The system of claim 81, wherein the filter system is trained to recognize noise from locations that are both fixed relative to the microphone array, and not fixed relative to the microphone array.
  - 86. The system of claim 81, wherein the processor is supported within the housing.
  - 87. The system of claim 81, wherein the computer readable media is supported within the housing.
  - 88. The system of claim 81, wherein the processor and the computer-readable media are supported within the housing.

89. A system comprising:
- a housing;
  
  one or more user input mechanisms supported by the housing;
  
  a processor;
  
  a computer-readable media;
  
  a microphone array comprising one or more microphones;
  
  a noise reduction component comprising a filter system embodied on the computer-readable media, the filter system being trained to recognize noise from particular known locations and sources; and
  
  the noise reduction component being configured to cause the processor to use the trained filter system to filter noise, from said known locations and sources, from audio signals captured by the microphone array, wherein the system is configured to communicate with another system, and wherein the filtered noise is not transmitted to the other system, wherein the trained filter system is trained using multiple phases that are initiated by a user including a speech-capturing phase in which the user speaks from one or more of the known locations and said speech is captured by the microphone array, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the housing and said noise is captured by the microphone array, wherein the training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (90, 91, 92, 93, 94):
  - 90. The system of claim 89, wherein at least some of the sources are fixed relative to the microphone array.
  - 91. The system of claim 89, wherein at least some of the sources are located on the housing.
  - 92. The system of claim 89, wherein at least some of the sources are not located on the housing.
  - 93. The system of claim 89, wherein at least some of the sources are not located on the housing, and at least one source that is not on the housing comprises speech.
  - 94. The system of claim 89, wherein at least some of the sources are located on the housing, and at least some of the sources are not located on the housing.

95. A system comprising:
- a housing;
  
  one or more user input mechanisms supported by the housing;
  
  a processor;
  
  a computer-readable media;
  
  a microphone array comprising two or more microphones, at least one of the microphones being mounted within the housing and at least one of the microphones being mounted outside the housing;
  
  a noise reduction component comprising a filter system embodied on the computer-readable media, the filter system being trained to recognize audio signals from particular known sources and locations; and
  
  the noise reduction component being configured to cause the processor to use the trained filter system to (a) filter noise, from said known sources and locations, from audio signals captured by the microphone array, and (b) pass signals associated with desired speech from particular locations, wherein the trained filter system is trained using multiple training phases that are initiated by a user including a speech-capturing phase in which a user speaks from one or more of the known locations, and a noise-training phase in which the user produces button clicking noise by physically manipulating one or more buttons on the housing and said noise is captured by the microphone array, wherein said training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (96, 97, 98, 99, 100, 101):
  - 96. The system of claim 95, wherein the filter system is trained to recognize noise from locations that are fixed relative to the microphone array.
  - 97. The system of claim 95, wherein the filter system is trained to recognize noise from locations that are fixed on the housing.
  - 98. The system of claim 95, wherein the filter system is trained to recognize noise from locations that are not fixed relative to the microphone array.
  - 99. The system of claim 95, wherein the filter system is trained to recognize noise from locations that are not fixed relative to the microphone array, and at least some of the noise from locations that are not fixed relative to the microphone array comprises speech.
  - 100. The system of claim 95, wherein the filter system is trained to recognize noise that emanates from one or more of the user input mechanisms.
  - 101. The system of claim 95, wherein the filter system is trained to recognize noise from sources mounted on and contained within the housing.

102. A noise reduction component comprising:
- a transform component configured to transform audio samples from a microphone array of a game controller from the time domain into the frequency domain;
  
  a filter system associated with the transform component and configured to filter frequency samples produced by the transform component, the filter system comprising multiple filters each of which being associated with a frequency bin, individual filters comprising a generalized Wiener filter having the form;
  
  $w_{opt} = (R_{ss} + \beta R_{nn})^{-1} E\{ds\}$, where $R_{ss}$ is the correlation matrix for a desired speech signal, $R_{nn}$ is the correlation matrix for a noise component, $\beta$ is a weighting parameter for the noise component, and $E\{ds\}$ is the expected value of the product of the desired signal $d$ and the actual signal $s$ that is received by a microphone, wherein the filter system is trained using multiple training phases that are initiated by a user including a speech-capturing phase in which the user speaks from one or more known locations and in which said speech is captured by the microphone array, and a noise-capturing phase in which the user produces button clicking noise by physically manipulating one or more buttons on the game controller and said noise is captured by the microphone array, wherein the training enables the filter system to create a desired speech profile and a desired noise profile.
- Dependent claims (103, 104, 105, 106, 107, 108, 109, 110):
  - 103. The noise reduction component of claim 102, wherein the transform component comprises a Modulated Complex Lapped Transform (MCLT).
  - 104. The noise reduction component of claim 102, wherein at least some sources and locations of noise are known in advance.
  - 105. The noise reduction component of claim 102, wherein at least some locations of the desired speech are known in advance.
  - 106. The noise reduction component of claim 102, wherein at least some sources and locations of noise are known in advance, and at least some locations of the desired speech are known in advance.
  - 107. The noise reduction component of claim 102, wherein the filter system is configured to adaptively filter audio signals.
  - 108. A device embodying the noise reduction component of claim 102.
  - 109. A game controller embodying the noise reduction component of claim 102.
  - 110. The noise reduction component of claim 102 further comprising an energy ratio component configured to receive a filtered output from the filter system and process the filtered output to attempt to further remove noise from the signal as a function of the energy of the samples before filtering by the filter system and the energy of the samples after filtering by the filter system.
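
Claim 102 combines a transform component with one filter per frequency bin. The sketch below illustrates that data flow under two stated assumptions: it substitutes a plain discrete Fourier transform for the MCLT named in claim 103, and it assumes each bin's filter is a vector of per-microphone weights applied as a conjugate-weighted sum. The function names `dft` and `filter_bins` and the sample frames are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(frame: Sequence[float]) -> List[complex]:
    """Plain discrete Fourier transform (a stand-in for the MCLT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def filter_bins(mic_frames: Sequence[Sequence[float]],
                weights: Sequence[Sequence[complex]]) -> List[complex]:
    """Apply one weight vector per frequency bin across all microphones.

    weights[k][m] is the weight for bin k and microphone m.
    """
    spectra = [dft(frame) for frame in mic_frames]
    n_bins = len(spectra[0])
    n_mics = len(spectra)
    return [
        sum(complex(weights[k][m]).conjugate() * spectra[m][k]
            for m in range(n_mics))
        for k in range(n_bins)
    ]


# Hypothetical four-sample frames from two microphones.
in_phase = [[1.0, 0.0, -1.0, 0.0], [1.0, 0.0, -1.0, 0.0]]       # speech-like
out_of_phase = [[1.0, 0.0, -1.0, 0.0], [-1.0, 0.0, 1.0, 0.0]]   # click-like

# The same averaging weights in every bin, purely for illustration.
weights = [[0.5, 0.5] for _ in range(4)]

speech_bins = filter_bins(in_phase, weights)
noise_bins = filter_bins(out_of_phase, weights)
```

A trained system would hold different weights in each bin. Using identical weights here only keeps the example easy to check by hand.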

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Florencio, Dinei; Varma, Ankur
Primary Examiner(s): Mei, Xu
Assistant Examiner(s): Suthers, Douglas John

Application Number: US10/423,287
Publication Number: US 20040213419A1
Time in Patent Office: 2,181 Days
Field of Search: 381/110, 381/94.1-94.3, 381/56-57, 381/92, 381/71.12-71.13, 381/94.7, 704/275, 704/270, 704/272
US Class Current: 381/94.7
CPC Class Codes

A63F 2300/1081   Input via voice recognition

G10L 2021/02087   the noise being separate sp...

G10L 2021/02166   Microphone arrays; Beamforming

G10L 21/0208   Noise filtering

H04R 3/005   for combining the signals o...
