Robust separation of speech signals in a noisy environment

US 20070021958A1
Filed: 07/22/2005
Published: 01/25/2007
Est. Priority Date: 07/22/2005
Status: Active Grant

Abstract

A method for improving the quality of a speech signal extracted from a noisy acoustic environment is provided. In one approach, a signal separation process is associated with a voice activity detector. The voice activity detector is a two-channel detector, which enables a particularly robust and accurate detection of voice activity. When speech is detected, the voice activity detector generates a control signal. The control signal is used to activate, adjust, or control signal separation processes or post-processing operations to improve the quality of the resulting speech signal. In another approach, a signal separation process is provided as a learning stage and an output stage. The learning stage aggressively adjusts to current acoustic conditions, and passes coefficients to the output stage. The output stage adapts more slowly, and generates a speech-content signal and a noise-dominant signal. When the learning stage becomes unstable, only the learning stage is reset, allowing the output stage to continue outputting a high-quality speech signal.

23 Claims

1. A method for improving a speech signal using a voice activity detector, comprising:
- receiving a first signal;
  
  receiving a second signal;
  
  comparing the energy level in the first signal to the energy level in the second signal;
  
  determining that voice activity is present when the energy level of the first signal is higher than the energy level of the second signal;
  
  generating a control signal responsive to determining that voice activity is present; and
  
  controlling a speech enhancement process using the control signal.
- Dependent claims (2 to 15):
  - 2. The method for detecting voice activity according to claim 1, wherein the first signal is generated by a first microphone, and the second signal is generated by a second microphone.
  - 3. The method for detecting voice activity according to claim 1, wherein the first signal is a speech-content signal generated by a signal separation process, and the second signal is a noise-dominant signal generated by the signal separation process.
  - 4. The method for detecting voice activity according to claim 1, wherein the determining step includes determining that the difference in the energy level between the first signal and the second signal exceeds a threshold value.
  - 5. The method for detecting voice activity according to claim 4, wherein the threshold value is dynamically adjusted.
  - 6. The method for detecting voice activity according to claim 1, wherein the comparing step includes comparing signal samples of about 10 ms to about 30 ms in length.
  - 7. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is a signal separation process, and the signal separation process is activated responsive to the control signal.
  - 8. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is a post processing operation, and the post processing operation is activated responsive to the control signal.
  - 9. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is a post processing operation, and the post processing operation is deactivated responsive to the control signal.
  - 10. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is a signal separation process, and a learning process for the signal separation process is activated responsive to the control signal.
  - 11. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is a noise estimation process, and the noise estimation process is deactivated responsive to the control signal.
  - 12. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is an automatic gain control process, and the automatic gain control process is activated responsive to the control signal.
  - 13. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is a post processing spectral subtraction process, and the output from the post processing spectral subtraction process is scaled responsive to the control signal.
  - 14. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is an echo cancellation process, and the echo cancellation process uses a far end signal and a microphone signal as filter inputs responsive to the control signal not being present.
  - 15. The method for detecting voice activity according to claim 1, wherein the speech enhancement process is an echo cancellation process, and the echo cancellation process freezes and applies a learned filter to an incoming far end signal responsive to the control signal.
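
The energy comparison of claims 1, 4, 6, and 11 can be sketched as follows. This is a minimal illustration in Python, not the patented implementation: the function names, the 6 dB threshold, the 20 ms frame length, the smoothing factor, and the synthetic signals are all assumptions made for the example.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def split_frames(signal, sample_rate, frame_ms=20):
    """Cut a signal into non-overlapping frames of frame_ms milliseconds."""
    n = int(sample_rate * frame_ms / 1000)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]

def two_channel_vad(first, second, sample_rate, threshold_db=6.0, frame_ms=20):
    """Return one boolean control value per frame.

    Voice activity is flagged when the first channel's energy exceeds the
    second channel's energy by more than threshold_db (assumed default).
    """
    eps = 1e-12  # avoids log(0) on silent frames
    control = []
    for f1, f2 in zip(split_frames(first, sample_rate, frame_ms),
                      split_frames(second, sample_rate, frame_ms)):
        diff_db = 10.0 * math.log10((frame_energy(f1) + eps) /
                                    (frame_energy(f2) + eps))
        control.append(diff_db > threshold_db)
    return control

# Synthetic input at 8 kHz: two 20 ms frames, speech only in the second frame.
rate = 8000
noise = [0.01 * math.sin(2 * math.pi * 50 * t / rate) for t in range(320)]
speech = [0.5 * math.sin(2 * math.pi * 200 * t / rate) if t >= 160 else 0.0
          for t in range(320)]
first_mic = [n + s for n, s in zip(noise, speech)]  # assumed closer to the talker
second_mic = noise                                  # assumed to pick up mostly noise
control_signal = two_channel_vad(first_mic, second_mic, rate)

# One possible use of the control signal, in the spirit of claim 11:
# the noise estimate is updated only while no voice is detected.
noise_estimate = 0.0
for active, frame in zip(control_signal, split_frames(first_mic, rate)):
    if not active:
        noise_estimate = 0.9 * noise_estimate + 0.1 * frame_energy(frame)
```

A decibel difference is used here so that the threshold does not depend on the absolute signal level. A dynamically adjusted threshold, as in claim 5, could replace the fixed `threshold_db` argument.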

16. A signal separation process, comprising:
- receiving a first signal;
  
  receiving a second signal;
  
  comparing the first signal and the second signal to determine that voice activity is present;
  
  generating a control signal responsive to determining that voice activity is present;
  
  activating a blind signal separation process responsive to the control signal;
  
  receiving the first and second signals into the blind signal separation process; and
  
  generating a signal having speech content.
- Dependent claims (17, 18):
  - 17. The signal separation process according to claim 16, further including the step of deactivating the blind signal separation process when the control signal is not present.
  - 18. The signal separation process according to claim 16, wherein the blind signal separation process is an independent component analysis process.
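
The gating described in claims 16 and 17 can be sketched as below. This is an illustration under stated assumptions only: the fixed 2x2 unmixing matrix is a stand-in for a trained blind signal separation (for example ICA) stage, the energy-ratio test is one possible way to compare the two signals, and passing the first channel through unprocessed while the separation is idle is a choice made for the example. All names are hypothetical.

```python
def energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def voice_present(f1, f2, ratio=2.0):
    """Assumed comparison: first-channel energy exceeds `ratio` times the second."""
    return energy(f1) > ratio * energy(f2)

def unmix(f1, f2, w):
    """Apply a fixed 2x2 unmixing matrix sample by sample.

    Stand-in for a learned separation stage; a real system would adapt w.
    """
    speech = [w[0][0] * a + w[0][1] * b for a, b in zip(f1, f2)]
    noise = [w[1][0] * a + w[1][1] * b for a, b in zip(f1, f2)]
    return speech, noise

def gated_separation(frames1, frames2, w):
    """Run the separation stage only on frames where the control signal is set."""
    results = []
    for f1, f2 in zip(frames1, frames2):
        control = voice_present(f1, f2)
        if control:
            speech, _ = unmix(f1, f2, w)      # separation activated
        else:
            speech = list(f1)                 # separation idle: pass-through (assumption)
        results.append((control, speech))
    return results

# Two short frames per channel; only the second frame has a dominant first channel.
frames_a = [[0.1, -0.1], [1.0, -1.0]]
frames_b = [[0.1, -0.1], [0.2, -0.2]]
w_demo = [[1.0, -0.5], [-0.5, 1.0]]           # hypothetical unmixing matrix
gated = gated_separation(frames_a, frames_b, w_demo)
```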

19. A signal separation system, comprising:
- a first microphone generating a first signal;
  
  a second microphone generating a second signal;
  
  a first learning stage receiving the first signal and the second signal, and generating a set of teaching coefficients;
  
  the learning stage being configured to rapidly adapt its coefficients to current acoustic conditions;
  
  an output stage coupled to the learning stage and receiving the teaching coefficients;
  
  the output stage receiving the first signal and the second signal, and generating a speech-content signal and a noise-dominant signal; and
  
  the output stage being configured to more slowly adapt its coefficients.
- Dependent claims (20 to 23):
  - 20. The signal separation system according to claim 19, further including a reset monitor that monitors the learning stage for an unstable condition, and generates a reset signal when an unstable condition is found.
  - 21. The signal separation system according to claim 20, wherein the coefficients for the learning stage are reset responsive to the reset signal, and the output stage is not reset.
  - 22. The signal separation system according to claim 20, wherein the coefficients for the learning stage are reset with a set of default coefficients responsive to the reset signal.
  - 23. The signal separation system according to claim 22, wherein the coefficients are selected from a plurality of sets of default coefficients, with each set of coefficients defined according to a different expected operating environment.
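
The learning stage, output stage, and reset monitor of claims 19 to 23 can be sketched as follows. This is a simplified illustration, not the separation algorithm of the patent: a one-tap LMS noise canceller is swapped in for the separation filter, and the class names, step sizes, stability limit, and default coefficient sets are assumptions chosen for the example.

```python
import math

# Hypothetical default coefficient sets, one per expected environment (claim 23).
DEFAULTS = {"car": [0.8], "office": [0.3]}

class LearningStage:
    """Fast-adapting stage; a one-tap LMS canceller stands in for the separator."""
    def __init__(self, environment, step=0.5):
        self.environment = environment
        self.step = step                       # large step: rapid adaptation
        self.coeffs = list(DEFAULTS[environment])

    def update(self, x1, x2):
        err = x1 - self.coeffs[0] * x2         # residual after removing estimated noise
        self.coeffs[0] += self.step * err * x2
        return list(self.coeffs)               # teaching coefficients

    def reset(self):
        self.coeffs = list(DEFAULTS[self.environment])

def is_unstable(coeffs, limit=10.0):
    """Reset monitor: flag non-finite or implausibly large coefficients."""
    return any(not math.isfinite(c) or abs(c) > limit for c in coeffs)

class OutputStage:
    """Slow-adapting stage that tracks the teaching coefficients."""
    def __init__(self, coeffs, rate=0.05):
        self.coeffs = list(coeffs)
        self.rate = rate                       # small rate: slow adaptation

    def process(self, x1, x2, teaching):
        for i, t in enumerate(teaching):
            self.coeffs[i] += self.rate * (t - self.coeffs[i])
        noise = self.coeffs[0] * x2            # noise-dominant estimate
        return x1 - noise, noise               # speech-content, noise-dominant

def run(samples1, samples2, environment="office"):
    learn = LearningStage(environment)
    out = OutputStage(DEFAULTS[environment])
    speech, resets = [], 0
    for x1, x2 in zip(samples1, samples2):
        teaching = learn.update(x1, x2)
        if is_unstable(teaching):
            learn.reset()                      # only the learning stage is reset
            resets += 1
            teaching = out.coeffs              # output stage holds its coefficients
        s, _ = out.process(x1, x2, teaching)
        speech.append(s)
    return speech, resets, out.coeffs

# Channel 1 contains only noise leaked from channel 2 with gain 0.5.
noise_in = [1.0 if t % 2 == 0 else -1.0 for t in range(200)]
leak_in = [0.5 * n for n in noise_in]
speech_out, reset_count, final_coeffs = run(leak_in, noise_in)

# A much louder input makes the fast stage overshoot, so the monitor resets it.
_, loud_resets, loud_coeffs = run([5.0] * 5, [10.0] * 5)
```

In the first run the output coefficient drifts slowly toward the leak gain of 0.5 and no reset occurs. In the second run the learning stage is reset on every sample while the output stage keeps its last coefficients and continues to produce output.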

Current Assignee: Qualcomm, Inc.
Original Assignee: SoftMax Incorporated (Qualcomm, Inc.)
Inventors: Erik Visser, Kwokleung Chan, Jeremy Toman
Granted Patent: US 7,464,029 B2
US Class Current: 704/226
CPC Class Codes:
- G10L 2021/02165: Two microphones, one receiv...
- G10L 21/0272: Voice signal separating
- G10L 25/78: Detection of presence or ab...
- H04R 2410/07: Mechanical or electrical re...
