Acoustic Voice Activity Detection (AVAD) for Electronic Systems

US 20100128894A1
Filed: 10/26/2009
Published: 05/27/2010
Est. Priority Date: 05/25/2007
Status: Active Grant

Abstract

Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.
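The windowed energy-ratio step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the non-overlapping window, the window length, the threshold value, and the small `eps` guard against division by zero are all assumptions made for the example.

```python
def frame_energy(samples):
    """Sum of squared samples in one analysis window."""
    return sum(x * x for x in samples)


def energy_ratio_vad(v1, v2, window=4, threshold=2.0, eps=1e-12):
    """Return (ratios, vad) with one entry per non-overlapping window.

    v1, v2    : sample sequences from the two virtual microphones
    window    : window length in samples (hypothetical value)
    threshold : ratio above which voice activity is declared (hypothetical)
    """
    n = min(len(v1), len(v2))
    ratios, vad = [], []
    for start in range(0, n - window + 1, window):
        e1 = frame_energy(v1[start:start + window])
        e2 = frame_energy(v2[start:start + window])
        ratio = e1 / (e2 + eps)  # eps avoids dividing by zero
        ratios.append(ratio)
        vad.append(ratio > threshold)
    return ratios, vad


# Toy signals: in the first window both virtual microphones carry similar
# energy (noise-like); in the second, V1 is much stronger (speech-like).
v1 = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
v2 = [0.1, 0.1, -0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
ratios, vad = energy_ratio_vad(v1, v2)
print(vad)  # [False, True]
```

A simple fixed threshold is only one of the "variety of methods" the abstract mentions for turning the ratio into a VAD signal; smoothing or hysteresis could be layered on the same ratio sequence.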

42 Claims

1. An acoustic voice activity detection system comprising:
- a first virtual microphone comprising a first combination of a first signal and a second signal, wherein the first signal is received from a first physical microphone and the second signal is received from a second physical microphone;
  
  a filter, wherein the filter is formed by generating a first quantity by applying a calibration to at least one of the first signal and the second signal, generating a second quantity by applying a delay to the first signal, and forming the filter as a ratio of the first quantity to the second quantity; and
  
  a second virtual microphone formed by applying the filter to the first signal to generate a first intermediate signal and summing the first intermediate signal and the second signal, wherein acoustic voice activity of a speaker is determined to be present when an energy ratio of energies of the first virtual microphone and the second virtual microphone is greater than a threshold value.
- - 2. The system of claim 1, wherein the first virtual microphone and the second virtual microphone have approximately similar responses to noise and approximately dissimilar responses to speech.
  - 3. The system of claim 1, wherein a calibration is applied to the second signal, wherein the calibration compensates a second response of the second physical microphone so that the second response is equivalent to a first response of the first physical microphone.
  - 4. The system of claim 1, wherein the delay is applied to the first intermediate signal, wherein the delay is proportional to a time difference between arrival of the speech at the second physical microphone and arrival of the speech at the first physical microphone.
  - 5. The system of claim 1, wherein the first virtual microphone is formed by applying the filter to the second signal.
  - 6. The system of claim 5, wherein the first virtual microphone is formed by applying the calibration to the second signal.
  - 7. The system of claim 6, wherein the first virtual microphone is formed by applying the delay to the first signal.
  - 8. The system of claim 7, wherein the first virtual microphone is formed by subtracting the second signal from the first signal.
  - 9. The system of claim 1, wherein the filter is an adaptive filter.
  - 10. The system of claim 1, wherein the filter is adapted to minimize a second virtual microphone output when only speech is being received by the first physical microphone and the second physical microphone.
  - 11. The system of claim 1, wherein coefficients of the filter are generated during a period when only speech is being received by the first physical microphone and the second physical microphone.
  - 12. The system of claim 1, wherein the energy ratio comprises an energy ratio for a frequency band.
  - 13. The system of claim 1, wherein the energy ratio comprises an energy ratio for a frequency subband.
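One way to read claims 1 through 10 together is sketched below. The claims only say that the virtual microphones combine the two signals using a calibration, a delay, and a filter; the specific forms used here (a scalar calibration gain `alpha`, a scalar filter gain `beta`, a whole-sample delay `gamma`, and the sign convention that folds a minus sign into the filter) are simplifying assumptions for illustration, as are all function names.

```python
import math
import random


def delay(signal, gamma):
    """Delay a signal by gamma whole samples, zero-padding the start."""
    return ([0.0] * gamma + list(signal))[:len(signal)]


def virtual_mics(o1, o2, beta, gamma, alpha=1.0):
    """Form two virtual microphones from two physical microphone signals.

    Assumed simplification: calibration is a scalar gain alpha on o2 and
    the filter is a scalar gain beta (both hypothetical).
    """
    o1_delayed = delay(o1, gamma)
    o2_cal = [alpha * x for x in o2]
    # V1: delayed first signal minus the filtered, calibrated second signal.
    v1 = [a - beta * b for a, b in zip(o1_delayed, o2_cal)]
    # V2: filtered, delayed first signal summed with the second signal
    # (the minus sign is folded into the filter).
    v2 = [b + (-beta) * a for a, b in zip(o1_delayed, o2_cal)]
    return v1, v2


def energy(x):
    return sum(s * s for s in x)


beta, gamma = 0.6, 2

# Near-field speech: microphone 2 hears an attenuated, later copy of
# what microphone 1 hears, so V2 cancels the speech.
speech = [math.sin(0.3 * n) for n in range(200)]
o2_speech = [beta * x for x in delay(speech, gamma)]
v1_s, v2_s = virtual_mics(speech, o2_speech, beta, gamma)

# Far-field noise (assumed): both microphones hear the same signal.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(200)]
v1_n, v2_n = virtual_mics(noise, noise, beta, gamma)

eps = 1e-12
ratio_speech = energy(v1_s) / (energy(v2_s) + eps)
ratio_noise = energy(v1_n) / (energy(v2_n) + eps)
print(ratio_speech > 100.0, round(ratio_noise, 2))
```

Under these assumptions the two virtual microphones respond almost identically to the noise (ratio near 1) and very differently to the speech (large ratio), which is the behavior claim 2 describes.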

14. A device comprising:
- a first physical microphone generating a first signal;
  
  a second physical microphone generating a second signal; and
  
  a processing component coupled to the first physical microphone and the second physical microphone, the processing component forming a first virtual microphone, the processing component forming a filter that describes a relationship for speech between the first physical microphone and the second physical microphone, the processing component forming a second virtual microphone by applying the filter to the first signal to generate a first intermediate signal, and summing the first intermediate signal and the second signal, the processing component detecting acoustic voice activity of a speaker when an energy ratio of energies of the first virtual microphone and the second virtual microphone is greater than a threshold value.
- - 15. The device of claim 14, comprising applying a calibration to at least one of the first signal and the second signal.
  - 16. The device of claim 15, wherein the calibration compensates a second response of the second physical microphone so that the second response is equivalent to a first response of the first physical microphone.
  - 17. The device of claim 15, comprising applying a delay to the first intermediate signal.
  - 18. The device of claim 17, wherein the delay is proportional to a time difference between arrival of the speech at the second physical microphone and arrival of the speech at the first physical microphone.
  - 19. The device of claim 18, wherein the forming of the first virtual microphone comprises applying the filter to the second signal.
  - 20. The device of claim 19, wherein the forming of the first virtual microphone comprises applying the calibration to the second signal.
  - 21. The device of claim 20, wherein the forming of the first virtual microphone comprises applying the delay to the first signal.
  - 22. The device of claim 21, wherein the forming of the first virtual microphone by the combining comprises subtracting the second signal from the first signal.
  - 23. The device of claim 22, wherein the filter is an adaptive filter.
  - 24. The device of claim 23, comprising adapting the filter to minimize a second virtual microphone output when only speech is being received by the first physical microphone and the second physical microphone.
  - 25. The device of claim 23, wherein the adapting comprises applying a least-mean squares process.
  - 26. The device of claim 23, comprising generating coefficients of the filter during a period when only speech is being received by the first physical microphone and the second physical microphone.
  - 27. The device of claim 23, wherein the forming of the filter comprises:
    - generating a first quantity by applying a calibration to the second signal;
      
      generating a second quantity by applying the delay to the first signal;
      
      forming the filter as a ratio of the first quantity to the second quantity.
  - 28. The device of claim 27, wherein the generating of the energy ratio comprises generating the energy ratio for a frequency band.
  - 29. The device of claim 27, wherein the generating of the energy ratio comprises generating the energy ratio for a frequency subband.
  - 30. The device of claim 29, wherein the frequency subband includes frequencies higher than approximately 200 Hertz (Hz).
  - 31. The device of claim 29, wherein the frequency subband includes frequencies in a range from approximately 250 Hz to 1250 Hz.
  - 32. The device of claim 29, wherein the frequency subband includes frequencies in a range from approximately 200 Hz to 3000 Hz.
  - 33. The device of claim 22, wherein the filter is a static filter.
  - 34. The device of claim 33, wherein the forming of the filter comprises:
    - determining a first distance as distance between the first physical microphone and a mouth of the speaker;
      
      determining a second distance as distance between the second physical microphone and the mouth; and
      
      forming a ratio of the first distance to the second distance.
  - 35. The device of claim 14, comprising generating a vector of the energy ratio versus time.
  - 36. The device of claim 14, wherein the first virtual microphone and the second virtual microphone are distinct virtual directional microphones.
  - 37. The device of claim 36, wherein the first virtual microphone and the second virtual microphone have approximately similar responses to noise.
  - 38. The device of claim 37, wherein the first virtual microphone and the second virtual microphone have approximately dissimilar responses to speech.
  - 39. The device of claim 14, wherein the first and second physical microphones are omnidirectional microphones.
  - 40. The device of claim 14, comprising positioning the first physical microphone and the second physical microphone along an axis and separating the first physical microphone and the second physical microphone by a first distance.
  - 41. The device of claim 40, wherein a midpoint of the axis is a second distance from a mouth of the speaker, wherein the mouth is located in a direction defined by an angle relative to the midpoint.
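The two filter options in the dependent claims, an adaptive filter trained with a least-mean-squares process (claims 23 to 26) and a static filter formed from a distance ratio (claims 33 and 34), can be sketched as below. This is an illustrative sketch only: the FIR structure, tap count, step size `mu`, the example distances, and the function names are assumptions, and the adaptation is presumed to run only during a period already known to contain speech alone.

```python
import random


def static_filter_gain(d1, d2):
    """Fixed filter value: ratio of the first mouth-to-microphone
    distance to the second."""
    return d1 / d2


def lms_adapt(o1, o2, taps=4, mu=0.05):
    """Adapt FIR weights h to minimize V2[n] = o2[n] + sum_k h[k] * o1[n-k].

    Returns the final weights and the V2 output produced during adaptation.
    """
    h = [0.0] * taps
    v2 = []
    for n in range(len(o1)):
        # Most recent `taps` samples of o1, newest first; zero before start.
        x = [o1[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        e = o2[n] + sum(hk * xk for hk, xk in zip(h, x))  # V2 sample
        # Stochastic-gradient (LMS) step that reduces e squared.
        h = [hk - mu * e * xk for hk, xk in zip(h, x)]
        v2.append(e)
    return h, v2


# Synthetic "speech-only" period: microphone 2 hears microphone 1's
# signal one sample later and scaled by 0.6 (hypothetical values).
rng = random.Random(1)
o1 = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
o2 = [0.0] + [0.6 * x for x in o1[:-1]]

h, v2 = lms_adapt(o1, o2)
print([round(w, 3) for w in h])      # weight at tap 1 approaches -0.6
print(static_filter_gain(5.0, 8.0))  # 0.625
```

Because V2 is formed by summing the filtered first signal with the second signal, the learned weight is negative here; a convention that subtracts the filtered signal would yield the same magnitude with the opposite sign.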

42. A device comprising:
- a headset including at least one loudspeaker, wherein the headset attaches to a region of a human head;
  
  a microphone array connected to the headset, the microphone array including a first physical microphone outputting a first signal and a second physical microphone outputting a second signal; and
  
  a processing component coupled to the first physical microphone and the second physical microphone, the processing component forming a first virtual microphone, the processing component forming a filter that describes a relationship for speech between the first physical microphone and the second physical microphone, the processing component forming a second virtual microphone by applying the filter to the first signal to generate a first intermediate signal, and summing the first intermediate signal and the second signal, the processing component detecting acoustic voice activity of a speaker when an energy ratio of energies of the first virtual microphone and the second virtual microphone is greater than a threshold value.

Current Assignee: Jawbone Innovations, LLC
Original Assignee: AliphCom, Inc. (AliphCom Corp.)
Inventors: Jing, Zhinian; Petit, Nicolas; Burnett, Gregory

Granted Patent: US 8,321,213 B2

Field of Search
US Class Current: 381/92
CPC Class Codes:
G10L 2021/02165 Two microphones, one receiv...
G10L 25/93 Discriminating between voic...
