Coordination of beamformers for noise estimation and noise suppression

US 10,482,899 B2
Filed: 08/01/2016
Issued: 11/19/2019
Est. Priority Date: 08/01/2016
Status: Active Grant

Abstract

An audio system has a housing in which a number of microphones are integrated. A programmed processor accesses the microphone signals and produces a number of acoustic pickup beams based on groups of microphones, an estimation of voice activity, and an estimation of noise characteristics on each beam. Two or more beams, including a voice beam that is used to pick up a desired voice and a noise beam that is used to provide information to estimate ambient noise, are adaptively selected from among the plurality of beams, based on thresholds for voice separation and thresholds for noise-matching. Other embodiments are also described and claimed.

29 Claims

1. A process for adaptively selecting two or more beams from among a plurality of acoustic pickup beams that are produced by a beamforming process using a plurality of microphone signals from a plurality of microphones, the process comprising:
- producing the plurality of acoustic pickup beams based on groups of the plurality of microphones, wherein the groups are determined based on an estimation of voice activity and an estimation of noise characteristics in the microphone signals; and
  
  selecting the two or more beams from among the plurality of acoustic pickup beams, including a voice beam and a noise beam, based on thresholds for voice-separation and thresholds for noise-matching, wherein
  
  during a period where a desired voice is deemed active, indicating presence of speech, difference between a strength of a component of the noise beam and a strength of a component of the voice beam are compared to a threshold for voice separation to determine whether there is sufficiently large voice separation between the noise beam and the voice beam, and
  
  during a period where the desired voice is deemed inactive, indicating non-speech, difference between a strength of a component of the noise beam and a strength of a component of the voice beam are compared to a threshold for noise-matching to determine whether there is sufficient noise matching between the noise beam and the voice beam, and
  
  wherein the voice beam is used to pick up a voice signal and the noise beam is used to provide information to estimate a noise signal; and
  
  wherein it is determined whether the two or more beams meet the threshold for noise-matching by a) obtaining ratios between the strength of a component of the noise beam in the noise beam and a strength of a component of the voice beam over a time interval, b) comparing the ratios to the threshold for noise-matching, and c) if the threshold for noise-matching is met, setting a correction factor for noise-matching; and
  
  it is determined whether the two or more beams meet the threshold for voice separation by calculating adjusted ratios by applying the correction factor to initial ratios between the strength of a component of the noise beam and the strength of a component of the voice beam.
  - 2. The process of claim 1, wherein:
    - the ratios are instantaneous and average ratios between a strength of a noise component in the noise beam and a strength of a noise component in the voice beam and the correction factor is a computed statistical central tendency of the instantaneous and average ratios.
  - 3. The process of claim 2, wherein determining whether the two or more beams meet the threshold for voice-separation further includes:
    - comparing the adjusted ratios to the threshold for voice separation, wherein the adjusted ratios are instantaneous and average adjusted ratios.
  - 4. The process of claim 1, wherein production of the plurality of acoustic pickup beams further comprises coordinating one or more of the following parameters:
    - i) a shape of the voice beam, ii) a direction of the voice beam, iii) a shape of the noise beam, iv) a direction of the noise beam, v) a subset of microphones among the plurality of microphones used to generate the voice beam; and
      
      vi) a subset of microphones among the plurality of microphones used to generate the noise beam.
  - 5. The process of claim 1, wherein the plurality of microphones comprises a cluster, and wherein in a case where there are two or more clusters, the clusters are spatially separated.
  - 6. The process of claim 5, wherein production of the plurality of acoustic pickup beams further comprises, in the case where there are two or more clusters, assigning a voice beam to a cluster and assigning a noise beam to a different cluster.
  - 7. The process of claim 5, wherein the cluster is integrated into an enclosure of a mobile phone, tablet computer or laptop computer.
  - 8. The process of claim 1, further comprising:
    - providing the voice beam included in the selected beams as a voice input signal to a multi-channel noise suppression process; and
      
      providing the noise beam included in the selected beams as a noise reference signal to the multi-channel noise suppression process.
  - 9. The process of claim 1, further comprising:
    - providing the voice beam included in the selected beams as a voice input signal to a voice activity detector, and
      
      providing the noise beam included in the selected beams as a noise reference signal to the voice activity detector.
  - 10. The process of claim 1, wherein directions of the voice signal and the noise signal are estimated and used in design and selection of the beams.
  - 11. The process of claim 10, wherein the directions of the voice signal and the noise signal are estimated using a blind source estimation process to obtain directions of sources of the voice signal and the noise signal, respectively.
  - 12. The process of claim 1, wherein the strength is a computed statistical central tendency of energy or power of a noise component in the noise beam or in the voice beam, over a predefined frequency band, in a given digital audio frame.
  - 13. The process of claim 1, wherein the strength is a computed statistical central tendency of energy or power of the noise beam or the voice beam, over a predefined frequency band, in a given digital audio frame.
  - 14. The process of claim 1, wherein the selected voice beam and the selected noise beam comprise a beam pair, and wherein if more than one beam pair satisfies the thresholds for voice separation and the thresholds for noise-matching, the beam pair including the voice beam having a highest signal-to-noise ratio is selected.
  - 15. The process of claim 1, wherein the selected voice beam and the selected noise beam comprise a beam pair, and wherein if no beam pair satisfies the thresholds for voice separation and the thresholds for noise-matching, a single-channel noise suppression process is performed.
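
The two-stage test recited in claims 1 to 3 and the pair selection of claims 14 and 15 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims do not fix the units, the direction of the comparisons, or how the correction factor is applied, so the sketch assumes that strengths are linear powers, that ratios are expressed in dB, that noise matching means every noise-to-voice ratio lies within a tolerance of 0 dB, that the correction factor is the median ratio, and that it is applied by subtraction in dB. All names (`noise_matching`, `voice_separation`, `select_pair`, `BeamPair`, `noise_match_db`, `voice_sep_db`) are hypothetical.

```python
import math
from statistics import median
from typing import NamedTuple, Optional, Sequence


def ratio_db(noise_strength: float, voice_strength: float) -> float:
    """Noise-beam to voice-beam strength ratio in dB (strengths assumed to be linear powers)."""
    return 10.0 * math.log10(noise_strength / voice_strength)


def noise_matching(noise: Sequence[float], voice: Sequence[float],
                   noise_match_db: float) -> Optional[float]:
    """Non-speech frames: return a correction factor in dB if the beams match, else None.

    Assumption: "matched" means every ratio is within +/- noise_match_db of 0 dB.
    Assumption: the correction factor is the median of the ratios.
    """
    ratios = [ratio_db(n, v) for n, v in zip(noise, voice)]
    if ratios and all(abs(r) <= noise_match_db for r in ratios):
        return median(ratios)
    return None


def voice_separation(noise: Sequence[float], voice: Sequence[float],
                     correction_db: float, voice_sep_db: float) -> bool:
    """Speech frames: apply the correction factor, then test the adjusted ratios.

    Assumption: separation is sufficient when the adjusted noise-to-voice ratio
    is at or below -voice_sep_db in every frame.
    """
    adjusted = [ratio_db(n, v) - correction_db for n, v in zip(noise, voice)]
    return bool(adjusted) and all(r <= -voice_sep_db for r in adjusted)


class BeamPair(NamedTuple):
    name: str        # hypothetical label for a (voice beam, noise beam) pair
    snr_db: float    # signal-to-noise ratio of the pair's voice beam
    passes: bool     # True if both threshold tests above were met


def select_pair(pairs: Sequence[BeamPair]) -> Optional[BeamPair]:
    """Pick the qualifying pair whose voice beam has the highest SNR.

    Returns None when no pair qualifies, in which case a caller would fall
    back to single-channel noise suppression.
    """
    qualifying = [p for p in pairs if p.passes]
    return max(qualifying, key=lambda p: p.snr_db) if qualifying else None
```

A caller would run `noise_matching` on frames where the desired voice is deemed inactive, keep the returned correction factor, and pass it to `voice_separation` on frames where the voice is deemed active.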

16. An audio system, comprising:
- a housing having integrated therein a plurality of microphones having a fixed geometrical relationship to each other;
  
  a processor to access a plurality of microphone signals produced by the plurality of microphones, respectively; and
  
  memory having stored therein instructions that when executed by the processor (a) produce a plurality of acoustic pickup beams based on groups of the plurality of microphones, wherein the groups are determined based on an estimation of voice activity, and an estimation of noise characteristics in the microphone signals, and (b) select two or more beams, including a voice beam and a noise beam, from among the plurality of acoustic pickup beams based on thresholds for voice separation and thresholds for noise-matching, wherein selecting the voice beam and the noise beam, comprises,
  
  during a period where a desired voice is deemed active, indicating a presence of speech, difference between a strength of a component of the noise beam and a strength of a component of the voice beam are compared to a threshold for voice separation to determine whether there is sufficiently large voice separation between the two or more beams, and
  
  during a period where the desired voice is deemed inactive, indicating non-speech, difference between a strength of a component of the noise beam and a strength of a component of the voice beam are compared to a threshold for noise-matching to determine whether there is sufficient noise matching between the two or more beams, and
  
  wherein the voice beam is selected and used to pick up a voice signal and the noise beam is selected and used to provide information to estimate a noise signal, and wherein
  
  it is determined whether the two or more beams meet the threshold for noise-matching by a) obtaining ratios between the strength of a component of the noise beam in the noise beam and a strength of a component of the voice beam over a time interval, b) comparing the ratios to the threshold for noise-matching, and c) if the threshold for noise-matching is met, setting a correction factor for noise-matching; and
  
  it is determined whether the two or more beams meet the threshold for voice separation by calculating adjusted ratios by applying the correction factor to initial ratios between the strength of a component of the noise beam and the strength of a component of the voice beam.
  - 17. The system of claim 16, wherein:
    - the ratios are instantaneous and average ratios between a strength of a noise component in the noise beam and a strength of a noise component in the voice beam over a time interval; and the correction factor is a computed statistical central tendency of the instantaneous and average ratios.
  - 18. The system of claim 17, wherein determining whether the two or more beams meet the thresholds for voice-separation further includes:
    - comparing the adjusted ratios to the thresholds for voice separation, wherein the adjusted ratios are instantaneous and average adjusted ratios.
  - 19. The system of claim 16, wherein the memory has stored therein instructions that, when executed by the processor, produce the plurality of acoustic pickup beams by coordinating one or more of the following parameters:
    - i) a shape of the voice beam, ii) a direction of the voice beam, iii) a shape of the noise beam, iv) a direction of the noise beam, v) a subset of microphones among the plurality of microphones used to generate the voice beam; and
      
      vi) a subset of microphones among the plurality of microphones used to generate the noise beam.
  - 20. The system of claim 16, wherein the plurality of microphones comprises a cluster, and wherein in a case where there are two or more clusters, the clusters are spatially separated.
  - 21. The system of claim 20, wherein the memory has stored therein instructions that, when executed by the processor, produce the plurality of beams by, in the case where there are two or more clusters, assigning a voice beam to a cluster and assigning a noise beam to a different cluster.
  - 22. The system of claim 16, wherein the memory has stored therein instructions that, when executed by the processor, provide the voice beam included in the selected beams as a voice input signal to a multi-channel noise suppression process and provide the noise beam included in the selected beams as a noise reference signal to the multi-channel noise suppression process.
  - 23. The system of claim 16, wherein the memory has stored therein instructions that, when executed by the processor, provide the voice beam included in the selected beams as a voice input signal to a voice activity detector, and provide the noise beam included in the selected beams as a noise reference signal to the voice activity detector.
  - 24. The system of claim 16, wherein directions of the voice signal and the noise signal are estimated and used in design and selection of the beams.
  - 25. The system of claim 24, wherein the directions of the voice signal and the noise signal are estimated using a blind source estimation process to obtain directions of sources of the voice signal and the noise signal, respectively.
  - 26. The system of claim 16, wherein the strength is a computed statistical central tendency of energy or power of a noise component in the noise beam or in the voice beam, over a predefined frequency band, in a given digital audio frame.
  - 27. The system of claim 16, wherein the strength is a computed statistical central tendency of energy or power of the noise beam or the voice beam, over a predefined frequency band, in a given digital audio frame.
  - 28. The system of claim 16, wherein the selected voice beam and the selected noise beam comprise a beam pair, and wherein if more than one beam pair satisfies the thresholds for voice separation and the thresholds for noise-matching, the beam pair including the voice beam having a highest signal-to-noise ratio is selected.
  - 29. The system of claim 16, wherein the selected voice beam and the selected noise beam comprise a beam pair, and wherein if no beam pair satisfies the thresholds for voice separation and the thresholds for noise-matching, the processor executes a single-channel noise suppression process.
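
Claims 12, 13, 26 and 27 define the strength as a statistical central tendency of energy or power over a predefined frequency band in a given digital audio frame. The following Python sketch shows one possible reading, assuming the central tendency is the arithmetic mean and that power is taken from the squared magnitude of unnormalized DFT bins. The function name `band_strength` and its parameters are hypothetical, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math
from typing import Sequence


def band_strength(frame: Sequence[float], sample_rate: float,
                  f_lo: float, f_hi: float) -> float:
    """Mean power of the DFT bins of one frame whose frequencies lie in [f_lo, f_hi].

    Assumption: "statistical central tendency" is taken to be the arithmetic mean;
    a median or another statistic would fit the claim language equally well.
    """
    n = len(frame)
    powers = []
    # Only non-negative frequencies are needed for a real-valued frame.
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            # Naive DFT of bin k; an FFT would be used in practice.
            x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                      for i, s in enumerate(frame))
            powers.append(abs(x_k) ** 2)
    if not powers:
        raise ValueError("no DFT bins fall inside the requested band")
    return sum(powers) / len(powers)
```

Computing this value once per frame for the voice beam and once for the noise beam yields the pair of strengths whose ratio is compared against the thresholds.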

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Ramprashad, Sean A.; Andersen, Esge B.; Atkins, Joshua D.; Dusan, Sorin V.; Iyengar, Vasu; Pruthi, Tarun; Theverapperuma, Lalin S.
Primary Examiner(s): Sarpong, Akwasi M
Application Number: US15/225,707
Publication Number: US 20180033447A1
Time in Patent Office: 1,205 Days
Field of Search: 704226, 381 714, 381 86
US Class Current
CPC Class Codes

G10L 2021/02165   Two microphones, one receiv...

G10L 2021/02166   Microphone arrays; Beamforming

G10L 21/0216   characterised by the method...

G10L 21/028   using properties of sound s...

G10L 25/21   the extracted parameters be...
