Multi-microphone audio source separation based on combined statistical angle distributions

US 9,131,295 B2
Filed: 08/07/2012
Issued: 09/08/2015
Est. Priority Date: 08/07/2012
Status: Active Grant

Abstract

Systems, methods, and computer media for separating audio sources in a multi-microphone system are provided. A plurality of audio sample groups can be received. Each audio sample group comprises at least two samples of audio information captured by different microphones during a sample group time interval. For each audio sample group, an angle between an audio source and the multi-microphone system can be estimated from the phase difference between the samples in the group. The estimated angle can be modeled as a combined statistical distribution that is a mixture of a target audio signal statistical distribution and a noise component statistical distribution. The combined statistical distribution can be analyzed to characterize each sample group as either target audio signal or noise. The target audio signal can then be resynthesized from the sample groups identified as part of the target audio signal.
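The angle-estimation step in the abstract can be illustrated with a minimal sketch. It assumes a two-microphone array, a far-field source, an angle measured from the line perpendicular to the microphone axis, and one complex frequency-domain coefficient per microphone; none of these specifics, nor the names `estimate_angle`, `simulate_pair`, and `SPEED_OF_SOUND`, come from the patent text shown here.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def estimate_angle(x1: complex, x2: complex, freq_hz: float, mic_spacing_m: float) -> float:
    """Estimate the source angle (radians) from the phase difference of two coefficients.

    Assumed sign convention: a positive angle means the sound reaches microphone 1 first.
    """
    # Phase of x1 relative to x2, wrapped to (-pi, pi].
    phase_diff = cmath.phase(x1 * x2.conjugate())
    # Far-field model: phase_diff = 2*pi*f * (d * sin(theta) / c).
    sin_theta = SPEED_OF_SOUND * phase_diff / (2.0 * math.pi * freq_hz * mic_spacing_m)
    # Clamp so that noisy phase estimates never leave the domain of asin.
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.asin(sin_theta)


def simulate_pair(theta_rad: float, freq_hz: float, mic_spacing_m: float):
    """Build an idealized coefficient pair for a source at theta_rad (illustration only)."""
    delay = mic_spacing_m * math.sin(theta_rad) / SPEED_OF_SOUND
    x1 = 1.0 + 0.0j
    x2 = cmath.exp(-2j * math.pi * freq_hz * delay)  # microphone 2 hears the wave later
    return x1, x2
```

With a 4 cm spacing and a 1 kHz component, `estimate_angle(*simulate_pair(math.radians(30.0), 1000.0, 0.04), 1000.0, 0.04)` recovers the simulated angle. The mapping is only unambiguous while the true phase difference stays within (-π, π], which limits the usable frequency for a given spacing.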

24 Claims

1. One or more computer-readable memory or storage devices storing instructions that, when executed by a computing device having a processor, perform a method of separating audio sources in a multi-microphone system, the method comprising:
- receiving audio sample groups, with an audio sample group comprising at least two samples of audio information, the at least two samples captured by different microphones during a sample group time interval; and
  
  for a plurality of audio sample groups:
  
  estimating, for the corresponding sample group time interval, an angle between a first reference line extending from an audio source to the multi-microphone system and a second reference line extending through the multi-microphone system, the estimated angle being based on a phase difference between the at least two samples in the audio sample group;
  
  modeling the estimated angle as a combined statistical distribution, the combined statistical distribution being a mixture of a target audio signal statistical distribution and a noise component statistical distribution; and
  
  determining whether the audio sample group is part of a target audio signal or a noise component based at least in part on the combined statistical distribution.
- - 2. The one or more computer-readable memory or storage devices of claim 1, further comprising resynthesizing a target audio signal from the audio sample groups determined to be part of the target audio signal.
  - 3. The one or more computer-readable memory or storage devices of claim 1, wherein the multi-microphone system is a two-microphone system, and wherein the audio sample groups are audio sample pairs.
  - 4. The one or more computer-readable memory or storage devices of claim 1, wherein determining whether the audio sample group is part of the target audio signal or the noise component comprises comparing the combined statistical distribution to a fixed threshold.
  - 5. The one or more computer-readable memory or storage devices of claim 1, wherein determining whether the audio sample group is part of the target audio signal or the noise component comprises performing statistical analysis.
  - 6. The one or more computer-readable memory or storage devices of claim 5, wherein the statistical analysis comprises hypothesis testing.
  - 7. The one or more computer-readable memory or storage devices of claim 6, wherein the hypothesis testing is maximum a posteriori (MAP) hypothesis testing.
  - 8. The one or more computer-readable memory or storage devices of claim 6, wherein the hypothesis testing is maximum likelihood testing.
  - 9. The one or more computer-readable memory or storage devices of claim 1, wherein the target audio signal statistical distribution and the noise component statistical distribution are von Mises distributions.
  - 10. The one or more computer-readable memory or storage devices of claim 1, wherein the combined statistical distribution is represented by the equation $f_T(\theta) = c_0[m]\,f_0(\theta) + c_1[m]\,f_1(\theta)$, where $m$ is a sample group index, $f_0(\theta)$ is a noise component distribution, $f_1(\theta)$ is a target audio signal distribution, $c_0[m]$ and $c_1[m]$ are mixture coefficients, and $c_0[m] + c_1[m] = 1$.
  - 11. The one or more computer-readable memory or storage devices of claim 1, wherein parameters for the combined statistical distribution are obtained using an expectation maximization (EM) algorithm.
  - 12. The one or more computer-readable memory or storage devices of claim 1, wherein an initial threshold for distinguishing target audio signal from noise component is a pre-determined fixed value.
  - 13. The one or more computer-readable memory or storage devices of claim 1, wherein the second reference line is perpendicular to a third reference line extending between the first and second microphones, and wherein the first reference line and the second reference line intersect at the approximate midpoint of the third reference line.
  - 14. The one or more computer-readable memory or storage devices of claim 1, wherein the sample group time intervals are between approximately 50 and 125 milliseconds.
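The mixture model and the hypothesis tests named in claims 5 through 10 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: each component is taken to be a von Mises density parameterized by a hypothetical `(mu, kappa)` tuple, index 0 denotes noise and index 1 the target as in the claim 10 equation, and all function names are invented for the example.

```python
import math


def bessel_i0(x: float) -> float:
    """Modified Bessel function of the first kind, order 0, summed from its power series."""
    term, total, k = 1.0, 1.0, 1
    while term > 1e-16 * total:
        term *= (x / (2.0 * k)) ** 2  # ratio of consecutive series terms
        total += term
        k += 1
    return total


def von_mises_pdf(theta: float, mu: float, kappa: float) -> float:
    """Von Mises density with mean direction mu and concentration kappa >= 0."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta: float, c1: float, noise, target) -> float:
    """f_T(theta) = c0 * f0(theta) + c1 * f1(theta), with c0 = 1 - c1."""
    c0 = 1.0 - c1
    return c0 * von_mises_pdf(theta, *noise) + c1 * von_mises_pdf(theta, *target)


def is_target_map(theta: float, c1: float, noise, target) -> bool:
    """MAP test: pick the hypothesis with the larger prior-weighted likelihood."""
    return c1 * von_mises_pdf(theta, *target) > (1.0 - c1) * von_mises_pdf(theta, *noise)


def is_target_ml(theta: float, noise, target) -> bool:
    """Maximum likelihood test: same comparison with the mixture coefficients ignored."""
    return von_mises_pdf(theta, *target) > von_mises_pdf(theta, *noise)
```

For example, with `noise = (1.0, 2.0)` and `target = (0.0, 20.0)`, an angle of 0.05 rad is labeled target by the ML test, while the MAP test labels it noise once the target coefficient `c1` drops to 0.01. The two tests differ only in whether the mixture coefficients act as priors.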

15. A multi-microphone mobile device having audio source-separation capabilities, the mobile device comprising:
- a first microphone;
  
  a second microphone;
  
  a processor;
  
  an angle estimator configured to, by the processor, for a sample pair time interval, estimate an angle between a first reference line extending from an audio source to the mobile device and a second reference line extending through the mobile device, the estimated angle being based on a phase difference between a first sample and a second sample in an audio sample pair captured during the sample pair time interval, wherein the first sample is captured by the first microphone and the second sample is captured by the second microphone;
  
  a combined statistical modeler configured to model the estimated angle as a combined statistical distribution, the combined statistical distribution being a mixture of a target audio signal statistical distribution and a noise component statistical distribution; and
  
  a sample classifier configured to determine whether the audio sample pair is part of a target audio signal or a noise component based at least in part on the combined statistical distribution.
- - 16. The multi-microphone mobile device of claim 15, wherein the mobile device is a mobile phone.
  - 17. The multi-microphone mobile device of claim 15, wherein the sample classifier is further configured to determine whether the audio sample pair is part of the target audio signal or the noise component by performing statistical analysis.
  - 18. The multi-microphone mobile device of claim 17, wherein the statistical analysis comprises at least one of maximum a posteriori (MAP) hypothesis testing or maximum likelihood testing.
  - 19. The multi-microphone mobile device of claim 15, wherein the sample classifier is further configured to determine whether the audio sample pair is part of the target audio signal or the noise component by comparing the combined statistical distribution to a fixed threshold.
  - 20. The multi-microphone mobile device of claim 15, wherein the second reference line is perpendicular to a third reference line extending between the first and second microphones, and wherein the first reference line and the second reference line intersect at an approximate midpoint of the third reference line.
  - 21. The multi-microphone mobile device of claim 15, wherein the target audio signal statistical distribution and the noise component statistical distribution are von Mises distributions, and wherein the combined statistical modeler is further configured to determine parameters for the combined statistical distribution using an expectation maximization (EM) algorithm.
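Claims 11 and 21 recite obtaining the mixture parameters with an expectation maximization (EM) algorithm. The sketch below shows one conventional way to run EM on a two-component von Mises mixture; it is an assumption-laden illustration rather than the procedure of the patent. In particular, the closed-form concentration update in `kappa_from_resultant` is a commonly used approximation, the cap on the resultant length is an arbitrary numerical safeguard, and `synthetic_angles` exists only to generate demonstration data.

```python
import math
import random
from typing import List, Sequence, Tuple

Component = Tuple[float, float, float]  # (mixture coefficient c, mean mu, concentration kappa)


def bessel_i0(x: float) -> float:
    """Modified Bessel function of the first kind, order 0, from its power series."""
    term, total, k = 1.0, 1.0, 1
    while term > 1e-16 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def kappa_from_resultant(r_bar: float) -> float:
    """Approximate concentration from mean resultant length (capped for stability)."""
    r = min(r_bar, 0.995)
    return r * (2.0 - r * r) / (1.0 - r * r)


def fit_von_mises_mixture(angles: Sequence[float], init: Sequence[Component],
                          iterations: int = 60) -> List[Component]:
    """Run EM; components keep the order of init (here: noise first, target second)."""
    params = list(init)
    n = len(angles)
    for _ in range(iterations):
        norms = [2.0 * math.pi * bessel_i0(k) for _, _, k in params]
        sums = [[0.0, 0.0, 0.0] for _ in params]  # per component: weight, cos sum, sin sum
        for th in angles:
            # E-step: responsibility of each component for this angle.
            w = [c * math.exp(k * math.cos(th - mu)) / z for (c, mu, k), z in zip(params, norms)]
            total = sum(w)
            if total <= 0.0:
                continue
            for j, wj in enumerate(w):
                r = wj / total
                sums[j][0] += r
                sums[j][1] += r * math.cos(th)
                sums[j][2] += r * math.sin(th)
        # M-step: weighted circular mean, concentration, and mixture coefficient.
        params = [
            (nj / n, math.atan2(sn, cs), kappa_from_resultant(math.hypot(cs, sn) / max(nj, 1e-12)))
            for nj, cs, sn in sums
        ]
    return params


def synthetic_angles(n_target: int, n_noise: int, seed: int = 0) -> List[float]:
    """Demonstration data: a narrow target near 0 rad and a broad noise source near 1.5 rad."""
    rng = random.Random(seed)
    data = [rng.vonmisesvariate(0.0, 20.0) for _ in range(n_target)]
    data += [rng.vonmisesvariate(1.5, 2.0) for _ in range(n_noise)]
    return data
```

A call such as `fit_von_mises_mixture(synthetic_angles(900, 600), [(0.5, 1.0, 1.0), (0.5, 0.0, 5.0)])` returns the noise and target components in that order. As with any EM fit, the result depends on the initial guess, so the starting means should be placed near the expected target and noise directions.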

22. A method of providing a target audio signal through audio source separation in a two-microphone system, the method comprising:
- receiving audio sample pairs, with an audio sample pair comprising a first sample of audio information captured by a first microphone during a sample pair time interval and a second sample of audio information captured by a second microphone during the sample pair time interval;
  
  for a plurality of audio sample pairs:
  
  estimating, for the corresponding sample pair time interval, an angle between a first reference line extending from an audio source to the two-microphone system and a second reference line extending through the two-microphone system, the estimated angle being based on a phase difference between the first and second samples of audio information;
  
  modeling the estimated angle as a combined statistical distribution, the combined statistical distribution being a mixture of a target audio signal von Mises distribution and a noise component von Mises distribution; and
  
  performing hypothesis testing statistical analysis on the combined statistical distribution to determine whether the audio sample pair is part of the target audio signal or the noise component; and
  
  resynthesizing a target audio signal from the audio sample pairs determined to be part of the target audio signal.
- - 23. The method of claim 22, wherein the hypothesis testing is one of maximum a posteriori (MAP) hypothesis testing or maximum likelihood testing.
  - 24. The method of claim 22, wherein parameters for the combined statistical distribution are obtained using an expectation maximization (EM) algorithm.
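The resynthesis step of claim 22 can be pictured as binary masking followed by an inverse transform. The sketch below assumes that each audio sample pair corresponds to one frequency-domain coefficient per microphone within a frame, that the two microphones' coefficients are averaged before inversion, and that a plain (unwindowed) DFT is used; these choices and the function names are assumptions made for illustration and are not taken from the claims.

```python
import cmath
import math
from typing import List, Sequence


def dft(frame: Sequence[float]) -> List[complex]:
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum: Sequence[complex]) -> List[float]:
    """Naive inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def resynthesize_frame(spec_mic1: Sequence[complex], spec_mic2: Sequence[complex],
                       target_mask: Sequence[bool]) -> List[float]:
    """Keep only coefficients classified as target, then return to the time domain."""
    kept = [0.5 * (a + b) if keep else 0j
            for a, b, keep in zip(spec_mic1, spec_mic2, target_mask)]
    return idft(kept)
```

For a 16-sample frame containing two tones at bins 2 and 5, a mask that is true only at bins 2 and 14 returns the first tone alone. The mask should be symmetric about the frame midpoint so that the retained spectrum still corresponds to a real signal. A practical system would more likely use overlapping windowed frames, which this sketch omits.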

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Kim, Chanwoo; Khawand, Charbel
Primary Examiner(s): Kim, Paul
Application Number: US13/569,092
Publication Number: US 20140044279A1
Time in Patent Office: 1,127 Days
Field of Search: 381/92, 381/122, 381/94.1, 381/91, 381/56, 381/82, 381/58
US Class Current: 1/1

CPC Class Codes

G10L 21/0272   Voice signal separating
H04R 2227/003   Digital PA systems using, e...
H04R 2227/009   Signal processing in [PA] s...
H04R 27/00   Public address systems circ...
H04R 3/005   for combining the signals o...
