Real-time audio source separation by delay and attenuation compensation in the time domain

US 7,088,831 B2
Filed: 12/06/2001
Issued: 08/08/2006
Est. Priority Date: 12/06/2001
Status: Expired due to Fees

First Claim

1. A method for separating at least two audio channels recorded using an array of at least two microphones comprising the steps of:

equalizing variances of a first channel and a second channel on a current data frame;

recursively expressing means and variances of mixtures;

normalizing the second channel to a variance level substantially similar to a variance of the first channel; and

determining delay parameters by minimizing a cross-covariance between two outputs.

2 Assignments

0 Petitions

Accused Products

Abstract

A system is provided for separating two audio channels recorded by an array of microphones. The system includes a calibration module for normalizing gain levels between a plurality of channels on each of a plurality of data frames, wherein each data frame is expressed in terms of time. The system further includes a delay parameter estimation module for accepting an output comprising the normalized channels, estimating a delay parameter for a plurality of data frame sizes over a plurality of lag times, and sorting delays to generate corresponding source separated outputs.

Citations

15 Claims

1. A method for separating at least two audio channels recorded using an array of at least two microphones comprising the steps of:
- equalizing variances of a first channel and a second channel on a current data frame;
  
  recursively expressing means and variances of mixtures;
  
  normalizing the second channel to a variance level substantially similar to a variance of the first channel; and
  
  determining delay parameters by minimizing a cross-covariance between two outputs.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
- - 2. The method of claim 1, wherein on a current block of m data samples $x_{j}(t)$, $1 \le t \le m$, $1 \le j \le 2$, and index k, a current block mean ${\overline{x}}_{j}$ can be determined according to:
    - ${\overline{x}}_{j} = \frac{1}{m} \sum_{t = 1}^{m} x_{j} (t)$
  - 3. The method of claim 1, wherein a running mean ${\overline{x}}_{j}^{(k-1)}$ can be updated by:
    - ${\overline{x}}_{j}^{(k)} = (1 - \beta)\,{\overline{x}}_{j}^{(k-1)} + \beta\,{\overline{x}}_{j}$, where $\beta$ is a learning rate.
  - 4. The method of claim 1, wherein a current block variance ${Var}_{j}$ is determined according to:
    - ${Var}_{j} = \frac{1}{m} \sum_{t = 1}^{m} {\langle x_{j} (t) - {\overline{x}}_{j}^{(k)} \rangle}^{2}$
  - 5. The method of claim 1, wherein a running variance $v_{j}^{(k-1)}$ is updated by:
    - $v_{j}^{(k)} = (1 - \beta)\,v_{j}^{(k-1)} + \beta\,{Var}_{j}$
  - 6. The method of claim 1, wherein the step of normalizing the second channel further comprises normalizing an average energy to be similar to an average energy of the first channel according to:
    - ${\hat{x}}_{2} = \sqrt{\frac{ν_{1}^{(k)}}{ν_{2}^{(k)}}} x_{2}$
  - 7. The method of claim 1, wherein the cross-covariance between the outputs is expanded as:
    - $R_{y_1 y_2}(\tau) = R_{x_1 x_1}(d_1 - d_2 + \tau) - R_{x_1 x_2}(d_2 - \tau) - R_{x_1 x_2}(d_1 + \tau) + R_{x_2 x_2}(\tau)$, where $R_{x_i x_j}$ is the cross-correlation between $x_i$ and $x_j$, $1 \le i, j \le 2$.
  - 8. The method of claim 1, further comprising the step of determining sub-unit-delayed versions of cross-correlations, wherein the delay parameters are determined for a number of lags L.
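The recursive calibration recited in claims 2 through 6 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the names `RunningStats` and `calibrate`, the block size `m=256`, the learning rate `beta=0.1`, and the choice to initialise the running statistics from the first block are all hypothetical and do not come from the claims.

```python
import numpy as np


class RunningStats:
    """Running mean and variance of one channel, updated once per block."""

    def __init__(self, beta):
        self.beta = beta   # learning rate (beta in claims 3 and 5)
        self.mean = None   # running mean
        self.var = None    # running variance

    def update(self, block):
        block_mean = float(np.mean(block))                      # claim 2
        if self.mean is None:
            self.mean = block_mean                              # assumed initialisation
        else:                                                   # claim 3
            self.mean = (1 - self.beta) * self.mean + self.beta * block_mean
        # Claim 4: block variance measured around the updated running mean.
        block_var = float(np.mean((block - self.mean) ** 2))
        if self.var is None:
            self.var = block_var                                # assumed initialisation
        else:                                                   # claim 5
            self.var = (1 - self.beta) * self.var + self.beta * block_var


def calibrate(x1, x2, m=256, beta=0.1):
    """Scale channel 2 block by block so its running variance matches channel 1."""
    s1, s2 = RunningStats(beta), RunningStats(beta)
    n_blocks = len(x1) // m
    out = np.empty(n_blocks * m)
    for k in range(n_blocks):
        sl = slice(k * m, (k + 1) * m)
        s1.update(x1[sl])
        s2.update(x2[sl])
        # Claim 6: gain sqrt(v1 / v2); a silent channel 2 (v2 == 0) is not handled here.
        out[sl] = np.sqrt(s1.var / s2.var) * x2[sl]
    return out
```

For example, if `x2` is simply `0.5 * x1`, the estimated gain is 2 on every block and `calibrate(x1, x2)` returns a signal matching `x1`. Trailing samples that do not fill a whole block are dropped in this sketch.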

9. A method for separating at least two audio channels recorded using an array of at least two microphones comprising the steps of:
- constraining a mixing model of the at least two audio channels in a time domain to direct path signal components;
  
  defining a plurality of delays with respect to a midpoint between microphones, wherein delays depend on the distance between sensors and the speed of sound;
  
  inverting a mixing matrix, corresponding to the mixing model, in the frequency domain; and
  
  compensating for a plurality of true fractional delays and attenuations in the time domain, wherein values of the delays and attenuations are determined from an output decorrelation constraint.
- View Dependent Claims (10, 11, 12, 13, 14)
- - 10. The method of claim 9, further comprising the step of estimating a complex filter for each microphone, wherein the complex filters define the mixing model.
  - 11. The method of claim 9, wherein the mixing matrix corresponding to the mixing model comprises two delay parameters and two parameters corresponding to the speed of sound.
  - 12. The method of claim 9, wherein the output decorrelation constraint is a function of two unknown delays and unknown scalar coefficients.
  - 13. The method of claim 12, wherein the unknown scalar coefficients are attenuation coefficients substantially equal to one.
  - 14. The method of claim 9, further comprising the step of imposing a minimum variance criterion for a reverberant case over all linear filtering combinations of X₁ and X₂.
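Claim 9 refers to delays measured from the midpoint between the microphones and to fractional delays compensated in the time domain. The Python sketch below shows one common way such quantities can be computed; it is an assumption-laden illustration, not the method of the patent. The function names, the speed of sound of 343 m/s, and the Hamming-windowed sinc interpolator with `half_width=16` are all hypothetical choices.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at room temperature (assumed value)


def max_delay_samples(mic_distance_m, sample_rate_hz, c=SPEED_OF_SOUND):
    """Largest delay, in samples, relative to the array midpoint.

    A far-field source on the microphone axis reaches each microphone
    half the inter-microphone travel time before or after the midpoint.
    """
    return 0.5 * mic_distance_m / c * sample_rate_hz


def fractional_delay(x, delay, half_width=16):
    """Delay x by a possibly non-integer number of samples.

    Uses a Hamming-windowed sinc FIR filter, intended for |delay| <= 1;
    samples within half_width of either end suffer edge effects.
    """
    n = np.arange(-half_width, half_width + 1)
    h = np.sinc(n - delay) * np.hamming(2 * half_width + 1)
    h /= h.sum()  # unit gain at DC
    return np.convolve(x, h, mode="same")
```

With microphones 10 cm apart and a 16 kHz sample rate, `max_delay_samples(0.1, 16000)` is about 2.33 samples, which suggests why integer-sample shifts alone would be too coarse for an array of that size.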

15. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for separating at least two audio channels recorded using an array of at least two microphones, the method steps comprising:
- equalizing variances of a first channel and a second channel on a current data frame;
  
  recursively expressing means and variances of mixtures;
  
  normalizing the second channel to a variance level substantially similar to a variance of the first channel; and
  
  determining delay parameters by minimizing a cross-covariance between two outputs.
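The final step, determining delay parameters by minimizing a cross-covariance between two outputs, can be sketched as a grid search. The claims do not spell out the demixing form, so the Python example below assumes a simplified two-source anechoic model with unit attenuations, integer delays and circular shifts: `x1 = s1 + s2` and `x2 = s1` delayed by δ1 plus `s2` delayed by δ2. Under that assumption the outputs `x2(t) - x1(t - a)` and `x2(t) - x1(t - b)` each depend on a single source when `(a, b)` equals `(δ2, δ1)` or its permutation. All function names and the squared-sum cost are hypothetical.

```python
import numpy as np


def delay(x, d):
    """Circularly delay x by an integer number of samples d (a simplification)."""
    return np.roll(x, d)


def cross_cov(y1, y2, max_lag):
    """Sample cross-covariance of y1(t) and y2(t + tau), tau in [-max_lag, max_lag]."""
    y1 = y1 - y1.mean()
    y2 = y2 - y2.mean()
    n = len(y1)
    return np.array([np.dot(y1, np.roll(y2, -tau)) / n
                     for tau in range(-max_lag, max_lag + 1)])


def estimate_delays(x1, x2, lags, max_lag=2):
    """Grid search for the delay pair (a, b) that best decorrelates the outputs."""
    best, best_cost = None, np.inf
    for a in lags:
        for b in lags:
            if a >= b:  # skip the degenerate a == b case and permuted duplicates
                continue
            y1 = x2 - delay(x1, a)   # assumed demixing form, see lead-in
            y2 = x2 - delay(x1, b)
            cost = float(np.sum(cross_cov(y1, y2, max_lag) ** 2))
            if cost < best_cost:
                best, best_cost = (a, b), cost
    return best, best_cost
```

A usage example with synthetic white-noise sources, where the true delays are 3 and -2 samples:

```python
rng = np.random.default_rng(0)
s1, s2 = rng.normal(size=(2, 20000))
x1 = s1 + s2
x2 = delay(s1, 3) + delay(s2, -2)
best, cost = estimate_delays(x1, x2, range(-4, 5))
```

Here the search is expected to return the pair containing both true delays. Real recordings would need the fractional (sub-sample) delays mentioned in claims 8 and 9 rather than this integer grid.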

Current Assignee
Siemens Corp. (Siemens AG)
Original Assignee
Siemens Corporate Research Incorporated (Siemens AG)
Inventors
Rosca, Justinian; Balan, Radu Victor; Fan, Ning Ping
Primary Examiner(s)
Pendleton, Brian T.

Application Number

US10/010,255
Publication Number

US 20030112983A1
Time in Patent Office

1,706 Days
Field of Search

381/66, 381/92, 702/190
US Class Current

381/92
CPC Class Codes

G06F 18/2134 based on separation criteri...

H04R 3/005 for combining the signals o...
