Multi-microphone audio source separation based on combined statistical angle distributions
First Claim
1. One or more computer-readable memory or storage devices storing instructions that, when executed by a computing device having a processor, perform a method of separating audio sources in a multi-microphone system, the method comprising:
receiving audio sample groups, with an audio sample group comprising at least two samples of audio information, the at least two samples captured by different microphones during a sample group time interval; and
for a plurality of audio sample groups:
estimating, for the corresponding sample group time interval, an angle between a first reference line extending from an audio source to the multi-microphone system and a second reference line extending through the multi-microphone system, the estimated angle being based on a phase difference between the at least two samples in the audio sample group;
modeling the estimated angle as a combined statistical distribution, the combined statistical distribution being a mixture of a target audio signal statistical distribution and a noise component statistical distribution; and
determining whether the audio sample group is part of a target audio signal or a noise component based at least in part on the combined statistical distribution.
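The angle-estimation step can be illustrated with a minimal sketch. It assumes, beyond what the claim states, that each sample is a complex frequency-domain coefficient at a known frequency, that the source is in the far field, and that the angle is measured from the line through the two microphones, so the phase difference equals 2*pi*f*d*cos(theta)/c. The function name, the speed-of-sound constant, and the microphone spacing are illustrative, not taken from the patent.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature (assumed value)


def estimate_angle(x1: complex, x2: complex, freq_hz: float, mic_spacing_m: float) -> float:
    """Return the angle (radians) between the source direction and the
    line through the two microphones, under a far-field assumption."""
    # Phase difference between the two samples, wrapped to (-pi, pi].
    phase_diff = cmath.phase(x1 * x2.conjugate())
    # Assumed model: phase_diff = 2*pi*f*d*cos(theta) / c, solved for cos(theta).
    cos_theta = SPEED_OF_SOUND * phase_diff / (2.0 * math.pi * freq_hz * mic_spacing_m)
    # Noise can push the ratio slightly outside [-1, 1]; clamp before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


# Usage: simulate a source at 60 degrees, 1 kHz, microphones 4 cm apart.
true_theta = math.radians(60.0)
delay_s = 0.04 * math.cos(true_theta) / SPEED_OF_SOUND
mic1 = 1.0 + 0.0j
mic2 = mic1 * cmath.exp(-2j * math.pi * 1000.0 * delay_s)  # mic 2 hears it later
recovered = estimate_angle(mic1, mic2, 1000.0, 0.04)
```

Under this model the estimate is unambiguous only while the true phase difference stays within (-pi, pi], which bounds the usable frequency for a given spacing.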
Abstract
Systems, methods, and computer media for separating audio sources in a multi-microphone system are provided. A plurality of audio sample groups can be received. Each audio sample group comprises at least two samples of audio information captured by different microphones during a sample group time interval. For each audio sample group, the angle between an audio source and the multi-microphone system can be estimated based on a phase difference of the samples in the group. The estimated angle can be modeled as a combined statistical distribution that is a mixture of a target audio signal statistical distribution and a noise component statistical distribution. The combined statistical distribution can be analyzed to provide an accurate characterization of each sample group as either target audio signal or noise. The target audio signal can then be resynthesized from samples identified as part of the target audio signal.
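The resynthesis step mentioned at the end of the abstract can be sketched as follows. This is one possible reading, not the patent's stated procedure: it assumes the samples are frequency bins of a single frame, that bins classified as noise are simply zeroed (a binary mask), and that the frame is returned to the time domain with a plain inverse DFT. The function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def apply_binary_mask(bins: Sequence[complex], is_target: Sequence[bool]) -> List[complex]:
    """Keep bins classified as target audio and zero out the rest."""
    if len(bins) != len(is_target):
        raise ValueError("bins and is_target must have the same length")
    return [b if keep else 0j for b, keep in zip(bins, is_target)]


def inverse_dft(spectrum: Sequence[complex]) -> List[float]:
    """Naive inverse DFT, returning the real part of each time sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# Usage: only the first bin of a four-bin frame was classified as target.
frame = [4 + 0j, 2 + 0j, 0j, 2 + 0j]
decisions = [True, False, False, False]
masked = apply_binary_mask(frame, decisions)
resynthesized = inverse_dft(masked)
```

A practical system would typically work on overlapping windowed frames and recombine them by overlap-add; that bookkeeping is omitted here.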
24 Claims
1. Independent claim, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A multi-microphone mobile device having audio source-separation capabilities, the mobile device comprising:
a first microphone;
a second microphone;
a processor;
an angle estimator configured to, by the processor, for a sample pair time interval, estimate an angle between a first reference line extending from an audio source to the mobile device and a second reference line extending through the mobile device, the estimated angle being based on a phase difference between a first sample and a second sample in an audio sample pair captured during the sample pair time interval, wherein the first sample is captured by the first microphone and the second sample is captured by the second microphone;
a combined statistical modeler configured to model the estimated angle as a combined statistical distribution, the combined statistical distribution being a mixture of a target audio signal statistical distribution and a noise component statistical distribution; and
a sample classifier configured to determine whether the audio sample pair is part of a target audio signal or a noise component based at least in part on the combined statistical distribution.
Dependent claims: 16, 17, 18, 19, 20, 21
22. A method of providing a target audio signal through audio source separation in a two-microphone system, the method comprising:
receiving audio sample pairs, with an audio sample pair comprising a first sample of audio information captured by a first microphone during a sample pair time interval and a second sample of audio information captured by a second microphone during the sample pair time interval;
for a plurality of audio sample pairs:
estimating, for the corresponding sample pair time interval, an angle between a first reference line extending from an audio source to the two-microphone system and a second reference line extending through the two-microphone system, the estimated angle being based on a phase difference between the first and second samples of audio information;
modeling the estimated angle as a combined statistical distribution, the combined statistical distribution being a mixture of a target audio signal von Mises distribution and a noise component von Mises distribution; and
performing hypothesis testing statistical analysis on the combined statistical distribution to determine whether the audio sample pair is part of the target audio signal or the noise component; and
resynthesizing a target audio signal from the audio sample pairs determined to be part of the target audio signal.
Dependent claims: 23, 24
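The mixture model and hypothesis test recited in claim 22 can be illustrated with a small sketch. The von Mises density is standard: f(theta; mu, kappa) = exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa)), where I0 is the modified Bessel function of the first kind of order zero. Everything else here is an assumption for illustration: the claim does not specify the form of the test in this excerpt, so a simple maximum a posteriori comparison of the two mixture components is used, and the mixture weight, means, and concentrations are made-up values.

```python
import math


def bessel_i0(x: float) -> float:
    """Modified Bessel function I0 via its power series: sum((x/2)^(2k) / (k!)^2)."""
    total = term = 1.0
    k = 1
    while term > 1e-15 * total:
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
        k += 1
    return total


def von_mises_pdf(theta: float, mu: float, kappa: float) -> float:
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta, weight, mu_t, kappa_t, mu_n, kappa_n) -> float:
    """Combined distribution: weight * target + (1 - weight) * noise."""
    return (weight * von_mises_pdf(theta, mu_t, kappa_t)
            + (1.0 - weight) * von_mises_pdf(theta, mu_n, kappa_n))


def is_target(theta, weight, mu_t, kappa_t, mu_n, kappa_n) -> bool:
    """Assumed test: pick the mixture component with the larger posterior."""
    target_score = weight * von_mises_pdf(theta, mu_t, kappa_t)
    noise_score = (1.0 - weight) * von_mises_pdf(theta, mu_n, kappa_n)
    return target_score > noise_score


# Usage with hypothetical parameters: a sharply concentrated target near
# pi/2 and a broad noise component centred at 0, equally weighted.
params = dict(weight=0.5, mu_t=math.pi / 2, kappa_t=8.0, mu_n=0.0, kappa_n=0.5)
near_target = is_target(math.pi / 2, **params)
far_from_target = is_target(0.0, **params)
```

In practice the mixture parameters would have to be fitted to the observed angles, for example by expectation-maximization; the fitting step is outside this sketch.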
Specification