APPARATUS AND METHOD FOR PROVIDING AN INFORMED MULTICHANNEL SPEECH PRESENCE PROBABILITY ESTIMATION

US 20150310857A1
Filed: 03/03/2015
Published: 10/29/2015
Est. Priority Date: 09/03/2012
Status: Active Grant

Abstract

An apparatus for providing a speech probability estimation is provided. The apparatus includes a first speech probability estimator for estimating speech probability information indicating a first probability on whether a sound field of a scene includes speech or on whether the sound field of the scene does not include speech. Moreover, the apparatus includes an output interface for outputting the speech probability estimation depending on the speech probability information. The first speech probability estimator is configured to estimate the first speech probability information based on at least spatial information about the sound field or spatial information on the scene.

20 Claims

1. An apparatus for providing a speech probability estimation, comprising:
- a first speech probability estimator for estimating speech probability information indicating a first probability on whether a sound field of a scene comprises speech or on whether the sound field of the scene does not comprise speech, and
- an output interface for outputting the speech probability estimation depending on the speech probability information,
- wherein the first speech probability estimator is configured to estimate the first speech probability information based on at least spatial information about the sound field or spatial information on the scene.

2. An apparatus according to claim 1, wherein the apparatus furthermore comprises a second speech probability estimator for estimating the speech probability estimation indicating a second probability on whether the sound field comprises speech or on whether the sound field does not comprise speech, wherein the second speech probability estimator is configured to estimate the speech probability estimation based on the speech probability information estimated by the first speech probability estimator, and based on one or more acoustic sensor signals, which depend on the sound field.
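
The two-stage structure of claim 2 can be illustrated with a minimal sketch. It assumes one well-known form of multichannel posterior speech presence probability; the claim itself does not state a formula, so the expression, the function name `posterior_spp`, and the inputs `xi` and `beta` are illustrative assumptions. Here `q` plays the role of the first estimator's output, while `xi` and `beta` are assumed to have been computed from the acoustic sensor signals elsewhere.

```python
import math


def posterior_spp(q, xi, beta):
    """Combine an a priori speech probability with signal-derived terms.

    q    : a priori speech presence probability, 0 < q < 1
           (assumed to come from spatial information)
    xi   : a priori SNR term, xi >= 0 (assumed, estimated elsewhere)
    beta : data-dependent term from the sensor signals, beta >= 0 (assumed)
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    prior_odds_against = (1.0 - q) / q
    likelihood_term = (1.0 + xi) * math.exp(-beta / (1.0 + xi))
    return 1.0 / (1.0 + prior_odds_against * likelihood_term)


if __name__ == "__main__":
    # A higher a priori probability yields a higher posterior for the same data.
    print(posterior_spp(0.2, xi=2.0, beta=5.0))
    print(posterior_spp(0.8, xi=2.0, beta=5.0))
```

With `xi = 0` and `beta = 0` the signals carry no evidence and the posterior equals the a priori value `q`.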

3. An apparatus according to claim 1, wherein the first speech probability estimator is configured to estimate the speech probability information based on directionality information, wherein the directionality information indicates how directional sound of the sound field is, wherein the first speech probability estimator is configured to estimate the speech probability information based on location information, wherein the location information indicates at least one location of a sound source of the scene, or wherein the first speech probability estimator is configured to estimate the speech probability information based on proximity information, wherein the proximity information indicates at least one proximity of at least one possible sound object to at least one proximity sensor.

4. An apparatus according to claim 1, wherein the first speech probability estimator is configured to estimate the speech probability estimation by determining a direct-to-diffuse ratio estimation of a direct-to-diffuse ratio as the spatial information, the direct-to-diffuse ratio indicating a ratio of direct sound comprised by the acoustic sensor signals to diffuse sound comprised by the acoustic sensor signals.

5. An apparatus according to claim 4, wherein the first speech probability estimator is configured to determine the direct-to-diffuse ratio estimation by determining a coherence estimation of a complex coherence between a first acoustic signal of the acoustic sensor signals, the first acoustic signal being recorded by a first acoustic sensor p, and a second acoustic signal of the acoustic sensor signals, the second acoustic signal being recorded by a second acoustic sensor q, and wherein the first speech probability estimator is moreover configured to determine the direct-to-diffuse ratio based on a phase shift estimation of a phase shift of the direct sound between the first acoustic signal and the second acoustic signal.
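
As a rough illustration of how a coherence estimate and a phase shift estimate can yield a direct-to-diffuse ratio, the sketch below assumes a simple mixed-field model: the coherence between sensors p and q is a weighted sum of a unit-magnitude direct-sound term and a known diffuse-field coherence. The model, the function names, and the parameter `diffuse_coherence` are assumptions made for this example and are not taken from the claim text. Under that model the ratio follows by solving the model equation for the weight.

```python
import cmath


def mixed_field_coherence(ddr, phase_shift, diffuse_coherence):
    """Assumed model: coherence of direct plus diffuse sound at two sensors."""
    direct = cmath.exp(1j * phase_shift)
    return (ddr * direct + diffuse_coherence) / (ddr + 1.0)


def estimate_ddr(coherence, phase_shift, diffuse_coherence):
    """Invert the assumed model to recover the direct-to-diffuse ratio."""
    direct = cmath.exp(1j * phase_shift)
    denominator = coherence - direct
    if abs(denominator) < 1e-12:
        # Coherence equals the direct-sound term: purely direct sound.
        return float("inf")
    # The real part discards the imaginary residue caused by estimation errors.
    return ((diffuse_coherence - coherence) / denominator).real


if __name__ == "__main__":
    gamma_pq = mixed_field_coherence(3.0, phase_shift=0.7, diffuse_coherence=0.2)
    print(estimate_ddr(gamma_pq, phase_shift=0.7, diffuse_coherence=0.2))
```

Feeding the model's own output back into `estimate_ddr` recovers the ratio that generated it, which is a quick consistency check for the inversion.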

6. An apparatus according to claim 5, wherein the first speech probability estimator is configured to determine the direct-to-diffuse ratio estimation $\hat{\Gamma}(k, n)$ between the first acoustic signal and the second acoustic signal by applying the formula;

7. An apparatus according to claim 4, wherein the first speech probability estimator is configured to estimate the speech probability information by determining $f[\hat{\Gamma}(k, n)]$, wherein $\hat{\Gamma}(k, n)$ is the direct-to-diffuse ratio estimation, and wherein $f[\hat{\Gamma}(k, n)]$ is a mapping function representing a mapping of the direct-to-diffuse ratio estimation to a value between 0 and 1.
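
A mapping of this kind can be sketched with a logistic curve over the ratio in decibels. This is only one possible choice: the function name `ddr_to_probability` and the parameters `midpoint_db`, `slope`, `p_min`, and `p_max` are hypothetical, and the mapping is taken here as increasing in the ratio so that the result reads as a presence probability.

```python
import math


def ddr_to_probability(ddr, midpoint_db=0.0, slope=0.5, p_min=0.05, p_max=0.95):
    """Map a direct-to-diffuse ratio (linear scale) to a value in [p_min, p_max].

    All parameter names and defaults are illustrative assumptions.
    """
    # Clamp to avoid log10(0) for a fully diffuse estimate.
    ddr_db = 10.0 * math.log10(max(ddr, 1e-12))
    sigmoid = 1.0 / (1.0 + math.exp(-slope * (ddr_db - midpoint_db)))
    return p_min + (p_max - p_min) * sigmoid


if __name__ == "__main__":
    for ratio in (0.01, 1.0, 100.0):
        print(ratio, ddr_to_probability(ratio))
```

The bounds `p_min` and `p_max` keep the output away from exactly 0 and 1, which can be convenient if the value is later used as a prior.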

8. An apparatus according to claim 7, wherein the mapping function $f[\hat{\Gamma}(k, n)]$ is defined by the formula;

9. An apparatus according to claim 1, wherein the first speech probability estimator is configured to determine a location parameter P_b based on a probability distribution of an estimated location of a sound source and based on an area of interest to acquire the speech probability information.
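
One way to read a location parameter of this kind is as the probability mass of the location estimate that falls inside the area of interest. The sketch below makes several assumptions that are not in the claim: the location estimate is modelled as a two-dimensional Gaussian with independent axes, the area of interest is an axis-aligned rectangle, and the names `location_parameter` and `gaussian_cdf` are hypothetical.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a one-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def location_parameter(mean_xy, std_xy, area):
    """Probability that the estimated source location lies inside the area.

    mean_xy : (x, y) mean of the location estimate
    std_xy  : (x, y) standard deviations, axes assumed independent
    area    : (x_min, x_max, y_min, y_max) rectangle of interest
    """
    (mean_x, mean_y), (std_x, std_y) = mean_xy, std_xy
    x_min, x_max, y_min, y_max = area
    mass_x = gaussian_cdf(x_max, mean_x, std_x) - gaussian_cdf(x_min, mean_x, std_x)
    mass_y = gaussian_cdf(y_max, mean_y, std_y) - gaussian_cdf(y_min, mean_y, std_y)
    return mass_x * mass_y


if __name__ == "__main__":
    # Source estimated at the origin, area of interest covering x >= 0.
    print(location_parameter((0.0, 0.0), (1.0, 1.0), (0.0, 100.0, -100.0, 100.0)))
```

A tight estimate centred inside the area gives a value near 1, and an estimate far outside gives a value near 0.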

10. An apparatus according to claim 9, wherein the first speech probability estimator is configured to determine the location parameter P_b by employing the formula

11. An apparatus according to claim 4, wherein the first speech probability estimator is configured to determine the a priori speech presence probability q(k, n) as the speech probability information by applying the formula:

12. An apparatus according to claim 1, wherein the first speech probability estimator is configured to determine a proximity parameter as the spatial information, wherein the proximity parameter comprises a first parameter value, when the first speech probability estimator detects one or more possible sound sources within a predefined distance from a proximity sensor, and wherein the proximity parameter comprises a second parameter value, being smaller than the first parameter value, when the first speech probability estimator does not detect possible sound sources in the direct proximity of the proximity sensor, and wherein the first speech probability estimator is configured to determine a first speech probability value as the speech probability information when the proximity parameter comprises the first parameter value, and wherein the first speech probability estimator is configured to determine a second speech probability value as the speech probability information when the proximity parameter comprises the second parameter value, the first speech probability value indicating a first probability that the sound field comprises speech, wherein the first probability is greater than a second probability that the sound field comprises speech, the second probability being indicated by the second speech probability value.
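
The two-valued logic of claim 12 can be sketched as follows. The numeric values, the distance threshold, and the function names are hypothetical placeholders; the claim only requires that the first parameter value exceed the second and that the first probability exceed the second.

```python
def proximity_parameter(distances_m, max_distance_m=0.5, near_value=1.0, far_value=0.0):
    """Return the larger value if any detected object is within the threshold."""
    detected = any(distance <= max_distance_m for distance in distances_m)
    return near_value if detected else far_value


def speech_probability_from_proximity(parameter, near_value=1.0, p_near=0.7, p_far=0.2):
    """Map the proximity parameter to one of two speech probability values."""
    return p_near if parameter == near_value else p_far


if __name__ == "__main__":
    near = proximity_parameter([0.3, 2.0])
    far = proximity_parameter([1.5, 2.0])
    print(speech_probability_from_proximity(near))
    print(speech_probability_from_proximity(far))
```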

13. An apparatus for determining a noise power spectral density estimation, comprising:
- a first speech probability estimator for estimating speech probability information indicating a first probability on whether a sound field of a scene comprises speech or on whether the sound field of the scene does not comprise speech, and
- an output interface for outputting the speech probability estimation depending on the speech probability information,
- wherein the first speech probability estimator is configured to estimate the first speech probability information based on at least spatial information about the sound field or spatial information on the scene, and
- a noise power spectral density estimation unit,
- wherein the apparatus is configured to provide the speech probability estimation to the noise power spectral density estimation unit, and
- wherein the noise power spectral density estimation unit is configured to determine the noise power spectral density estimation based on the speech probability estimation and a plurality of input audio channels.

14. An apparatus according to claim 13, wherein the apparatus is configured to compute one or more spatial parameters, the one or more spatial parameters indicating spatial information about the sound field, wherein the apparatus is configured to compute the speech probability estimation by employing the one or more spatial parameters, and wherein the noise power spectral density estimation unit is configured to determine the noise power spectral density estimation by updating a previous noise power spectral density matrix depending on the speech probability estimation to acquire an updated noise power spectral density matrix as the noise power spectral density estimation.
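
A common way to update a noise power spectral density matrix depending on a speech probability is recursive averaging with a probability-dependent smoothing factor. The sketch below assumes that scheme; the claim does not specify the update rule, and the function name `update_noise_psd` and the parameter `alpha` are illustrative. When the speech probability is 1 the previous matrix is kept, and when it is 0 the usual recursive average with the current observation is applied.

```python
def update_noise_psd(prev_psd, y, speech_prob, alpha=0.9):
    """Update an M x M noise PSD matrix for one time-frequency bin.

    prev_psd    : previous matrix as a list of M lists of complex numbers
    y           : current observation, list of M complex sensor values
    speech_prob : speech probability estimation for this bin, in [0, 1]
    alpha       : base smoothing factor (assumed value)
    """
    # High speech probability pushes the effective factor towards 1,
    # which freezes the noise estimate while speech is likely present.
    alpha_eff = alpha + (1.0 - alpha) * speech_prob
    size = len(y)
    return [
        [
            alpha_eff * prev_psd[i][j] + (1.0 - alpha_eff) * y[i] * y[j].conjugate()
            for j in range(size)
        ]
        for i in range(size)
    ]


if __name__ == "__main__":
    previous = [[1 + 0j, 0j], [0j, 1 + 0j]]
    observation = [1 + 1j, 2 - 1j]
    print(update_noise_psd(previous, observation, speech_prob=0.0))
    print(update_noise_psd(previous, observation, speech_prob=1.0))
```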

15. An apparatus for estimating a steering vector, comprising:
- a first speech probability estimator for estimating speech probability information indicating a first probability on whether a sound field of a scene comprises speech or on whether the sound field of the scene does not comprise speech, and
- an output interface for outputting the speech probability estimation depending on the speech probability information,
- wherein the first speech probability estimator is configured to estimate the first speech probability information based on at least spatial information about the sound field or spatial information on the scene, and
- a steering vector estimation unit,
- wherein the apparatus is configured to provide the speech probability estimation to the steering vector estimation unit, and
- wherein the steering vector estimation unit is configured to estimate the steering vector based on the speech probability estimation and a plurality of input audio channels.

16. An apparatus for multichannel noise reduction, comprising:
- a first speech probability estimator for estimating speech probability information indicating a first probability on whether a sound field of a scene comprises speech or on whether the sound field of the scene does not comprise speech, and
- an output interface for outputting the speech probability estimation depending on the speech probability information,
- wherein the first speech probability estimator is configured to estimate the first speech probability information based on at least spatial information about the sound field or spatial information on the scene, and
- a filter unit,
- wherein the filter unit is configured to receive a plurality of audio input channels,
- wherein the apparatus is configured to provide the speech probability information to the filter unit, and
- wherein the filter unit is configured to filter the plurality of audio input channels to acquire filtered audio channels based on the speech probability information.

17. An apparatus according to claim 16, wherein the first speech probability estimator is configured to generate a tradeoff parameter, wherein the tradeoff parameter depends on at least one spatial parameter indicating spatial information about the sound field or spatial information on the scene.

18. An apparatus according to claim 17, wherein the filter unit is configured to filter the plurality of audio input channels depending on the tradeoff parameter.
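
To illustrate how a tradeoff parameter can steer a filter, the sketch below uses the scalar form of a parametric Wiener gain, which is a single-channel simplification and not the multichannel filter of claims 16 to 18. The mapping from a direct-to-diffuse ratio to the tradeoff parameter, the names `tradeoff_from_ddr` and `parametric_wiener_gain`, and the constant `beta_max` are all assumptions made for the example.

```python
def tradeoff_from_ddr(ddr, beta_max=5.0):
    """Hypothetical mapping: mostly diffuse sound gives a larger tradeoff value."""
    return beta_max / (1.0 + ddr)


def parametric_wiener_gain(xi, beta):
    """Scalar parametric Wiener gain for an SNR xi and tradeoff parameter beta.

    beta = 0 leaves the signal untouched, beta = 1 gives the classic Wiener
    gain, and larger values trade speech distortion for more noise reduction.
    """
    if xi + beta == 0.0:
        return 0.0
    return xi / (beta + xi)


if __name__ == "__main__":
    for ddr in (0.1, 1.0, 10.0):
        beta = tradeoff_from_ddr(ddr)
        print(ddr, beta, parametric_wiener_gain(2.0, beta))
```

With this choice, bins dominated by direct sound are attenuated less than bins dominated by diffuse sound at the same SNR.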

19. A method for providing a speech probability estimation, comprising:
- estimating speech probability information indicating a first probability on whether a sound field comprises speech or on whether the sound field does not comprise speech, and
- outputting the speech probability estimation depending on the speech probability information,
- wherein estimating the first speech probability information is based on at least spatial information about the sound field or spatial information on the scene.

20. A computer program for implementing the method of claim 19 when being executed on a computer or signal processor.

Current Assignee
Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.
Original Assignee
Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.
Inventors
TASESKA, Maja; HABETS, Emanuel

Granted Patent

US 9,633,651 B2
Field of Search
US Class Current

1/1
CPC Class Codes

G10L 15/14   using statistical models, e...

G10L 2021/02166   Microphone arrays; Beamforming

G10L 21/0208   Noise filtering

G10L 21/0264   characterised by the type o...

G10L 25/78   Detection of presence or ab...
