Adaptive noise suppression for super wideband music

US 10,186,276 B2
Filed: 09/25/2015
Issued: 01/22/2019
Est. Priority Date: 09/25/2015
Status: Active Grant

Abstract

Techniques are described for performing adaptive noise suppression to improve handling of both speech signals and music signals at least up to super wideband (SWB) bandwidths. The techniques include identifying a context or environment in which audio data is captured, and adaptively changing a level of noise suppression applied to the audio data prior to bandwidth compressing (e.g., encoding) based on the context. For a valid speech context, an audio pre-processor may set a first level of noise suppression that is relatively aggressive in order to suppress noise (including music) in the speech signals. For a valid music context, the audio pre-processor may set a second level of noise suppression that is less aggressive in order to leave the music signals undistorted. In this way, a vocoder at a transmitter side wireless communication device may properly encode both speech and music signals with minimal distortions.
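The context-dependent level selection described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `AudioContext`, `MAX_SUPPRESSION_DB`, `noise_gain`, and `suppress`, the attenuation values in dB, and the per-sample noise mask are all assumptions made for the example. The abstract only states that the speech-context level is more aggressive than the music-context level and that suppression runs before encoding.

```python
from enum import Enum


class AudioContext(Enum):
    VALID_SPEECH = "valid_speech"
    VALID_MUSIC = "valid_music"


# Assumed attenuation per context, in dB. The source gives no numbers;
# it only requires the speech level to be more aggressive than the music level.
MAX_SUPPRESSION_DB = {
    AudioContext.VALID_SPEECH: 18.0,
    AudioContext.VALID_MUSIC: 6.0,
}


def noise_gain(context):
    """Convert the context's attenuation in dB to a linear gain below 1."""
    return 10.0 ** (-MAX_SUPPRESSION_DB[context] / 20.0)


def suppress(frame, noise_mask, context):
    """Attenuate samples flagged as noise and pass the others unchanged.

    `noise_mask` is a hypothetical per-sample flag standing in for a real
    noise estimator, which is out of scope for this sketch.
    """
    gain = noise_gain(context)
    return [s * gain if is_noise else s for s, is_noise in zip(frame, noise_mask)]
```

In a pipeline following the abstract, the output of `suppress` would then be handed to the encoder, so the encoder only ever sees the pre-processed frame.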

24 Claims

1. A device configured to provide voice and data communications, the device comprising:
- one or more processors configured to:
  
  classify primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
  
  classify secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
  
  obtain a proximity signal that determines the device's relative position to a user;
  
  obtain an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal, the primary microphone classification, and the secondary microphone classification output by the classifier, prior to application of a variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the primary input audio data and secondary input audio data includes speech signals, music signals, and noise signals and the audio context indicating a valid speech context or a valid music context;
  
  apply, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of the noise suppression unit includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted in the primary input audio data and the secondary input audio data when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data; and
  
  bandwidth compress, with the audio encoder, the noise suppressed version of the primary input audio data and the secondary input audio data to generate at least one audio encoder packet;
  
  a memory, electrically coupled to the one or more processors, configured to store the at least one audio encoder packet; and
  
  a transmitter configured to transmit the at least one audio encoder packet.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 22)
  - 2. The device of claim 1, further comprising the primary microphone and the secondary microphone.
  - 3. The device of claim 1 wherein a first level of attenuation of the primary input audio data and the secondary input audio data when the audio context of the input audio data indicates the valid speech context in a first audio frame is within fifteen percent of a second level of attenuation of the primary input audio data and the secondary audio data when the audio context of the primary input audio data and the secondary input audio data indicates the valid music context during a second audio frame.
  - 4. The device of claim 3, wherein the first audio frame is within fifty audio frames before or after the second audio frame.
  - 5. The device of claim 1, wherein the classifier is configured to provide at least two classification outputs of the primary input audio data and the secondary input audio data, and the at least two classification outputs are the primary microphone classification and the secondary microphone classification.
  - 6. The device of claim 5, wherein the classifier is integrated into the one or more processors.
  - 7. The device of claim 5, where one of the at least two classification outputs is the valid music context, and another one of the at least two classification outputs is a valid speech context.
  - 8. The device of claim 7, wherein the one or more processors configured to apply the noise suppression are further configured to adjust one gain value in a noise suppressor of the device based on the one of the at least two classification outputs being the valid music context.
  - 9. The device of claim 7, wherein the one or more processors configured to apply the variable level of noise suppression are further configured to adjust one gain value in a noise suppressor of the device based on the one of the at least two classification outputs being the valid speech context.
  - 10. The device of claim 1, further comprising a control unit integrated into the one or more processors configured to determine the audio context of the primary input audio data and the secondary input audio data, when the one or more processors are configured to obtain the audio context of the primary input audio data and the secondary input audio data.
  - 11. The device of claim 10, further comprising a proximity sensor configured to output the proximity signal and aid the control unit to determine the audio context of the primary input audio data and the secondary input audio data.
  - 12. The device of claim 1, wherein obtaining of the audio context is further improved based on the control unit receiving input from one or more external sensors in a wearable device, the wearable device in communication with the source device.
  - 13. The device of claim 1, further comprising at least one speaker configured to render an output of an audio decoder configured to decode the at least one audio encoder packet from a destination device.
  - 22. The method of claim 1, further comprising classifying the primary input audio data and the secondary input audio data as music at least eighty percent of the time that music is present with speech.
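Claim 1 has the control unit combine the proximity signal with the two microphone classifications, but it does not state the combination rule. The sketch below shows one possible rule under stated assumptions: the function name `obtain_audio_context`, the string labels, and the decision logic (music only when both microphones agree on music and the device is not held against the user's face, speech otherwise) are hypothetical and are not taken from the claims.

```python
VALID_LABELS = ("speech", "music", "noise")


def obtain_audio_context(primary_cls, secondary_cls, near_face):
    """Combine two microphone classifications with a proximity flag.

    Assumed rule (not from the claims): report a valid music context only
    when both microphones are classified as music and the device is away
    from the user's face; otherwise fall back to a valid speech context,
    which selects the more aggressive suppression level.
    """
    for label in (primary_cls, secondary_cls):
        if label not in VALID_LABELS:
            raise ValueError(f"unknown classification: {label!r}")
    if not near_face and primary_cls == "music" and secondary_cls == "music":
        return "valid_music"
    return "valid_speech"
```

Defaulting to the speech context is a conservative choice for a handset: a false speech decision suppresses some music, while a false music decision would leave noise in a voice call.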

14. An apparatus configured to perform noise suppression comprising:
- means for classifying primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
  
  means for classifying secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
  
  means for obtain a proximity signal that determines the device's relative position to a user;
  
  means for determining an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal and the primary microphone classification and the secondary microphone classification output by the classifier, prior to application of a variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the primary input audio data and the secondary input audio data includes speech signals, music signals, and noise signals, and the audio context indicating a valid speech context or a valid music context;
  
  means for applying, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of the noise suppression includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted, in the primary input audio data and the secondary input audio data, when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data;
  
  means for bandwidth compressing the noise suppressed version of the primary input audio data and the secondary input audio data, based on the primary microphone classification and the secondary microphone classification output by the classifier, to generate at least one audio encoder packet; and
  
  means for transmitting the at least one audio encoder packet.
- Dependent Claims (15, 16, 17)
  - 15. The apparatus of claim 14, wherein the apparatus further comprises:
    - means for determining the audio context of the primary input audio data and the secondary input audio data is based on means for capturing a first portion of the primary input audio data from the primary microphone, wherein the primary microphone is positioned at a front of the device, and means for capturing a second portion of the secondary input audio data from the secondary microphone, wherein the secondary microphone is positioned at a back of the device.
  - 16. The apparatus of claim 15, wherein the apparatus further comprises:
    - means for obtaining a user override signal for the means for applying the second level of noise suppression to the primary input audio data and the secondary input audio data.
  - 17. The apparatus of claim 14, wherein the apparatus further comprises:
    - means for communicating with a different apparatus, wherein the different apparatus is a wearable device or a karaoke machine.

18. A method used in voice and data communications comprising:
- classifying primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
  
  classifying secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
  
  obtaining a proximity signal that determines whether the device's proximity to the user's face;
  
  obtaining an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal and the primary microphone classification and the secondary microphone classification output by the classifier prior to application of noise suppression to the primary input audio data and the secondary input audio data, wherein the input audio data includes speech signals, music signals, and noise signals, and the audio context indicating a valid speech context or a valid music context;
  
  applying, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of noise suppression includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted, in the primary input audio data and secondary input audio data, when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data;
  
  bandwidth compressing, with the audio encoder, the noise suppressed version of the primary input audio data and the secondary input audio data, based on the audio context, to generate at least one audio encoder packet; and
  
  transmitting the at least one audio encoder packet from a source device to a destination device.
- Dependent Claims (19, 20, 21, 23, 24)
  - 19. The method of claim 18, wherein the first level of noise suppression and the second level of noise suppression are different when the music signals are at the same level as the speech signals.
  - 20. The method of claim 18, wherein the first level of noise suppression of the primary input audio data and the secondary input audio data is applied when the user of the source device is talking at least 3 dB louder than the music playing in the background of the source device, and the second level of noise suppression of the primary input audio data and the secondary input audio data is applied when the music playing in the background of the source device is at least 3 dB louder than the talking of the user of the source device.
  - 21. The method of claim 18, wherein bandwidth compression of voice in the speech signals and music playing in the background, in the primary input audio data and the secondary input audio data provides at least 30% less distortion of the music playing in the background as compared to bandwidth compression of the voice in the speech signals and music playing in the background, in the primary input audio data and the secondary input audio data of the voice without obtaining the audio context of the primary input audio data and the secondary input audio data prior to application of noise suppression to the primary input and the secondary input audio data.
  - 23. The method of claim 18, wherein the obtaining of the audio context is further improved based on the control unit receiving input from one or more external sensors in a wearable device, the wearable device in communication with the source device.
  - 24. The method of claim 18, where the music context of the user of the source device comes from a karaoke machine.
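The 3 dB condition in claim 20 can be illustrated with a short sketch. It assumes that separate level estimates for the talker and the background music are already available, which the claim does not explain how to obtain. The names `level_db` and `select_level`, the `"first"`/`"second"` labels, and the choice to keep the previous level when neither signal leads by 3 dB are assumptions made for the example.

```python
import math


def level_db(samples):
    """RMS level of a block of samples in dB relative to full scale 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))  # floor avoids log10(0)


def select_level(speech_db, music_db, previous="first", margin_db=3.0):
    """Pick the suppression level from the speech and music levels.

    "first" is the more aggressive level and "second" the less aggressive
    one. When neither signal leads by `margin_db`, the previous level is
    kept; that hold behavior is an assumption, not part of claim 20.
    """
    if speech_db - music_db >= margin_db:
        return "first"
    if music_db - speech_db >= margin_db:
        return "second"
    return previous
```

For example, `select_level(-20.0, -30.0)` selects the first level because the talker is 10 dB above the music, while `select_level(-30.0, -20.0)` selects the second.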

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Dewasurendra, Duminda Ashoka; Rajendran, Vivek; Subasingha, Subasingha Shaminda
Primary Examiner(s): He, Jialong
Application Number: US14/865,885
Publication Number: US 20170092288A1
Time in Patent Office: 1,215 Days
CPC Class Codes

G10L 19/20   using sound class specific ...

G10L 19/265   Pre-filtering, e.g. high fr...

G10L 2021/02087   the noise being separate sp...

G10L 21/0208   Noise filtering

G10L 25/81   for discriminating voice fr...

G10L 25/84   for discriminating voice fr...

H04R 1/08   Mouthpieces; Microphones; A...

H04R 2430/20   Processing of the output si...
