Adaptive noise suppression for super wideband music
First Claim
1. A device configured to provide voice and data communications, the device comprising:
- one or more processors configured to:
classify primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
classify secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
obtain a proximity signal that determines the device's relative position to a user;
obtain an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal, the primary microphone classification, and the secondary microphone classification output by the classifier, prior to application of a variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the primary input audio data and secondary input audio data includes speech signals, music signals, and noise signals and the audio context indicating a valid speech context or a valid music context;
apply, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of the noise suppression unit includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted in the primary input audio data and the secondary input audio data when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data; and
bandwidth compress, with the audio encoder, the noise suppressed version of the primary input audio data and the secondary input audio data to generate at least one audio encoder packet;
a memory, electrically coupled to the one or more processors, configured to store the at least one audio encoder packet; and
a transmitter configured to transmit the at least one audio encoder packet.
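The control-unit step in the claim (combining the proximity signal with the two microphone classifications to reach a valid speech or valid music context) can be illustrated with a minimal sketch. The claim does not state the combination rule, so the rule below is an assumption made purely for illustration, as are the names `Label`, `Context`, and `determine_context`.

```python
from enum import Enum


class Label(Enum):
    """Per-microphone classifier output (hypothetical label set)."""
    SPEECH = "speech"
    MUSIC = "music"
    NOISE = "noise"


class Context(Enum):
    """The two audio contexts named in the claim."""
    VALID_SPEECH = "valid_speech"
    VALID_MUSIC = "valid_music"


def determine_context(primary: Label, secondary: Label, near_face: bool) -> Context:
    """Combine the proximity signal with both microphone classifications.

    Assumed rule (not from the claim): music is only a valid context when
    the device is away from the user's face, the primary microphone hears
    music, and the secondary microphone does not hear speech.
    """
    if not near_face and primary is Label.MUSIC and secondary is not Label.SPEECH:
        return Context.VALID_MUSIC
    # Every other combination falls back to the speech context.
    return Context.VALID_SPEECH
```

Under this assumed rule, a device held to the face always resolves to the speech context, whatever the classifier reports.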
Abstract
Techniques are described for performing adaptive noise suppression to improve handling of both speech signals and music signals at least up to super wideband (SWB) bandwidths. The techniques include identifying a context or environment in which audio data is captured, and adaptively changing a level of noise suppression applied to the audio data prior to bandwidth compressing (e.g., encoding) based on the context. For a valid speech context, an audio pre-processor may set a first level of noise suppression that is relatively aggressive in order to suppress noise (including music) in the speech signals. For a valid music context, the audio pre-processor may set a second level of noise suppression that is less aggressive in order to leave the music signals undistorted. In this way, a vocoder at a transmitter side wireless communication device may properly encode both speech and music signals with minimal distortions.
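As a rough illustration of the two suppression levels described in the abstract, the sketch below maps each context to an attenuation limit and applies it with a crude frame gate. The gate is a stand-in for a real noise suppressor, not the method of the patent, and the decibel values, the `noise_rms` parameter, and the name `suppress_frame` are assumptions; the source gives no numeric levels.

```python
import math

# Hypothetical attenuation limits in dB. The only property taken from the
# source is that the music-context level is lower than the speech-context level.
SUPPRESSION_DB = {"valid_speech": 18.0, "valid_music": 3.0}


def suppress_frame(frame, context, noise_rms):
    """Attenuate a frame whose RMS is at or below an estimated noise floor.

    frame: sequence of float samples; context: key of SUPPRESSION_DB;
    noise_rms: assumed externally estimated noise-floor RMS.
    """
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    if rms > noise_rms:
        # Treated as wanted signal and passed through unchanged.
        return list(frame)
    # Convert the context's attenuation limit from dB to a linear gain.
    gain = 10.0 ** (-SUPPRESSION_DB[context] / 20.0)
    return [x * gain for x in frame]
```

With these assumed values, a quiet frame is attenuated far more in the speech context than in the music context, which leaves low-level musical content largely intact.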
24 Claims
1. A device configured to provide voice and data communications, the device comprising:
- one or more processors configured to:
classify primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
classify secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
obtain a proximity signal that determines the device's relative position to a user;
obtain an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal, the primary microphone classification, and the secondary microphone classification output by the classifier, prior to application of a variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the primary input audio data and secondary input audio data includes speech signals, music signals, and noise signals and the audio context indicating a valid speech context or a valid music context;
apply, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of the noise suppression unit includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted in the primary input audio data and the secondary input audio data when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data; and
bandwidth compress, with the audio encoder, the noise suppressed version of the primary input audio data and the secondary input audio data to generate at least one audio encoder packet;
a memory, electrically coupled to the one or more processors, configured to store the at least one audio encoder packet; and
a transmitter configured to transmit the at least one audio encoder packet.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 22
14. An apparatus configured to perform noise suppression comprising:
means for classifying primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
means for classifying secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
means for obtaining a proximity signal that determines the device's relative position to a user;
means for determining an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal and the primary microphone classification and the secondary microphone classification output by the classifier, prior to application of a variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the primary input audio data and the secondary input audio data includes speech signals, music signals, and noise signals, and the audio context indicating a valid speech context or a valid music context;
means for applying, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of the noise suppression includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted, in the primary input audio data and the secondary input audio data, when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data;
means for bandwidth compressing the noise suppressed version of the primary input audio data and the secondary input audio data, based on the primary microphone classification and the secondary microphone classification output by the classifier, to generate at least one audio encoder packet; and
means for transmitting the at least one audio encoder packet.
Dependent claims: 15, 16, 17
18. A method used in voice and data communications comprising:
classifying primary input audio data, by a classifier, from a primary microphone and output a primary microphone classification of the primary input audio data;
classifying secondary input audio data, by the classifier, from a secondary microphone and output a secondary microphone classification of the secondary input audio data;
obtaining a proximity signal that determines whether the device's proximity to the user's face;
obtaining an audio context, with a control unit, of the primary input audio data and the secondary input audio data, wherein the control unit combines the proximity signal and the primary microphone classification and the secondary microphone classification output by the classifier prior to application of noise suppression to the primary input audio data and the secondary input audio data, wherein the input audio data includes speech signals, music signals, and noise signals, and the audio context indicating a valid speech context or a valid music context;
applying, with a noise suppression unit, the variable level of noise suppression to the primary input audio data and the secondary input audio data, wherein the variable level of noise suppression includes a first level of noise suppression when the speech signals are louder than the music signals, and a second level of noise suppression that is lower than the first level of the noise suppression to leave music signals undistorted, in the primary input audio data and secondary input audio data, when the music signals are louder than the speech signals, and the variable noise suppression is applied to the primary input audio data and the secondary input audio data prior to bandwidth compression, by an audio encoder coupled to the noise suppression unit, to generate a noise suppressed version of the primary input audio data and the secondary input audio data;
bandwidth compressing, with the audio encoder, the noise suppressed version of the primary input audio data and the secondary input audio data, based on the audio context, to generate at least one audio encoder packet; and
transmitting the at least one audio encoder packet from a source device to a destination device.
Dependent claims: 19, 20, 21, 23, 24
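The ordering of the method steps (classify, determine the context, suppress, then encode and transmit) can be sketched as a small pipeline. Everything here is hypothetical scaffolding: the stage functions are passed in as callables, `encode_pcm16` is a stand-in that only packs samples and performs no real bandwidth compression, and averaging the two microphone streams into one is an assumption the claim does not make.

```python
import struct
from typing import Callable, List, Sequence


def encode_pcm16(frame: Sequence[float], context: str) -> bytes:
    """Stand-in encoder: one context tag byte, then little-endian 16-bit PCM.

    A real vocoder would bandwidth compress here; this only packs samples.
    """
    tag = b"M" if context == "valid_music" else b"S"
    clipped = [max(-1.0, min(1.0, x)) for x in frame]
    samples = [int(x * 32767) for x in clipped]
    return tag + struct.pack(f"<{len(samples)}h", *samples)


def process(primary: Sequence[float], secondary: Sequence[float], near_face: bool,
            classify: Callable, decide: Callable, suppress: Callable,
            encode: Callable, send: Callable) -> bytes:
    # 1. Classify each microphone stream with the same classifier.
    p_label, s_label = classify(primary), classify(secondary)
    # 2. Control unit: derive the audio context before any suppression runs.
    context = decide(p_label, s_label, near_face)
    # 3. Noise suppression on both streams, ahead of the encoder.
    clean_p, clean_s = suppress(primary, context), suppress(secondary, context)
    # 4. Encode the suppressed audio, informed by the context.
    #    Averaging the two streams is an illustrative assumption.
    mono: List[float] = [(a + b) / 2 for a, b in zip(clean_p, clean_s)]
    packet = encode(mono, context)
    # 5. Hand the packet to the transmitter.
    send(packet)
    return packet
```

Passing the stages in as callables keeps the sketch focused on the one property the claim fixes, namely that suppression happens before encoding and that the context is known before suppression.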
Specification