Audio coding with gain profile extraction and transmission for speech enhancement at the decoder

US 9,495,970 B2
Filed: 09/11/2013
Issued: 11/15/2016
Est. Priority Date: 09/21/2012
Status: Active Grant

Abstract

The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application.
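
As an illustration of what an orthogonal energy-compacting transform can look like, the following minimal sketch rotates two channels onto their principal axis, so that the first output carries most of the energy. It is a toy under stated assumptions, not the decomposition described in the patent: the restriction to two channels, the function names `compacting_rotation` and `inverse_rotation`, and the use of a single rotation angle as the decomposition parameter are all hypothetical.

```python
import math


def compacting_rotation(x1, x2):
    """Rotate two channels so that most energy lands in the first output.

    Returns (angle, e1, e2). The angle stands in for a decomposition
    parameter (an assumption made for this sketch).
    """
    # Second-order statistics of the input pair (no mean removal, for brevity).
    c11 = sum(a * a for a in x1)
    c22 = sum(b * b for b in x2)
    c12 = sum(a * b for a, b in zip(x1, x2))
    # Angle of the principal axis of the 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(angle), math.sin(angle)
    # Orthogonal rotation: total energy is preserved, but concentrated in e1.
    e1 = [c * a + s * b for a, b in zip(x1, x2)]
    e2 = [-s * a + c * b for a, b in zip(x1, x2)]
    return angle, e1, e2


def inverse_rotation(angle, e1, e2):
    """Undo compacting_rotation using the transmitted angle."""
    c, s = math.cos(angle), math.sin(angle)
    x1 = [c * a - s * b for a, b in zip(e1, e2)]
    x2 = [s * a + c * b for a, b in zip(e1, e2)]
    return x1, x2
```

For two perfectly correlated inputs the second rotated signal is numerically zero, and `inverse_rotation` recovers the inputs from the rotated pair and the angle.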

20 Claims

1. An audio encoding system for producing, based on an audio signal, a gain profile to be distributed with said audio signal, the gain profile comprising a time-variable voice activity gain and a time-variable and frequency-variable cleaning gain, wherein the audio encoding system comprises:
- a voice activity detector adapted to determine the voice activity gain by at least determining voice activity in the audio signal; and
  
  a noise estimator adapted to determine the cleaning gain by at least estimating noise in said audio signal,
  
  wherein the cleaning gain is separable from the voice activity gain in the gain profile.
- - 2. The audio encoding system of claim 1, wherein the noise estimator is adapted to determine at least two components of the cleaning gain based on said audio signal, said components being adapted for use in different playback channel configurations.
  - 3. The audio encoding system of claim 1, wherein the noise estimator is adapted to determine a plurality of components of the cleaning gain based on said audio signal, the plurality of components comprising at least one of:
    - a reverb gain adapted to attenuate room acoustics audio content; and
      
      a sibilance gain adapted to attenuate sibilance audio content.
  - 4. The audio encoding system of claim 1, wherein the noise estimator is adapted to receive, from the voice activity detector, information about voice activity in said audio signal, and to determine the cleaning gain based on said information.
  - 5. The audio encoding system of claim 1, wherein the gain profile further comprises a time-variable level gain, the audio encoding system further comprising a loudness analyzer adapted to determine loudness of a component of interest in said audio signal and to determine the level gain based on the determined loudness, wherein the level gain is separable from the cleaning gain and the voice activity gain in the gain profile.
  - 6. The audio encoding system of claim 1, wherein the voice activity detector is adapted to encode the voice activity gain using coding indices referring to one or more predefined time sequences of gains.
  - 7. The audio encoding system of claim 1, wherein the audio signal is segmented into time frames, and wherein at least one of the voice activity detector and the noise estimator is adapted to encode its respective determined gain on a per time frame basis, the encoding of a time frame being independent of previous time frames.
  - 8. The audio encoding system of claim 1, further comprising a multiplexer adapted to encode the determined gains and said audio signal in one bitstream.
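
The encoder-side claims above can be pictured with a small per-frame sketch that keeps the two gains in separate fields of the gain profile. Everything in it is an assumption made for illustration: the energy-ratio voice detector, the Wiener-like cleaning rule, the linear gain scale, and the names `GainProfileFrame`, `detect_voice`, `analyze_frame` and `update_noise` do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GainProfileFrame:
    """One frame of a gain profile; the two gains stay in separate fields."""
    voice_activity_gain: float   # broadband, time-variable
    cleaning_gain: List[float]   # one value per frequency subband


def detect_voice(band_power, noise_power, threshold=2.0):
    """Hypothetical detector: frame power well above the noise estimate."""
    return sum(band_power) > threshold * sum(noise_power)


def analyze_frame(band_power, noise_power, idle_gain=0.1, gain_floor=0.1):
    """Derive both gains for one frame from per-subband powers."""
    # Voice activity gain: pass the frame when voice is present, else attenuate.
    vad_gain = 1.0 if detect_voice(band_power, noise_power) else idle_gain
    # Cleaning gain: Wiener-like rule per subband, limited by a floor.
    cleaning = []
    for p, n in zip(band_power, noise_power):
        g = 1.0 - n / p if p > 0 else gain_floor
        cleaning.append(max(gain_floor, g))
    return GainProfileFrame(vad_gain, cleaning)


def update_noise(noise_power, band_power, voice_active, alpha=0.9):
    """Track noise only while no voice is detected (cf. claim 4)."""
    if voice_active:
        return list(noise_power)
    return [alpha * n + (1.0 - alpha) * p
            for n, p in zip(noise_power, band_power)]
```

Each call to `analyze_frame` depends only on the current frame and the current noise estimate, so the resulting gains can be encoded frame by frame.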

9. An audio encoding method for producing, based on an audio signal, a gain profile to be distributed with said audio signal, the gain profile comprising a time-variable voice activity gain and a time-variable and frequency-variable cleaning gain, wherein the audio encoding method comprises:
- determining voice activity in said audio signal;
  
  assigning a value to the voice activity gain based on the determined voice activity;
  
  estimating noise in said audio signal; and
  
  assigning a value to the cleaning gain based on the estimated noise,
  
  wherein the cleaning gain is separable from the voice activity gain in the gain profile.
- - 10. A computer program product comprising a non-transitory computer-readable medium with instructions for causing a computer to execute the method of claim 9.
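
Because the cleaning gain is separable from the voice activity gain, a receiver could in principle apply either gain alone or both together. The sketch below shows one way that choice might look. It assumes linear amplitude gains and per-subband amplitudes; the function name `apply_gain_profile` and its flags are hypothetical and not taken from the patent.

```python
def apply_gain_profile(band_signal, voice_activity_gain, cleaning_gain,
                       use_vad=True, use_cleaning=True):
    """Scale per-subband amplitudes by whichever gains the receiver selects."""
    out = []
    for x, g_clean in zip(band_signal, cleaning_gain):
        g = 1.0
        if use_vad:
            g *= voice_activity_gain   # broadband gain, same for every subband
        if use_cleaning:
            g *= g_clean               # frequency-dependent gain
        out.append(g * x)
    return out
```

With both flags set to `False` the signal passes through unchanged, which is only possible because the transmitted audio has not already been processed with the gains.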

11. A mixing system for combining a plurality of received pairs of an audio signal and an associated gain profile, each of said gain profiles comprising a time-variable voice activity gain and a time-variable and frequency-variable cleaning gain, wherein the mixing system comprises:
- a decoder adapted to derive, from each of the gain profiles, a representation of the audio signal, the voice activity gain and the cleaning gain, wherein the voice activity gain is separable from the cleaning gain in the gain profile;
  
  a gain combining stage adapted to:
  
  determine a combined voice activity gain by combining the derived voice activity gains using a first combining rule, and
  
  determine a combined cleaning gain by combining the derived cleaning gains by a second combining rule different from the first combining rule; and
  
  a mixing stage adapted to combine one or more of the audio signals into a combined audio signal to be distributed with the combined voice activity gain and the combined cleaning gain.
- - 12. The mixing system of claim 11, wherein the first combining rule includes determining the combined voice activity gain by assigning a combined voice activity gain achieving an attenuating amount equal to the lowest attenuating amount of the derived voice activity gains.
  - 13. The mixing system of claim 11, wherein the second combining rule includes determining the combined cleaning gain by assigning, for each frequency subband, a combined cleaning gain achieving an attenuating amount equal to the highest attenuating amount of the derived cleaning gains in that frequency subband.
  - 14. The mixing system of claim 11, wherein the second combining rule includes determining the combined cleaning gain by assigning, for each frequency subband, a power-weighted mean of the derived cleaning gains for that frequency subband.
  - 15. The mixing system of claim 11, wherein:
    - each of said cleaning gains includes a plurality of components comprising a sibilance gain adapted to attenuate sibilance audio content;
      
      the decoder is adapted to further derive the sibilance gain; and
      
      the gain combining stage is adapted to determine the combined sibilance gain achieving an attenuating amount equal to the highest attenuating amount of the derived sibilance gains.
  - 16. The mixing system of claim 11, wherein:
    - the derived voice activity gains are encoded using coding indices referring to one or more predefined time sequences of gains; and
      
      the gain combining stage is adapted to determine the combined voice activity gain by assigning a coding index based on coding indices of the derived cleaning gains.
  - 17. The mixing system of claim 11, wherein the decoder is further adapted to derive, from each of the audio signals, decomposition parameters and a plurality of rotated audio signals, and wherein the mixing stage further comprises:
    - an adaptive rotation inversion stage configured to discretely decode each of the pluralities of rotated audio signals into a plurality of de-rotated audio signals, respectively, based on the respective decomposition parameters;
      
      a mixer adapted to provide a plurality of combined audio signals by additively mixing respective signals from the different pluralities of de-rotated signals;
      
      a spatial analyzer configured to receive the plurality of combined audio signals, and to output, based thereon, combined decomposition parameters; and
      
      an adaptive rotation stage configured to receive the plurality of combined audio signals and to output a plurality of combined rotated audio signals obtained by an adaptive energy-compacting orthogonal transformation, wherein quantitative properties of the orthogonal transformation are determined by the combined decomposition parameters.
  - 18. The mixing system of claim 11, wherein:
    - each of said gain profiles further comprises a time-variable level gain;
      
      the decoder is adapted to further derive the level gain; and
      
      the gain combining stage is adapted to determine the combined level gain by assigning a combined level gain achieving an attenuating amount equal to the highest attenuating amount of the derived level gains.
  - 19. The mixing system of claim 18, further comprising a leveling stage arranged upstream of the mixing stage and adapted to rescale, based on the derived level gains, at least some of the derived audio signals.
  - 20. The mixing system of claim 11, further comprising a multiplexer adapted to encode the determined combined gains and combined audio signal as one bitstream.
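
The combining rules named in claims 12 to 14 can be sketched as follows. The sketch assumes that gains are linear factors where a smaller value means more attenuation, so "lowest attenuating amount" maps to the largest gain and "highest attenuating amount" to the smallest. The function names and the zero-power fallback are hypothetical.

```python
def combine_voice_activity(gains):
    """First rule (cf. claim 12): keep the least attenuating gain."""
    return max(gains)


def combine_cleaning_max_attenuation(cleaning_gains):
    """Second rule (cf. claim 13): per subband, keep the most attenuating gain.

    cleaning_gains -- one list of per-subband gains for each received stream
    """
    return [min(band) for band in zip(*cleaning_gains)]


def combine_cleaning_power_weighted(cleaning_gains, band_powers):
    """Alternative second rule (cf. claim 14): power-weighted mean per subband.

    band_powers -- per-stream, per-subband signal powers used as weights
    """
    combined = []
    for gains, powers in zip(zip(*cleaning_gains), zip(*band_powers)):
        total = sum(powers)
        if total > 0:
            combined.append(sum(g * p for g, p in zip(gains, powers)) / total)
        else:
            # Assumption: with no power in any stream, fall back to the
            # most attenuating gain.
            combined.append(min(gains))
    return combined
```

Using a different rule for each gain type means the mix stays open whenever any talker is active, while noise in a subband can still be suppressed according to the streams that contribute it.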

Current Assignee
Dolby International AB (Dolby Laboratories Incorporated), Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee
Dolby International AB (Dolby Laboratories Incorporated), Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors
Dickins, Glenn; Purnhagen, Heiko; Samuelsson, Leif Jonas
Primary Examiner(s)
AZAD, ABUL K

Application Number

US14/427,908
Publication Number

US 20150356978A1
Time in Patent Office

1,161 Days
Field of Search

704/200-230, 704/500-504
US Class Current

1/1
CPC Class Codes

G10L 19/008   Multichannel audio signal c...

G10L 19/012   Comfort noise or silence co...

G10L 19/02   using spectral analysis, e....

G10L 19/0208   Subband vocoders

G10L 19/032   Quantisation or dequantisat...

G10L 19/22   Mode decision, i.e. based o...

G10L 19/24   Variable rate codecs, e.g. ...

G10L 21/02   Speech enhancement, e.g. no...

G10L 21/0208   Noise filtering

G10L 21/0216   characterised by the method...

H04M 3/56   Arrangements for connecting...
