MULTI-MODE AUDIO CODEC AND CELP CODING ADAPTED THEREFORE
Abstract
In an embodiment, bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of the global gain value results in an adjustment of an output level of the decoded representation of the audio content. Concurrently, the differential coding saves bits. Even further, the differential coding enables the lowering of the burden of globally adjusting the gain of an encoded bitstream. In another embodiment, a global gain control across CELP coded frames and transform coded frames is achieved by co-controlling the gain of the codebook excitation of the CELP codec, along with a level of the transform or inverse transform of the transform coded frames. In another embodiment, the gain value determination in CELP coding is performed in the weighted domain of the excitation signal.
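The differential coding described in the abstract can be illustrated with a minimal Python sketch. The logarithmic index scale (four steps per doubling of level), the function names, and the choice of the integer mean as the per-frame global gain are assumptions made for illustration only; the source does not specify them.

```python
def index_to_linear(index: int, steps_per_doubling: int = 4) -> float:
    """Map a logarithmic gain index to a linear gain (assumed scale)."""
    return 2.0 ** (index / steps_per_doubling)


def encode_frame_gains(subframe_indices):
    """Return one global gain index per frame plus per-sub-frame deltas."""
    # Assumption: the global gain is the integer mean of the sub-frame indices.
    global_gain = sum(subframe_indices) // len(subframe_indices)
    # Each sub-frame element is coded differentially to the global gain.
    deltas = [idx - global_gain for idx in subframe_indices]
    return global_gain, deltas


def decode_frame_gains(global_gain, deltas):
    """Reconstruct linear sub-frame gains from the global gain and deltas."""
    return [index_to_linear(global_gain + d) for d in deltas]


# Hypothetical frame with four sub-frames.
global_gain, deltas = encode_frame_gains([40, 42, 39, 41])
original = decode_frame_gains(global_gain, deltas)

# Changing only the global gain value rescales every sub-frame together;
# the differentially coded elements stay untouched.
louder = decode_frame_gains(global_gain + 4, deltas)
ratios = [b / a for a, b in zip(original, louder)]
```

In this sketch the deltas are small numbers that would need fewer bits than absolute gains, and raising the single global gain index by four steps doubles the level of every sub-frame without rewriting the per-sub-frame elements.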
33 Claims
1. A multi-mode audio decoder for providing a decoded representation of audio content on the basis of an encoded bitstream, the multi-mode audio decoder configured to
  decode a global gain value per frame of the encoded bitstream, wherein a first subset of the frames being coded in a first coding mode and a second subset of the frames being coded in a second coding mode, with each frame of the second subset being composed of more than one sub-frames,
  decode, per sub-frame of at least a subset of the sub-frames of the second subset of frames, a corresponding bitstream element differentially to the global gain value of the respective frame, and
  complete decoding the bitstream using the global gain value and the corresponding bitstream element in decoding the sub-frames of the at least subset of the sub-frames of the second subset of frames and the global gain value in decoding the first subset of frames,
  wherein the multi-mode audio decoder is configured such that a change of the global gain value of the frames within the encoded bitstream results in an adjustment of an output level of the decoded representation of the audio content.
8. A multi-mode audio decoder for providing a decoded representation of an audio content on the basis of an encoded bitstream, a first subset of frames of which is CELP coded and a second subset of frames of which is transform coded, the multi-mode audio decoder comprising:
  a CELP decoder configured to decode a current frame of the first subset, the CELP decoder comprising:
    an excitation generator configured to generate a current excitation of the current frame of the first subset by
      constructing a codebook excitation based on a past excitation and a codebook index of the current frame of the first subset within the encoded bitstream, and
      setting a gain of the codebook excitation based on a global gain value within the encoded bitstream; and
    a linear prediction synthesis filter configured to filter the current excitation based on linear prediction filter coefficients for the current frame of the first subset within the encoded bitstream;
  a transform decoder configured to decode a current frame of the second subset by
    constructing spectral information for the current frame of the second subset from the encoded bitstream and
    performing a spectral-to-time-domain transformation onto the spectral information to acquire a time-domain signal such that a level of the time-domain signal depends on the global gain value.
Dependent claims: 9, 10, 11, 12, 13
14. A CELP decoder comprising:
  an excitation generator configured to generate a current excitation for a current frame of a bitstream by
    constructing an adaptive codebook excitation based on a past excitation and an adaptive codebook index for the current frame within the bitstream;
    constructing an innovation codebook excitation based on an innovation codebook index for the current frame within the bitstream;
    computing an estimate of an energy of the innovation codebook excitation spectrally weighted by a weighted linear prediction synthesis filter constructed from linear prediction filter coefficients within the bitstream;
    setting a gain of the innovation codebook excitation based on a ratio between a global gain value within the bitstream and the estimated energy; and
    combining the adaptive codebook excitation and the innovation codebook excitation to achieve the current excitation; and
  a linear prediction synthesis filter configured to filter the current excitation based on the linear prediction filter coefficients.
Dependent claims: 15, 16, 17, 18
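The gain-setting step of claim 14 can be sketched in Python as follows. This is only an illustration under stated assumptions: the weighted synthesis filter is taken to be the all-pole filter 1/A(z/gamma) with gamma = 0.92, the global gain value is treated as an already dequantized linear target level, and the "ratio" is computed against the root-mean-square of the weighted innovation. None of these specifics, nor the function names, come from the claim text.

```python
import math


def weighted_synthesis(x, lpc, gamma=0.92):
    """Filter x through 1/A(z/gamma), where A(z) = 1 + sum_k lpc[k-1] z^-k.

    Assumption: zero filter memory at the start of the frame.
    """
    weighted = [a * gamma ** (k + 1) for k, a in enumerate(lpc)]
    y = []
    for n, sample in enumerate(x):
        acc = sample
        for k, a in enumerate(weighted, start=1):
            if n - k >= 0:
                acc -= a * y[n - k]
        y.append(acc)
    return y


def innovation_gain(innovation, lpc, global_gain_linear):
    """Gain for the innovation excitation from the global gain and its
    estimated energy in the weighted domain (hypothetical formulation)."""
    shaped = weighted_synthesis(innovation, lpc)
    energy = sum(v * v for v in shaped)
    rms = math.sqrt(energy / len(shaped))
    return global_gain_linear / rms if rms > 0.0 else 0.0


# Hypothetical sparse innovation vector and second-order LPC set.
innovation = [1.0, 0.0, -1.0, 0.0, 0.0, 1.0, 0.0, -1.0]
lpc = [-0.9, 0.2]
g1 = innovation_gain(innovation, lpc, 1.0)
g2 = innovation_gain(innovation, lpc, 2.0)
```

Because the gain is proportional to the global gain value, doubling that value doubles the innovation gain (g2 is twice g1 here), which is the behaviour that lets a single bitstream value steer the output level.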
20. A multi-mode audio encoder configured to encode an audio content into an encoded bitstream with encoding a first subset of frames in a first coding mode and a second subset of frames in a second coding mode, wherein the second subset of frames is respectively composed of one or more sub-frames, wherein the multi-mode audio encoder is configured to
  determine and encode a global gain value per frame, and
  determine and encode, per sub-frames of at least a subset of the sub-frames of the second subset, a corresponding bitstream element differentially to the global gain value of the respective frame,
  wherein the multi-mode audio encoder is configured such that a change of the global gain value of the frames within the encoded bitstream results in an adjustment of an output level of a decoded representation of the audio content at the decoding side.
21. A multi-mode audio encoder for encoding an audio content into an encoded bitstream by CELP encoding a first subset of frames of the audio content and transform encoding a second subset of the frames, the multi-mode audio encoder comprising:
  a CELP encoder configured to encode a current frame of the first subset, the CELP encoder comprising
    a linear prediction analyzer configured to generate linear prediction filter coefficients for the current frame of the first subset and encode same into the encoded bitstream; and
    an excitation generator configured to determine a current excitation of the current frame of the first subset, which, when filtered by a linear prediction synthesis filter based on the linear prediction filter coefficients within the encoded bitstream, recovers the current frame of the first subset, defined by a past excitation and a codebook index for the current frame of the first subset and encoding the codebook index into the encoded bitstream; and
  a transform encoder configured to encode a current frame of the second subset by performing a time-to-spectral-domain transformation onto a time-domain signal for the current frame of the second subset to acquire spectral information and encode the spectral information into the encoded bitstream,
  wherein the multi-mode audio encoder is configured to encode a global gain value into the encoded bitstream, the global gain value depending on an energy of a version of the audio content of the current frame of the first subset, filtered with the linear prediction analysis filter depending on the linear prediction coefficients, or an energy of the time-domain signal.
22. A CELP encoder comprising
  a linear prediction analyzer configured to generate linear prediction filter coefficients for a current frame of an audio content and encode the linear prediction filter coefficients into a bitstream;
  an excitation generator configured to determine a current excitation of the current frame as a combination of an adaptive codebook excitation and an innovation codebook excitation, which, when filtered by a linear prediction synthesis filter based on the linear prediction filter coefficients, recovers the current frame, by
    constructing the adaptive codebook excitation defined by a past excitation and an adaptive codebook index for the current frame and encoding the adaptive codebook index into the bitstream; and
    constructing the innovation codebook excitation defined by an innovation codebook index for the current frame and encoding the innovation codebook index into the bitstream; and
  an energy determiner configured to determine an energy of a version of the audio content of the current frame filtered with a weighting filter, to acquire a global gain value and encoding the global gain value into the bitstream, the weighting filter constructed from the linear prediction filter coefficients.
Dependent claims: 23, 24, 25, 26
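The energy determiner of claim 22 can be sketched in Python along the following lines. The sketch assumes a weighting filter of the form W(z) = A(z/gamma) with gamma = 0.92, zero filter memory at the frame start, and a logarithmic quantizer with four steps per doubling of level; these choices and all names are hypothetical and are not specified by the claim.

```python
import math


def weighting_filter(frame, lpc, gamma=0.92):
    """FIR filter W(z) = A(z/gamma), where A(z) = 1 + sum_k lpc[k-1] z^-k."""
    taps = [1.0] + [a * gamma ** (k + 1) for k, a in enumerate(lpc)]
    out = []
    for n in range(len(frame)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * frame[n - k]
        out.append(acc)
    return out


def global_gain_index(frame, lpc, steps_per_doubling=4):
    """Quantize the weighted-domain level of a frame to a gain index."""
    weighted = weighting_filter(frame, lpc)
    rms = math.sqrt(sum(v * v for v in weighted) / len(weighted))
    if rms <= 0.0:
        return 0
    # Assumed quantizer: logarithmic scale, clamped at zero.
    return max(0, round(steps_per_doubling * math.log2(rms)))


# With no LPC coefficients the weighting filter is the identity, so the
# index depends only on the frame level.
quiet = global_gain_index([256.0] * 8, [])
loud = global_gain_index([512.0] * 8, [])
```

Under these assumptions, doubling the input level raises the index by one doubling step count (here from 32 to 36), so the encoded global gain tracks the frame energy measured in the weighted domain.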
27. A multi-mode audio decoding method for providing a decoded representation of audio content on the basis of an encoded bitstream, the method comprising
  decoding a global gain value per frame of the encoded bitstream, wherein a first subset of the frames being coded in a first coding mode and a second subset of the frames being coded in a second coding mode, with each frame of the second subset being composed of more than one sub-frames,
  decoding, per sub-frame of at least a subset of the sub-frames of the second subset of frames, a corresponding bitstream element differentially to the global gain value of the respective frame, and
  completing decoding the bitstream using the global gain value and the corresponding bitstream element in decoding the sub-frames of the at least subset of the sub-frames of the second subset of frames and the global gain value in decoding the first subset of frames,
  wherein the multi-mode audio decoding method is performed such that a change of the global gain value of the frames within the encoded bitstream results in an adjustment of an output level of the decoded representation of the audio content.
28. A multi-mode audio decoding method for providing a decoded representation of an audio content on the basis of an encoded bitstream, a first subset of frames of which is CELP coded and a second subset of frames of which is transform coded, the method comprising:
  CELP decoding a current frame of the first subset, the CELP decoding comprising:
    generating a current excitation of the current frame of the first subset by
      constructing a codebook excitation based on a past excitation and a codebook index of the current frame of the first subset within the encoded bitstream, and
      setting a gain of the codebook excitation based on a global gain value within the encoded bitstream; and
    filtering the current excitation based on linear prediction filter coefficients for the current frame of the first subset within the encoded bitstream;
  transform decoding a current frame of the second subset by
    constructing spectral information for the current frame of the second subset from the encoded bitstream and
    performing a spectral-to-time-domain transformation onto the spectral information to acquire a time-domain signal such that a level of the time-domain signal depends on the global gain value.
29. A CELP decoding method comprising:
  generating a current excitation for a current frame of a bitstream by
    constructing an adaptive codebook excitation based on a past excitation and an adaptive codebook index for the current frame within the bitstream;
    constructing an innovation codebook excitation based on an innovation codebook index for the current frame within the bitstream;
    computing an estimate of an energy of the innovation codebook excitation spectrally weighted by a weighted linear prediction synthesis filter constructed from linear prediction filter coefficients within the bitstream;
    setting a gain of the innovation codebook excitation based on a ratio between a global gain value within the bitstream and the estimated energy; and
    combining the adaptive codebook excitation and the innovation codebook excitation to achieve the current excitation; and
  filtering the current excitation based on the linear prediction filter coefficients by a linear prediction synthesis filter.
30. A multi-mode audio encoding method comprising encoding an audio content into an encoded bitstream with encoding a first subset of frames in a first coding mode and a second subset of frames in a second coding mode, wherein the second subset of frames is respectively composed of one or more sub-frames, wherein the multi-mode audio encoding method further comprises
  determining and encoding a global gain value per frame, and
  determining and encoding, per sub-frames of at least a subset of the sub-frames of the second subset, a corresponding bitstream element differentially to the global gain value of the respective frame,
  wherein the multi-mode audio encoding method is performed such that a change of the global gain value of the frames within the encoded bitstream results in an adjustment of an output level of a decoded representation of the audio content at the decoding side.
31. A multi-mode audio encoding method for encoding an audio content into an encoded bitstream by CELP encoding a first subset of frames of the audio content and transform encoding a second subset of the frames, the multi-mode audio encoding method comprising:
  CELP encoding a current frame of the first subset, the CELP encoding comprising
    performing linear prediction analysis to generate linear prediction filter coefficients for the current frame of the first subset and encode same into the encoded bitstream; and
    determining a current excitation of the current frame of the first subset, which, when filtered by a linear prediction synthesis filter based on the linear prediction filter coefficients within the encoded bitstream, recovers the current frame of the first subset, defined by a past excitation and a codebook index for the current frame of the first subset and encoding the codebook index into the encoded bitstream; and
  encoding a current frame of the second subset by performing a time-to-spectral-domain transformation onto a time-domain signal for the current frame of the second subset to acquire spectral information and encode the spectral information into the encoded bitstream,
  wherein the multi-mode audio encoding method further comprises encoding a global gain value into the encoded bitstream, the global gain value depending on an energy of a version of the audio content of the current frame of the first subset, filtered with the linear prediction analysis filter depending on the linear prediction coefficients, or an energy of the time-domain signal.
32. A CELP encoding method comprising
  performing linear prediction analysis to generate linear prediction filter coefficients for a current frame of an audio content and encode the linear prediction filter coefficients into a bitstream;
  determining a current excitation of the current frame as a combination of an adaptive codebook excitation and an innovation codebook excitation, which, when filtered by a linear prediction synthesis filter based on the linear prediction filter coefficients, recovers the current frame, by
    constructing the adaptive codebook excitation defined by a past excitation and an adaptive codebook index for the current frame and encoding the adaptive codebook index into the bitstream; and
    constructing the innovation codebook excitation defined by an innovation codebook index for the current frame and encoding the innovation codebook index into the bitstream; and
  determining an energy of a version of the audio content of the current frame filtered with a weighting filter, to acquire a global gain value and encoding the global gain value into the bitstream, the weighting filter constructed from the linear prediction filter coefficients.
Specification