Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method
Abstract
Provided are a voice audio encoding device, a voice audio decoding device, a voice audio encoding method, and a voice audio decoding method that distribute bits efficiently and improve sound quality. A dominant frequency band identification unit identifies a dominant frequency band, that is, a band whose norm factor value is the maximum value within the spectrum of an input voice audio signal. Dominant group determination units and a non-dominant group determination unit group all sub-bands into a dominant group that contains the dominant frequency band and a non-dominant group that contains no dominant frequency band. A group bit distribution unit distributes bits to each group on the basis of the energy and the norm variance of that group. A sub-band bit distribution unit then redistributes the bits distributed to each group among the sub-bands in accordance with the ratio of the norm to the energy of the group.
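The two-stage distribution described in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: the grouping rule (one global peak plus the falling slopes on either side), the weighting of energy against norm variance through the hypothetical parameter `mix`, and the helper names `split_groups`, `split_bits`, and `allocate` are all assumptions made for illustration. The sketch also does not enforce the claimed minimum of two sub-bands per group.

```python
from statistics import pvariance


def split_groups(norms):
    """Return half-open (start, end) index ranges covering all sub-bands.

    Assumption: the dominant group is the global norm peak extended left and
    right while the norms keep falling; what remains on each side forms a
    non-dominant group.
    """
    peak = max(range(len(norms)), key=lambda i: norms[i])
    lo = hi = peak
    while lo > 0 and norms[lo - 1] <= norms[lo]:
        lo -= 1
    while hi + 1 < len(norms) and norms[hi + 1] <= norms[hi]:
        hi += 1
    bounds = [(0, lo), (lo, hi + 1), (hi + 1, len(norms))]
    return [(a, b) for a, b in bounds if a < b]


def split_bits(total, weights):
    """Split `total` bits in proportion to non-negative `weights`.

    Shares are rounded down; leftover bits go to the largest weight so the
    parts always sum to `total`.
    """
    scale = sum(weights)
    if scale <= 0:  # degenerate case: fall back to equal weights
        weights, scale = [1.0] * len(weights), float(len(weights))
    bits = [int(total * w / scale) for w in weights]
    bits[weights.index(max(weights))] += total - sum(bits)
    return bits


def allocate(norms, total_bits, mix=0.5):
    """Stage 1: bits per group. Stage 2: each group's bits per sub-band."""
    groups = split_groups(norms)
    energy = [sum(norms[a:b]) for a, b in groups]
    spread = [pvariance(norms[a:b]) for a, b in groups]
    e_tot = sum(energy) or 1.0
    v_tot = sum(spread) or 1.0
    # Assumed weighting: convex mix of energy share and norm-variance share.
    weights = [mix * e / e_tot + (1 - mix) * v / v_tot
               for e, v in zip(energy, spread)]
    group_bits = split_bits(total_bits, weights)
    subband_bits = []
    for (a, b), g_bits in zip(groups, group_bits):
        # Within a group, bits follow the ratio of each norm to group energy.
        subband_bits += split_bits(g_bits, list(norms[a:b]))
    return groups, group_bits, subband_bits


groups, group_bits, subband_bits = allocate([2, 4, 9, 7, 3, 3, 5, 1], 64)
print(groups, group_bits, subband_bits)
```

Because the second stage only redistributes what the first stage assigned, the per-sub-band bits always sum to the frame budget, whatever weighting is chosen in the first stage.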
25 Claims
1. A speech or audio coding apparatus comprising:
a transformation section that transforms an input signal from a time domain to a frequency domain to obtain a frequency spectrum comprising spectral coefficients;
an estimation section that estimates an energy envelope which represents an energy level for each subband of a plurality of subbands achieved by splitting the frequency spectrum of the input signal, each subband having at least two spectral coefficients;
a quantization section that quantizes the energy envelope to obtain a quantized energy envelope;
a group determining section that splits the quantized energy envelopes into a plurality of groups, each group having a plurality of at least two subbands;
a first bit allocation section that allocates bits to each group of the plurality of groups to obtain a group-specific number of bits for each group of the plurality of groups;
a second bit allocation section that allocates, for each group of the plurality of groups, the group-specific number of bits allocated to a respective group of the plurality of groups to the plurality of subbands belonging to the respective group; and
a coding section that encodes, for each subband of the plurality of subbands, the spectral coefficients included in the respective subband using bits allocated to the respective subbands.
(Dependent claims: 2 to 19)
20. A speech or audio decoding apparatus, comprising:
a de-quantization section that de-quantizes a quantized spectral envelope to obtain a dequantized spectral envelope;
a group determining section that splits the quantized spectral envelope into a plurality of groups each group having a plurality of at least two subbands;
a first bit allocation section that allocates bits to each group of the plurality of groups to obtain a group-specific number of bits for each group of the plurality of groups;
a second bit allocation section that allocates, for each group of the plurality of groups, the group-specific number of bits allocated to a respective group of the plurality of groups to the plurality of subbands belonging to the respective group;
a decoding section that decodes, for each subband of the plurality of subbands, encoded spectral coefficients included in a respective subband of a speech or audio signal using the bits allocated to the respective subband to obtain a decoded frequency spectrum;
an envelope shaping section that applies the de-quantized spectral envelope to the decoded frequency spectrum to obtain a shaped spectrum; and
an inverse transformation section that inversely transforms the shaped spectrum from a frequency domain to a time domain.
(Dependent claims: 21 to 23)
24. A speech or audio coding method, comprising:
transforming an input signal from a time domain to a frequency domain to obtain a frequency spectrum comprising spectral coefficients;
estimating an energy envelope that represents an energy level for each subband of a plurality of subbands achieved by splitting the frequency spectrum of the input signal, each subband having at least two spectral coefficients;
quantizing the energy envelope to obtain a quantized energy envelope;
splitting the quantized energy envelopes into a plurality of groups, each group having a plurality of at least two subbands;
allocating, for each group of the plurality of groups, bits to each group of the plurality of groups to obtain a group-specific number of bits for each group of the plurality of groups;
allocating, for each group of the plurality of groups, the group-specific number of bits allocated to a respective group of the plurality of groups to the plurality of subbands belonging to the respective group; and
encoding, for each subband of the plurality of subbands, the spectral coefficients included in the respective subband using bits allocated to the respective subband.
25. A speech or audio decoding method, comprising:
de-quantizing a quantized spectral envelope to obtain a dequantized spectral envelope;
splitting the quantized spectral envelope into a plurality of groups each group having a plurality of at least two subbands;
allocating bits to each group of the plurality of groups to obtain a group-specific number of bits for each group of the plurality of groups;
allocating, for each group of the plurality of groups, the group-specific number of bits allocated to a respective group of the plurality of groups to the plurality of subbands belonging to the respective group;
decoding, for each subband of the plurality of subbands, encoded spectral coefficients included in a respective subband of a speech/audio signal using the bits allocated to the respective subband to obtain a decoded frequency spectrum;
applying the de-quantized spectral envelope to the decoded frequency spectrum to obtain a shaped spectrum; and
inversely transforming the shaped spectrum from a frequency domain to a time domain.
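The envelope steps recited in the claims (estimating a per-subband energy level, quantizing it, and de-quantizing it at the decoder) can be sketched as follows. The claims do not fix a quantizer, so the uniform quantizer in the log2 domain, the step size `step`, the subband boundaries `band_edges`, and the function names are hypothetical choices for illustration only.

```python
import math


def energy_envelope(spectrum, band_edges):
    """Mean energy of the spectral coefficients in each subband.

    `band_edges` lists the subband boundaries as half-open index ranges,
    e.g. [0, 2, 4, 8] describes three subbands.
    """
    envelope = []
    for start, end in zip(band_edges, band_edges[1:]):
        coeffs = spectrum[start:end]
        envelope.append(sum(c * c for c in coeffs) / len(coeffs))
    return envelope


def quantize_envelope(envelope, step=0.5, floor=1e-9):
    """Assumed quantizer: uniform steps in the log2 domain, integer indices."""
    return [round(math.log2(max(e, floor)) / step) for e in envelope]


def dequantize_envelope(indices, step=0.5):
    """Inverse of quantize_envelope, as a decoder would apply it."""
    return [2.0 ** (i * step) for i in indices]


spectrum = [1.0, -1.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5]
indices = quantize_envelope(energy_envelope(spectrum, [0, 2, 4, 8]))
print(indices, dequantize_envelope(indices))
```

Since the coding and decoding claims both derive the groups and the bit allocation from the quantized envelope, the decoder can repeat the encoder's allocation from the transmitted indices alone.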
Specification