Encoding device and encoding method, decoding device and decoding method, and program

US 9,842,603 B2
Filed: 08/14/2012
Issued: 12/12/2017
Est. Priority Date: 08/24/2011
Status: Active Grant


Abstract

The present technology relates to an encoding device and an encoding method, a decoding device and a decoding method, and a program, configured to obtain high-quality audio with a smaller encoding amount. A number-of-sections determining feature amount calculating circuit calculates, based on sub-band signals of a plurality of sub-bands constituting an input signal, a number-of-sections determining feature amount. This feature amount is used to determine the number of divisions by which a process target section is divided into continuous frame sections, each including frames for which the same estimation coefficient is selected. A quasi-high frequency sub-band power difference calculating circuit determines the number of continuous frame sections in the process target section based on the number-of-sections determining feature amount, selects, for each continuous frame section, an estimation coefficient for obtaining a high frequency component of the input signal by estimation, and generates data including a coefficient index for obtaining the estimation coefficient. A high frequency encoding circuit encodes the obtained data and generates high frequency encoded data. The present technology can be applied to an encoding device.
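
As an illustration of how a number-of-sections determining feature amount might drive the division count, the following Python sketch computes a per-frame sum of high frequency sub-band powers and maps its largest frame-to-frame change to a section count. The dB power definition, the function names, the thresholds (3.0 and 12.0) and the candidate counts (1, 2, 4) are all assumptions made for this example; the abstract does not specify them.

```python
import math


def subband_power_db(samples):
    """Mean-square power of one sub-band signal in one frame, in dB (assumed definition)."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log10(0)


def high_band_power_sum(frame_high_subbands):
    """Sum of the dB powers of all high frequency sub-bands in one frame."""
    return sum(subband_power_db(sb) for sb in frame_high_subbands)


def feature_amount(section):
    """Largest frame-to-frame change of the high-band power sum within the section."""
    sums = [high_band_power_sum(frame) for frame in section]
    return max((abs(b - a) for a, b in zip(sums, sums[1:])), default=0.0)


def number_of_sections(feature, thresholds=(3.0, 12.0), counts=(1, 2, 4)):
    """Map the feature amount to a section count using hypothetical thresholds."""
    for threshold, count in zip(thresholds, counts):
        if feature < threshold:
            return count
    return counts[-1]


# A steady section (hypothetical data): 4 frames, 2 high sub-bands, 8 samples each.
steady = [[[0.1] * 8, [0.1] * 8] for _ in range(4)]
# A section whose high band jumps in level between frame 1 and frame 2.
transient = [[[0.01] * 8, [0.01] * 8]] * 2 + [[[1.0] * 8, [1.0] * 8]] * 2

steady_count = number_of_sections(feature_amount(steady))
transient_count = number_of_sections(feature_amount(transient))
```

Under these assumptions, a steady section yields a small feature amount and a single continuous frame section, so one coefficient index covers every frame, while a section with an abrupt level change is divided more finely.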


18 Claims

1. An encoding device, comprising:
- processing circuitry configured to perform a process including:
  
  receiving an input audio signal;
  
  generating a low frequency sub-band signal of a sub-band on a low frequency side of the input audio signal and a high frequency sub-band signal of a sub-band on a high frequency side of the input audio signal;
  
  calculating a quasi-high frequency sub-band power that is an estimated value of a high frequency sub-band power of the high frequency sub-band signal based on the low frequency sub-band signal and a predetermined estimation coefficient;
  
  calculating a number-of-sections determining feature amount by calculating a sub-band power sum of the power of the sub-band signal of the sub-bands on the high frequency side of the input signal, wherein the sub-band power sum is an estimated bandwidth of a frame to be processed;
  
  determining the number of continuous frame sections including frames for which the same estimation coefficient is selected in a process target section including a plurality of frames of the input signal, based on the number-of-sections determining feature amount;
  
  selecting the estimation coefficient of a frame that constitutes the continuous frame section from a plurality of estimation coefficients based on the quasi-high frequency sub-band power and the high frequency sub-band power in each continuous frame section obtained by dividing the process target section based on the determined number of continuous frame sections;
  
  generating data for obtaining the estimation coefficient selected in a frame of each of the continuous frame sections constituting the process target section;
  
  encoding a low frequency signal of the input signal to generate low frequency encoded data;
  
  multiplexing the data and the low frequency encoded data to generate an output code string representative of the input audio signal; and
  
  outputting the output code string.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The encoding device according to claim 1, wherein the number-of-sections determining feature amount includes a feature amount indicating a temporal change of a sum of the high frequency sub-band power.
  - 3. The encoding device according to claim 1, wherein the number-of-sections determining feature amount includes a feature amount indicating a frequency profile of the input signal.
  - 4. The encoding device according to claim 1, wherein the number-of-sections determining feature amount includes a linear sum or a nonlinear sum of a plurality of feature amounts.
  - 5. The encoding device according to claim 1, further comprising the processing circuitry calculating, based on an evaluation value indicating an error between the quasi-high frequency sub-band power and the high frequency sub-band power in the frame calculated for each of the estimation coefficients, a sum of the evaluation value of each frame constituting the continuous frame section for each of the estimation coefficients, wherein the selecting includes selecting the estimation coefficient of the frame of the continuous frame section based on the sum of the evaluation value calculated for each of the estimation coefficients.
  - 6. The encoding device according to claim 5, wherein each section obtained by equally dividing the process target section by the determined number of continuous frame sections is defined as the continuous frame section.
  - 7. The encoding device according to claim 5, wherein the selecting includes selecting the estimation coefficient of the frame of the continuous frame section based on the sum of the evaluation value for each combination of divisions of the process target section that can be taken when dividing the process target section by the determined number of continuous frame sections, identifying a combination with which the sum of the evaluation values of the selected estimation coefficients of all the frames constituting the process target section is minimized from among the combinations, and defining the estimation coefficient selected in each frame as the estimation coefficient of the corresponding frame in the identified combination.
  - 8. The encoding device according to claim 1, further comprising the processing circuitry encoding the data to generate high frequency encoded data, wherein the multiplexing includes generating the output code string by multiplexing the high frequency encoded data and the low frequency encoded data.
  - 9. The encoding device according to claim 8, wherein the determining includes calculating an encoding amount of the high frequency encoded data of the process target section based on the determined number of continuous frame sections, and the low frequency encoding includes encoding the low frequency signal with an encoding amount determined from an encoding amount determined in advance for the process target section and the calculated encoding amount of the high frequency encoded data.
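
The selection recited in claims 5 and 6 can be pictured with a small Python sketch: the process target section is divided equally, the evaluation values of each candidate coefficient are summed over the frames of each section, and the coefficient with the smallest sum is chosen. The table layout `evaluation[frame][coefficient]`, the function name and the sample numbers are assumptions for illustration only.

```python
def select_equal_sections(evaluation, num_sections):
    """Pick one coefficient index per equally sized continuous frame section.

    evaluation[f][c] is the (assumed precomputed) evaluation value of
    coefficient c in frame f, i.e. the error between the quasi-high frequency
    sub-band power and the actual high frequency sub-band power.
    Returns a list of (start_frame, end_frame_exclusive, coefficient_index).
    """
    num_frames = len(evaluation)
    if num_sections < 1 or num_frames % num_sections != 0:
        raise ValueError("frame count must be divisible by the number of sections")
    section_length = num_frames // num_sections
    num_coefs = len(evaluation[0])

    sections = []
    for start in range(0, num_frames, section_length):
        end = start + section_length
        # Sum of the evaluation values over the section, per coefficient.
        sums = [
            sum(evaluation[f][c] for f in range(start, end))
            for c in range(num_coefs)
        ]
        best = min(range(num_coefs), key=sums.__getitem__)
        sections.append((start, end, best))
    return sections


# Hypothetical evaluation values: 4 frames, 2 candidate coefficients.
table = [[1, 5], [1, 5], [6, 2], [7, 1]]
two_sections = select_equal_sections(table, 2)
one_section = select_equal_sections(table, 1)
```

With two sections, each half of the process target section keeps the coefficient that fits it best; with one section, a single compromise coefficient is used for all four frames, which costs fewer bits but fits the frames less closely.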

10. An encoding method, comprising:
- receiving, by processing circuitry, an input audio signal;
  
  generating, by the processing circuitry, a low frequency sub-band signal of a sub-band on a low frequency side of the input audio signal and a high frequency sub-band signal of a sub-band on a high frequency side of the input audio signal;
  
  calculating, by the processing circuitry, a quasi-high frequency sub-band power that is an estimated value of a high frequency sub-band power of the high frequency sub-band signal based on the low frequency sub-band signal and a predetermined estimation coefficient;
  
  calculating, by the processing circuitry, a number-of-sections determining feature amount by calculating a sub-band power sum of the power of the sub-band signal of the sub-bands on the high frequency side of the input signal, wherein the sub-band power sum is an estimated bandwidth of a frame to be processed;
  
  determining, by the processing circuitry, the number of continuous frame sections including frames for which the same estimation coefficient is selected in a process target section including a plurality of frames of the input signal, based on the number-of-sections determining feature amount;
  
  selecting, by the processing circuitry, the estimation coefficient of a frame that constitutes the continuous frame section from a plurality of estimation coefficients based on the quasi-high frequency sub-band power and the high frequency sub-band power in each continuous frame section obtained by dividing the process target section based on the determined number of continuous frame sections;
  
  generating, by the processing circuitry, data for obtaining the estimation coefficient selected in a frame of each of the continuous frame sections constituting the process target section;
  
  generating, by the processing circuitry, low frequency encoded data by encoding a low frequency signal of the input signal;
  
  generating, by the processing circuitry, an output code string by multiplexing the data and the low frequency encoded data, the output code string being representative of the input audio signal; and
  
  outputting, by the processing circuitry, the output code string.
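
To make the step of calculating a quasi-high frequency sub-band power more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each estimation coefficient is a set of weights `A` and offsets `B` applied linearly to low frequency sub-band powers in dB, and that the evaluation value is a mean squared difference; the claim itself only states that the estimate is based on the low frequency sub-band signal and a predetermined estimation coefficient.

```python
def quasi_high_power(low_powers_db, coef_set):
    """Estimate each high sub-band power as a weighted sum of low sub-band powers.

    coef_set["A"][ib][kb] weights low sub-band kb for high sub-band ib and
    coef_set["B"][ib] is an offset (assumed linear model, hypothetical layout).
    """
    return [
        sum(a * p for a, p in zip(row, low_powers_db)) + offset
        for row, offset in zip(coef_set["A"], coef_set["B"])
    ]


def evaluation_value(quasi_db, actual_db):
    """Mean squared difference between estimated and actual high sub-band powers."""
    diffs = [q - a for q, a in zip(quasi_db, actual_db)]
    return sum(d * d for d in diffs) / len(diffs)


def evaluate_all(low_powers_db, actual_high_db, coef_table):
    """Evaluation value of every candidate coefficient set for one frame."""
    return [
        evaluation_value(quasi_high_power(low_powers_db, cs), actual_high_db)
        for cs in coef_table
    ]


# Hypothetical frame: two low sub-bands, two high sub-bands, two coefficient sets.
low = [-10.0, -20.0]
actual_high = [-20.0, -25.0]
coef_table = [
    {"A": [[0.5, 0.5], [0.0, 1.0]], "B": [-5.0, -3.0]},
    {"A": [[1.0, 0.0], [1.0, 0.0]], "B": [0.0, 0.0]},
]
estimate = quasi_high_power(low, coef_table[0])
errors = evaluate_all(low, actual_high, coef_table)
```

Collecting `errors` for every frame of a process target section would give a per-frame, per-coefficient table on which the selecting step can operate.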

11. A computer-readable storage device encoded with computer-executable instructions that, when executed by processing circuitry, perform an encoding method comprising:
- receiving an input audio signal;
  
  generating a low frequency sub-band signal of a sub-band on a low frequency side of the input audio signal and a high frequency sub-band signal of a sub-band on a high frequency side of the input audio signal;
  
  calculating a quasi-high frequency sub-band power that is an estimated value of a high frequency sub-band power of the high frequency sub-band signal based on the low frequency sub-band signal and a predetermined estimation coefficient;
  
  calculating a number-of-sections determining feature amount by calculating a sub-band power sum of the power of the sub-band signal of the sub-bands on the high frequency side of the input signal, wherein the sub-band power sum is an estimated bandwidth of a frame to be processed;
  
  determining the number of continuous frame sections including frames for which the same estimation coefficient is selected in a process target section including a plurality of frames of the input signal, based on the number-of-sections determining feature amount;
  
  selecting the estimation coefficient of a frame that constitutes the continuous frame section from a plurality of estimation coefficients based on the quasi-high frequency sub-band power and the high frequency sub-band power in each continuous frame section obtained by dividing the process target section based on the determined number of continuous frame sections;
  
  generating data for obtaining the estimation coefficient selected in a frame of each of the continuous frame sections constituting the process target section;
  
  generating low frequency encoded data by encoding a low frequency signal of the input signal;
  
  generating an output code string by multiplexing the data and the low frequency encoded data, the output code string being representative of the input audio signal; and
  
  outputting the output code string.
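
When the section boundaries are not fixed in advance, the dividing and selecting steps can be combined into a search over candidate divisions, in the spirit of claim 7. The sketch below is one possible brute-force formulation in Python: it enumerates every placement of boundaries for the given number of sections, picks the best coefficient inside each section, and keeps the division with the smallest total. The table layout `evaluation[frame][coefficient]` and all names are hypothetical.

```python
from itertools import combinations


def best_division(evaluation, num_sections):
    """Exhaustively search divisions of the process target section.

    Returns (total_evaluation, sections) where sections is a list of
    (start_frame, end_frame_exclusive, coefficient_index).
    """
    num_frames = len(evaluation)
    num_coefs = len(evaluation[0])

    # prefix[c][f] holds the sum of evaluation values of coefficient c
    # over frames 0 .. f-1, so any section sum is a single subtraction.
    prefix = [[0.0] * (num_frames + 1) for _ in range(num_coefs)]
    for c in range(num_coefs):
        for f in range(num_frames):
            prefix[c][f + 1] = prefix[c][f] + evaluation[f][c]

    def best_for(start, end):
        sums = [prefix[c][end] - prefix[c][start] for c in range(num_coefs)]
        coef = min(range(num_coefs), key=sums.__getitem__)
        return sums[coef], coef

    best_total, best_sections = None, None
    # Choose num_sections - 1 interior boundaries among the frame borders.
    for cuts in combinations(range(1, num_frames), num_sections - 1):
        bounds = (0,) + cuts + (num_frames,)
        total, sections = 0.0, []
        for start, end in zip(bounds, bounds[1:]):
            cost, coef = best_for(start, end)
            total += cost
            sections.append((start, end, coef))
        if best_total is None or total < best_total:
            best_total, best_sections = total, sections
    return best_total, best_sections


# Hypothetical evaluation values: 4 frames, 2 candidate coefficients.
table = [[1, 5], [6, 2], [7, 1], [8, 1]]
total, sections = best_division(table, 2)
```

The number of candidate divisions grows combinatorially with the frame count, so this exhaustive form is only practical for short process target sections; a dynamic-programming variant would be a natural alternative for longer ones.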

12. A decoding device, comprising:
- processing circuitry configured to perform a process including:
  
  receiving an input code string representative of an audio signal;
  
  demultiplexing the input code string into data for obtaining an estimation coefficient selected in a frame of each continuous frame section constituting a process target section, which is generated based on a result of calculating an estimated value of a high frequency sub-band power of a high frequency sub-band signal of the audio signal based on a low frequency sub-band signal of the audio signal and a predetermined estimation coefficient, determining the number of continuous frame sections including frames for which the same estimation coefficient is selected in the process target section including a plurality of frames of the audio signal based on a number-of-sections determining feature amount extracted from the audio signal, wherein the number-of-sections determining feature amount is calculated by calculating a sub-band power sum of the power of the sub-band signal of the sub-bands on the high frequency side of the input signal, wherein the sub-band power sum is an estimated bandwidth of a frame to be processed, and selecting the estimation coefficient of a frame constituting the continuous frame section from a plurality of estimation coefficients based on the estimated value and the high frequency sub-band power in each of the continuous frame sections obtained by dividing the process target section based on the determined number of continuous frame sections, and low frequency encoded data obtained by encoding a low frequency signal of the input signal;
  
  decoding the low frequency encoded data to generate a low frequency signal;
  
  generating a high frequency signal based on the estimation coefficient obtained from the data and the low frequency signal obtained from the decoding;
  
  generating the audio signal based on the high frequency signal and the low frequency signal obtained from the decoding; and
  
  outputting the audio signal.
- Dependent claims (13, 14, 15, 16)
  - 13. The decoding device according to claim 12, further comprising the processing circuitry decoding the data to obtain the estimation coefficient.
  - 14. The decoding device according to claim 13, wherein based on an evaluation value indicating an error between the estimated value and the high frequency sub-band power in the frame calculated for each of the estimation coefficients, a sum of the evaluation value of each frame constituting the continuous frame section is calculated for each of the estimation coefficients, and based on the sum of the evaluation value calculated for each of the estimation coefficients, the estimation coefficient of the frame of the continuous frame section is selected.
  - 15. The decoding device according to claim 14, wherein each section obtained by equally dividing the process target section by the determined number of continuous frame sections is defined as the continuous frame section.
  - 16. The decoding device according to claim 14, wherein the estimation coefficient of the frame of the continuous frame section is selected based on the sum of the evaluation value for each combination of divisions of the process target section that can be taken when dividing the process target section by the determined number of continuous frame sections, a combination with which the sum of the evaluation values of the selected estimation coefficients of all the frames constituting the process target section is minimized is identified from among the combinations, and the estimation coefficient selected in each frame is defined as the estimation coefficient of the corresponding frame in the identified combination.
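
On the decoding side, the data for obtaining the estimation coefficient has to be turned back into one coefficient per frame. The following Python sketch assumes a hypothetical representation in which the decoded data is a list of `(section_length, coefficient_index)` pairs and the decoder holds the same coefficient table as the encoder; neither detail is stated in the claims.

```python
def expand_coefficient_indices(section_data, num_frames):
    """Expand per-section data into one coefficient index per frame.

    section_data is a list of (section_length, coefficient_index) pairs
    (hypothetical format) covering the whole process target section.
    """
    per_frame = []
    for length, index in section_data:
        per_frame.extend([index] * length)
    if len(per_frame) != num_frames:
        raise ValueError("section lengths do not add up to the frame count")
    return per_frame


def coefficients_per_frame(section_data, num_frames, coef_table):
    """Look up the estimation coefficient used in each frame."""
    indices = expand_coefficient_indices(section_data, num_frames)
    return [coef_table[i] for i in indices]


# Hypothetical example: 4 frames split into two continuous frame sections.
coef_table = ["set A", "set B", "set C", "set D"]
data = [(2, 0), (2, 3)]
indices = expand_coefficient_indices(data, 4)
per_frame_sets = coefficients_per_frame(data, 4, coef_table)
```

Because every frame in a continuous frame section shares one index, only one index per section needs to be transmitted, which is where the saving in encoding amount comes from.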

17. A decoding method, comprising:
- receiving, by processing circuitry, an input code string representative of an audio signal;
  
  demultiplexing, by the processing circuitry, the input code string into data for obtaining an estimation coefficient selected in a frame of each continuous frame section constituting a process target section, which is generated based on a result of calculating an estimated value of a high frequency sub-band power of a high frequency sub-band signal of the audio signal based on a low frequency sub-band signal of the audio signal and a predetermined estimation coefficient, determining the number of continuous frame sections including frames for which the same estimation coefficient is selected in the process target section including a plurality of frames of the audio signal based on a number-of-sections determining feature amount extracted from the audio signal, wherein the number-of-sections determining feature amount is calculated by calculating a sub-band power sum of the power of the sub-band signal of the sub-bands on the high frequency side of the input signal, wherein the sub-band power sum is an estimated bandwidth of a frame to be processed, and selecting the estimation coefficient of a frame constituting the continuous frame section from a plurality of estimation coefficients based on the estimated value and the high frequency sub-band power in each of the continuous frame sections obtained by dividing the process target section based on the determined number of continuous frame sections, and low frequency encoded data obtained by encoding a low frequency signal of the input signal;
  
  generating, by the processing circuitry, a low frequency signal by decoding the low frequency encoded data;
  
  generating, by the processing circuitry, a high frequency signal based on the estimation coefficient obtained from the data and the low frequency signal obtained from the decoding;
  
  generating, by the processing circuitry, the audio signal based on the high frequency signal and the low frequency signal obtained from the decoding; and
  
  outputting, by the processing circuitry, the audio signal.
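
One common way to generate a high frequency sub-band signal from a decoded low frequency signal is to reuse a low sub-band signal as the source and scale it so that its power matches the estimated high sub-band power. The Python sketch below shows only that gain step, under the assumption that powers are mean-square values in dB; the claim does not prescribe this particular procedure, and the function names are hypothetical.

```python
import math


def power_db(samples):
    """Mean-square power of a sub-band signal in dB (assumed definition)."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log10(0)


def shape_high_subband(source_samples, target_power_db):
    """Scale a source sub-band signal so that its power equals the target in dB."""
    # A power difference of d dB corresponds to an amplitude gain of 10 ** (d / 20).
    gain = 10.0 ** ((target_power_db - power_db(source_samples)) / 20.0)
    return [gain * x for x in source_samples]


# Hypothetical decoded low sub-band samples and an estimated target power
# 20 dB below the source power.
source = [0.5, -0.5, 0.5, -0.5]
target = power_db(source) - 20.0
high = shape_high_subband(source, target)
```

In this example the target lies 20 dB below the source power, so every sample is scaled by a factor of about 0.1.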

18. A computer-readable storage device encoded with computer-executable instructions that, when executed by processing circuitry, perform an encoding method comprising:
- receiving an input code string representative of an audio signal;
  
  demultiplexing the input code string into data for obtaining an estimation coefficient selected in a frame of each continuous frame section constituting a process target section, which is generated based on a result of calculating an estimated value of a high frequency sub-band power of a high frequency sub-band signal of the audio signal based on a low frequency sub-band signal of the audio signal and a predetermined estimation coefficient, determining the number of continuous frame sections including frames for which the same estimation coefficient is selected in the process target section including a plurality of frames of the audio signal based on a number-of-sections determining feature amount extracted from the audio signal, wherein the number-of-sections determining feature amount is calculated by calculating a sub-band power sum of the power of the sub-band signal of the sub-bands on the high frequency side of the input signal, wherein the sub-band power sum is an estimated bandwidth of a frame to be processed, and selecting the estimation coefficient of a frame constituting the continuous frame section from a plurality of estimation coefficients based on the estimated value and the high frequency sub-band power in each of the continuous frame sections obtained by dividing the process target section based on the determined number of continuous frame sections, and low frequency encoded data obtained by encoding a low frequency signal of the input signal;
  
  generating a low frequency signal by decoding the low frequency encoded data;
  
  generating a high frequency signal based on the estimation coefficient obtained from the data and the low frequency signal obtained from the decoding;
  
  generating the audio signal based on the high frequency signal and the low frequency signal obtained from the decoding; and
  
  outputting the audio signal.
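
The multiplexing and demultiplexing steps can be illustrated with a toy container format. The layout below (a 2-byte big-endian length field, then the high frequency data, then the low frequency encoded data) is an assumption made for this Python sketch and is not the code string format of the patent.

```python
import struct


def multiplex(high_data: bytes, low_data: bytes) -> bytes:
    """Build a code string: 16-bit length of high_data, high_data, low_data."""
    if len(high_data) > 0xFFFF:
        raise ValueError("high frequency data too long for a 16-bit length field")
    return struct.pack(">H", len(high_data)) + high_data + low_data


def demultiplex(code_string: bytes):
    """Split a code string produced by multiplex() back into its two parts."""
    if len(code_string) < 2:
        raise ValueError("code string is too short to hold the length field")
    (high_len,) = struct.unpack_from(">H", code_string, 0)
    if len(code_string) < 2 + high_len:
        raise ValueError("code string is shorter than its declared high data length")
    high_data = code_string[2:2 + high_len]
    low_data = code_string[2 + high_len:]
    return high_data, low_data


# Hypothetical payloads: two bytes of coefficient data and a low band payload.
code = multiplex(b"\x01\x02", b"LOW")
high_part, low_part = demultiplex(code)
```

A length prefix is used here so the decoder can find the boundary between the two parts without parsing either payload.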


Current Assignee: Sony Corporation (Sony Group Corp.)
Original Assignee: Sony Corporation (Sony Group Corp.)
Inventors: Yamamoto, Yuki; Chinen, Toru
Primary Examiner(s): Mishra, Richa
Application Number: US14/236,350
Publication Number: US 20140200899A1
Time in Patent Office: 1,946 Days
Field of Search: None

CPC Class Codes

G10L 19/0204   using subband decomposition
G10L 19/022   Blocking, i.e. grouping of ...
G10L 19/265   Pre-filtering, e.g. high fr...
G10L 21/038   using band spreading techni...
G10L 25/21   the extracted parameters be...
