Encoding device, decoding device, and method thereof for specifying a band of a great error

US 8,935,161 B2
Filed: 08/14/2013
Issued: 01/13/2015
Est. Priority Date: 03/02/2007
Status: Active Grant

Abstract

Disclosed is an encoding device which can accurately specify a band having a large error among all the bands by using a small calculation amount. A first position identifier uses a first layer error conversion coefficient indicating an error of a decoding signal for an input signal so as to search for a band having a large error in a relatively wide bandwidth in all the bands of the input signal and generates first position information indicating the identified band. A second position identifier searches for a target frequency band having a large error in a relatively narrow bandwidth in the band identified by the first position identifier and generates second position information indicating the identified target frequency band. An encoder encodes a first layer decoding error conversion coefficient contained in the target frequency band.

8 Claims

1. A speech encoding apparatus, comprising:
- a first layer encoder that performs encoding processing, using a processor, with respect to an input speech signal to generate first layer encoded data;
  
  a first layer decoder that performs decoding processing, using the processor, using the first layer encoded data to generate a first layer decoded signal;
  
  a first layer error transform coefficient calculator that transforms, using the processor, a first layer error signal which is an error between the input speech signal and the first layer decoded signal into a frequency domain to calculate first layer error transform coefficients; and
  
  a second layer encoder that performs encoding processing, using the processor, with respect to the first layer error transform coefficients to generate second layer encoded data, wherein the second layer encoder:
  
  selects a first band from among a plurality of band candidates, which have a predetermined bandwidth and are arranged based on a first step size narrower than the predetermined bandwidth, based on a magnitude of energy of the first layer error transform coefficients in the plurality of band candidates, and generates first position information showing a position of the selected first band;
  
  specifies positions of a plurality of pulses from among pulse candidate positions, which are set based on a second step size narrower than the first step size in the selected first band, and generates second position information showing the specified positions of the plurality of pulses; and
  
  generates the second layer encoded data using the first position information and the second position information.
- Dependent claims (2, 3, 4)
  - 2. The speech encoding apparatus according to claim 1, wherein the second layer encoder specifies a position of a pulse based on the magnitude of energy of the first layer error transform coefficients.
  - 3. The speech encoding apparatus according to claim 1, wherein the second layer encoder generates gain information showing an amplitude of a pulse at a pulse position based on the first layer error transform coefficients, and the second layer encoder generates the second layer encoded data further using the gain information.
  - 4. The speech encoding apparatus according to claim 1, wherein the second layer encoder selects the first band from a low-frequency portion lower than a predetermined reference frequency.
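The second layer search recited in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the dictionary layout of the output, the choice of the highest-energy band and highest-energy pulse candidates, and the use of raw coefficient values as gain information are all assumptions made for illustration. The sketch assumes only what the claims state, namely that the band step is narrower than the bandwidth and the pulse step is narrower than the band step.

```python
from typing import Dict, List


def select_band(coeffs: List[float], bandwidth: int, band_step: int) -> int:
    """Return the start index of the candidate band with the largest energy."""
    if not 0 < band_step < bandwidth <= len(coeffs):
        raise ValueError("need 0 < band_step < bandwidth <= len(coeffs)")
    best_start, best_energy = 0, -1.0
    # Candidate bands overlap because band_step is narrower than bandwidth.
    for start in range(0, len(coeffs) - bandwidth + 1, band_step):
        energy = sum(c * c for c in coeffs[start:start + bandwidth])
        if energy > best_energy:
            best_start, best_energy = start, energy
    return best_start


def encode_second_layer(coeffs: List[float], bandwidth: int, band_step: int,
                        pulse_step: int, num_pulses: int) -> Dict[str, object]:
    """Hypothetical second layer encoder: band position, pulse positions, gains."""
    if not 0 < pulse_step < band_step:
        raise ValueError("need 0 < pulse_step < band_step")
    band_start = select_band(coeffs, bandwidth, band_step)
    # Pulse candidates sit on a finer grid inside the selected band only.
    candidates = range(band_start, band_start + bandwidth, pulse_step)
    ranked = sorted(candidates, key=lambda k: coeffs[k] ** 2, reverse=True)
    positions = sorted(ranked[:num_pulses])
    return {
        "band_index": band_start // band_step,                  # first position information
        "pulse_offsets": [k - band_start for k in positions],   # second position information
        "gains": [coeffs[k] for k in positions],                # gain information (claim 3)
    }


if __name__ == "__main__":
    error_coeffs = [0.1, 0.0, 0.2, 0.1, 0.9, -1.5, 0.3, 2.0, 0.1, 0.0, 0.1, 0.2]
    print(encode_second_layer(error_coeffs, bandwidth=4, band_step=2,
                              pulse_step=1, num_pulses=2))
```

With the toy coefficients above, the band starting at index 4 has the largest energy, so the sketch reports band index 2 and pulse offsets 1 and 3 within that band. Restricting the fine pulse search to one band is what keeps the number of evaluated positions small compared with searching every coefficient.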

5. A speech decoding apparatus, comprising:
- a receiver that receives, using a processor:
  
  first layer encoded data acquired in a speech encoder by performing encoding processing with respect to an input speech signal; and
  
  second layer encoded data acquired in the speech encoder by transforming a first layer error signal which is an error between a first layer decoded signal obtained by decoding the first layer encoded data and the input speech signal into a frequency domain to calculate first layer error transform coefficients and by performing encoding processing with respect to the first layer error transform coefficients;
  
  a first layer decoder that decodes, using the processor, the first layer encoded data to generate the first layer decoded signal;
  
  a second layer decoder that decodes, using the processor, the second layer encoded data to generate first layer decoded error transform coefficients;
  
  a time domain transformer that transforms, using the processor, the first layer decoded error transform coefficients into a time domain to generate a first layer decoded error signal; and
  
  an adder that adds the first layer decoded signal and the first layer decoded error signal to generate a decoded signal, wherein the second layer decoder:
  
  decodes the second layer encoded data to generate first position information showing a position of a first band having a predetermined bandwidth and second position information showing positions of a plurality of pulses in the first band; and
  
  specifies the positions of the plurality of pulses using the first position information and the second position information to generate the first layer decoded error transform coefficients.
- Dependent claims (6)
  - 6. The speech decoding apparatus according to claim 5, wherein the second layer decoder decodes the second layer encoded data to generate gain information showing an amplitude of a pulse, and generates the first layer decoded error transform coefficients further using the gain information.
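The second layer decoder of claims 5 and 6 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the function name, the representation of the first position information as a band index counted in units of a hypothetical `band_step`, and the representation of the second position information as offsets from the band start are all invented for the example. The time domain transform and the adder of claim 5 are omitted.

```python
from typing import List, Sequence


def decode_second_layer(band_index: int, pulse_offsets: Sequence[int],
                        gains: Sequence[float], band_step: int,
                        bandwidth: int, num_coeffs: int) -> List[float]:
    """Rebuild first layer decoded error transform coefficients from pulses."""
    if len(pulse_offsets) != len(gains):
        raise ValueError("each pulse needs exactly one gain")
    # First position information locates the band on the coarse grid.
    band_start = band_index * band_step
    if band_start + bandwidth > num_coeffs:
        raise ValueError("band does not fit inside the coefficient vector")
    coeffs = [0.0] * num_coeffs
    for offset, gain in zip(pulse_offsets, gains):
        if not 0 <= offset < bandwidth:
            raise ValueError("pulse offset outside the selected band")
        # Second position information places each pulse inside that band.
        coeffs[band_start + offset] = gain
    return coeffs


if __name__ == "__main__":
    print(decode_second_layer(band_index=2, pulse_offsets=[1, 3],
                              gains=[-1.5, 2.0], band_step=2,
                              bandwidth=4, num_coeffs=12))
```

Every coefficient outside the signalled pulse positions stays at zero, so the decoder needs only the two pieces of position information and the gains to place the pulses.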

7. A speech encoding method, comprising:
- performing encoding processing, by a processor, with respect to an input speech signal to generate first layer encoded data;
  
  performing decoding processing, by the processor, using the first layer encoded data to generate a first layer decoded signal;
  
  transforming, by the processor, a first layer error signal which is an error between the input speech signal and the first layer decoded signal into a frequency domain to calculate first layer error transform coefficients; and
  
  performing encoding processing, by the processor, with respect to the first layer error transform coefficients to generate second layer encoded data, wherein the encoding processing with respect to the first layer error transform coefficients comprises:
  
  selecting a first band from among a plurality of band candidates, which have a predetermined bandwidth and are arranged based on a first step size narrower than the predetermined bandwidth, based on a magnitude of energy of the first layer error transform coefficients in the plurality of band candidates, and generating first position information showing a position of the selected first band;
  
  specifying positions of a plurality of pulses from among pulse candidate positions, which are set based on a second step size narrower than the first step size in the selected first band, and generating second position information showing the specified positions of the plurality of pulses; and
  
  generating the second layer encoded data using the first position information and the second position information.
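The first three steps of claim 7 (first layer encoding, local decoding, and transforming the error into the frequency domain) can be sketched as follows. The claims do not name a first layer codec or a specific transform, so this sketch substitutes stand-ins: a uniform scalar quantizer with a hypothetical step `STEP` plays the role of the first layer, and an unnormalized DCT-II plays the role of the frequency transform. All names are assumptions for illustration.

```python
import math
from typing import List

STEP = 0.5  # hypothetical quantizer step of the stand-in first layer codec


def first_layer_encode(signal: List[float]) -> List[int]:
    """Stand-in first layer encoder: uniform scalar quantization."""
    return [round(x / STEP) for x in signal]


def first_layer_decode(indices: List[int]) -> List[float]:
    """Stand-in first layer decoder: map quantizer indices back to amplitudes."""
    return [i * STEP for i in indices]


def dct_ii(x: List[float]) -> List[float]:
    """Unnormalized DCT-II, used here as an example frequency transform."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
            for k in range(n)]


def first_layer_error_coefficients(signal: List[float]) -> List[float]:
    """Encode, locally decode, and transform the error to the frequency domain."""
    encoded = first_layer_encode(signal)
    decoded = first_layer_decode(encoded)
    # First layer error signal: input minus first layer decoded signal.
    error = [s - d for s, d in zip(signal, decoded)]
    return dct_ii(error)


if __name__ == "__main__":
    print(first_layer_error_coefficients([0.2, 0.7, -0.4, 1.1]))
```

The coefficients returned here are what the second layer encoding step of claim 7 would then search for a high-energy band.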

8. A speech decoding method, comprising:
- receiving, by a processor:
  
  first layer encoded data acquired using a speech encoding method by performing encoding processing with respect to an input speech signal; and
  
  second layer encoded data acquired using the speech encoding method by transforming a first layer error signal which is an error between a first layer decoded signal obtained by decoding the first layer encoded data and the input speech signal into a frequency domain to calculate first layer error transform coefficients and by performing encoding processing with respect to the first layer error transform coefficients;
  
  decoding, by the processor, the first layer encoded data to generate the first layer decoded signal;
  
  decoding, by the processor, the second layer encoded data to generate first layer decoded error transform coefficients;
  
  transforming, by the processor, the first layer decoded error transform coefficients into a time domain to generate a first layer decoded error signal; and
  
  adding, by the processor, the first layer decoded signal and the first layer decoded error signal to generate a decoded signal, wherein in the decoding of the second layer encoded data:
  
  the second layer encoded data is decoded to generate first position information showing a position of a first band having a predetermined bandwidth and second position information showing positions of a plurality of pulses in the first band; and
  
  the positions of the plurality of pulses are specified using the first position information and the second position information to generate the first layer decoded error transform coefficients.

Current Assignee
Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Original Assignee
Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Inventors
Oshikiri, Masahiro; Yamanashi, Tomofumi; Morii, Toshiyuki
Primary Examiner(s)
Chawan, Vijay B

Application Number

US13/966,819
Publication Number

US 20140019144A1
Time in Patent Office

517 Days
Field of Search

704/205, 704/230, 704/500-504, 704/203, 704/207, 704/225, 704/229, 704/219, 704/226-228, 370/401, 370/468, 375/240.11, 700/94
US Class Current

704/230
CPC Class Codes

G10L 19/00   Speech or audio signals ana...

G10L 19/005   Correction of errors induce...

G10L 19/0204   using subband decomposition

G10L 19/0208   Subband vocoders

G10L 19/0212   using orthogonal transforma...

G10L 19/24   Variable rate codecs, e.g. ...
