Scalable speech coding/decoding apparatus, method, and medium having mixed structure

US 20070033023A1
Filed: 07/21/2006
Published: 02/08/2007
Est. Priority Date: 07/22/2005
Status: Active Grant

Abstract

Provided are a scalable wide-band speech coding/decoding apparatus, method, and medium. A wide-band speech input signal is first divided into a low-band signal and a high-band signal. The low-band signal is coded using a code excited linear prediction (CELP) method, and the high-band signal is coded using a harmonic method. For each band, an error signal representing the difference between the band's input signal and the synthetic signal obtained from that band's coder is then coded using a modified discrete cosine transform (MDCT) method. The coded signals are multiplexed and output as a scalable bit-stream. Accordingly, high-quality speech can be achieved for all layers.
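
The band division step in the abstract can be illustrated with a toy two-band split. This is only a sketch: the two-tap Haar sum/difference pair below is an assumed stand-in chosen for brevity, the function names `split_bands` and `combine_bands` are hypothetical, and the patent text here does not specify the actual filter bank or split frequency.

```python
def split_bands(signal):
    """Split an even-length signal into half-rate low-band and high-band signals.

    Uses a Haar sum/difference pair as a stand-in for a real analysis filter bank.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def combine_bands(low, high):
    """Inverse of split_bands: rebuild the full-rate signal from the two bands."""
    if len(low) != len(high):
        raise ValueError("bands must have the same length")
    out = []
    for lo, hi in zip(low, high):
        out.extend((lo + hi, lo - hi))
    return out


signal = [1.0, 3.0, 2.0, 6.0, -4.0, 0.0]
low, high = split_bands(signal)
restored = combine_bands(low, high)
```

With this stand-in the split is exactly invertible, which is the property a band divider and band combiner pair needs so that only the coders, not the split itself, introduce error.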

32 Claims

1. A scalable speech coding apparatus having a mixed structure, the apparatus comprising:
- a band divider dividing a speech input signal into a low-band signal and a high-band signal according to a specific frequency, and outputting the low-band signal and the high-band signal;
  
  a low-band coder outputting a low-band first index by coding the low-band signal, transmitting information required for coding the high-band signal to a high-band coder, and transmitting an uncoded first error signal to a wide-band coder;
  
  a high-band coder outputting a high-band second index obtained when the high-band signal is coded by using information received from the low-band coder, and transmitting an uncoded second error signal to the wide-band coder;
  
  a wide-band coder quantizing coefficients of the first and second error signals using a modified discrete cosine transform (MDCT) method through time-frequency mapping, and outputting a wide-band third index; and
  
  a bit-stream generator outputting a scalable bit-stream composed of the low-band first index received from the low-band coder, the high-band second index received from the high-band coder, and the wide-band third index received from the wide-band coder.
- View Dependent Claims (2, 3, 4, 5, 6)
- - 2. The apparatus of claim 1, wherein the bit-stream is combined with narrow-band information composed of one or more layers obtained by using the low-band first index, and wide-band information composed of one or more layers obtained by using the high-band second index and the wide-band third index.
  - 3. The apparatus of claim 1, wherein:
    - the first error signal is an expression error signal which represents a difference between a low-band signal input to the low-band coder and a first synthetic signal synthesized using an excited signal generated from the low-band coder; and
      
      the second error signal is an expression error signal which represents a difference between a high-band signal input to the high-band coder and a second synthetic signal synthesized using an excited signal generated by the high-band coder using harmonic synthesis.
  - 4. The apparatus of claim 1, wherein the low-band coder generates the low-band first index which is obtained by multiplexing a low-band signal input to the low-band coder using a code excited linear prediction (CELP) method.
  - 5. The apparatus of claim 1, wherein the low-band coder has a CELP structure in which a low-band signal received using the CELP method is filtered, and an excited signal of the filtered low-band signal is generated by searching a fixed codebook and an adaptive codebook.
  - 6. The apparatus of claim 1, wherein:
    - the information required for coding the high-band signal comprises information on low-band pitch delay and information on a low-band excited signal energy; and
      
      the high-band coder uses a harmonic coding method so as to generate the high-band second index obtained by multiplexing a first parameter obtained by quantizing a linear prediction coding coefficient, a second parameter which determines a harmonic component to be coded by using the information on pitch delay received from the low-band coder and which is obtained by quantizing a harmonic phase based on the determined result, and a third parameter obtained by quantizing a high-band effective power by using the information on low-band excited signal energy received from the low-band coder.
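
Claim 6 has the high-band coder use the low-band pitch delay to decide which harmonic components to code. A minimal sketch of one way to do that follows. It assumes the pitch delay is given in samples at a known sampling rate and that the high band covers a fixed frequency range; the function name, the 8000 Hz rate, and the 4000 to 8000 Hz band edges are illustrative assumptions, not values taken from the claims.

```python
import math


def high_band_harmonics(pitch_delay, sample_rate, band_lo_hz, band_hi_hz):
    """Return (harmonic number, frequency in Hz) pairs that fall inside the high band.

    The fundamental is derived from the pitch delay as f0 = sample_rate / pitch_delay.
    """
    if pitch_delay <= 0:
        raise ValueError("pitch delay must be positive")
    f0 = sample_rate / pitch_delay
    k = max(1, math.ceil(band_lo_hz / f0))  # first harmonic at or above the band edge
    harmonics = []
    while k * f0 < band_hi_hz:
        harmonics.append((k, k * f0))
        k += 1
    return harmonics


# Hypothetical values: a 40-sample pitch delay at 8000 Hz gives f0 = 200 Hz.
selected = high_band_harmonics(pitch_delay=40, sample_rate=8000,
                               band_lo_hz=4000, band_hi_hz=8000)
```

Because the decoder also receives the low-band pitch information, it can rebuild the same harmonic positions without any extra side information.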

7. A scalable speech coding method having a mixed structure, the method comprising:
- (a) dividing a speech input signal into a low-band signal and a high-band signal according to a specific frequency, and outputting the low-band signal and the high-band signal;
  
  (b) generating and outputting a low-band first index by coding the output low-band signal, and outputting specific information required for coding the high-band signal and an uncoded first error signal;
  
  (c) coding the output high-band signal by using the specific information, and outputting a high-band second index and an uncoded second error signal;
  
  (d) quantizing coefficients of the first and second error signals using a modified discrete cosine transform (MDCT) through time-frequency mapping, and outputting a wide-band third index; and
  
  (e) outputting a scalable bit-stream composed of the low-band first index, the high-band second index, and the wide-band third index.
- View Dependent Claims (8, 9, 10, 11, 12, 25, 26, 27, 28)
- - 8. The method of claim 7, wherein the bit-stream is combined with narrow-band information composed of one or more layers obtained by using the low-band first index, and wide-band information composed of one or more layers obtained by using the high-band second index and the wide-band third index.
  - 9. The method of claim 7, wherein:
    - the first error signal is an expression error signal which represents a difference between a low-band signal input to the low-band coder generating the first index, and a first synthetic signal synthesized by using an excited signal generated from the low-band coder; and
      
      the second error signal is an expression error signal which represents a difference between a high-band signal input to the high-band coder generating the second index, and a second synthetic signal synthesized by using an excited signal generated by the high-band coder using harmonic synthesis.
  - 10. The method of claim 7, wherein, in (b), the first index is generated by multiplexing a low-band signal input to the low-band coder using a code excited linear prediction (CELP) method.
  - 11. The method of claim 7, wherein:
    - the specific information comprises information on low-band pitch delay and information on a low-band excited signal energy; and
      
      the high-band coder uses a harmonic coding method so as to generate the high-band second index obtained by multiplexing a first parameter obtained by quantizing a linear prediction coding coefficient, a second parameter obtained by quantizing a harmonic phase based on the determined result, and a third parameter obtained by quantizing a high-band effective power using the information on low-band excited signal energy received from the low-band coder.
  - 12. A computer-readable medium comprising computer readable instructions implementing the method of claim 7.
  - 25. A computer-readable medium comprising computer readable instructions implementing the method of claim 8.
  - 26. A computer-readable medium comprising computer readable instructions implementing the method of claim 9.
  - 27. A computer-readable medium comprising computer readable instructions implementing the method of claim 10.
  - 28. A computer-readable medium comprising computer readable instructions implementing the method of claim 11.
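
Step (d) of claim 7 quantizes MDCT coefficients of the error signals. The sketch below shows a direct-form MDCT of a single block followed by a uniform scalar quantizer. It is a simplified illustration: a real codec would add windowing and 50% overlap between blocks, and the sample values, the step size `STEP`, and the function names are hypothetical.

```python
import math


def mdct(block):
    """Direct-form MDCT: maps 2N time samples to N coefficients."""
    if len(block) == 0 or len(block) % 2:
        raise ValueError("block length must be a positive even number")
    n = len(block) // 2
    coeffs = []
    for k in range(n):
        acc = 0.0
        for t, sample in enumerate(block):
            acc += sample * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
        coeffs.append(acc)
    return coeffs


def quantize(coeffs, step):
    """Uniform scalar quantizer: each coefficient becomes an integer index."""
    return [round(c / step) for c in coeffs]


def dequantize(indices, step):
    """Map integer indices back to reconstructed coefficient values."""
    return [i * step for i in indices]


# Hypothetical low-band input and the synthetic signal produced by its coder.
low_band = [0.9, 0.4, -0.3, -0.8, -0.2, 0.5, 0.7, 0.1]
low_synth = [1.0, 0.5, -0.5, -0.5, 0.0, 0.5, 0.5, 0.0]
first_error = [x - s for x, s in zip(low_band, low_synth)]

STEP = 0.05  # hypothetical quantizer step size
coeffs = mdct(first_error)
third_index = quantize(coeffs, STEP)
reconstructed = dequantize(third_index, STEP)
```

The second error signal from the high band would be handled the same way. A smaller step gives a finer enhancement layer at the cost of more bits.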

13. A scalable speech decoding apparatus having a mixed structure, the apparatus comprising:
- a bit-stream divider receiving a scalable bit-stream transmitted at a specific transmission rate according to a network condition, and transmitting the scalable bit-stream to each decoder of a corresponding frequency band by dividing the scalable bit-stream according to a frequency band used in reproduction;
  
  a low-band decoder receiving a low-band signal into which the scalable bit-stream is divided by the bit-stream divider, decoding and outputting the received low-band signal, and transmitting specific information required for decoding a high-band signal among coefficients decoded in a low-band;
  
  a high-band decoder decoding and outputting a high-band signal into which the scalable bit-stream is divided by the bit-stream divider, using the specific information;
  
  a wide-band decoder decoding a wide-band signal into which the scalable bit-stream is divided by the bit-stream divider, and dividing and outputting the decoded wide-band signal into a low-band signal and a high-band signal according to a specific frequency; and
  
  a band combiner outputting a wide-band synthetic signal of a combined band by receiving a first synthetic signal, which is generated when a signal output from the low-band decoder is combined with the low-band signal output from the wide-band decoder, and a second synthetic signal which is generated when a signal output from the high-band decoder is combined with the high-band signal output from the wide-band decoder.
- View Dependent Claims (14, 15, 16)
- - 14. The apparatus of claim 13, wherein the wide-band synthetic signal comprises a low-band output having one or more layers of low-band signal, and a wide-band output having one or more layers of high-band signal and wide-band signal.
  - 15. The apparatus of claim 13, wherein the low-band decoder decodes an input bit-stream using a code excited linear prediction (CELP) method.
  - 16. The apparatus of claim 13, wherein:
    - the specific information comprises a low-band pitch signal; and
      
      the high-band decoder obtains a harmonic position by using the low-band pitch signal, and decodes the received bit-stream by using harmonic information associated with the obtained harmonic position.
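
The bit-stream divider of claim 13 can be pictured as two operations: cutting the scalable stream down to what the network rate allows, then routing what remains to the decoder for each band. The sketch below assumes the stream is an ordered list of layers tagged with a band name; the `Layer` type, the byte budget, and the layer sizes are all hypothetical, since the claims do not define a bit-stream format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    band: str       # "low", "high", or "wide"
    payload: bytes  # coded indices for this layer


def truncate(layers, budget_bytes):
    """Keep the leading layers that fit the budget; later layers are dropped."""
    kept, used = [], 0
    for layer in layers:
        if used + len(layer.payload) > budget_bytes:
            break
        kept.append(layer)
        used += len(layer.payload)
    return kept


def divide(layers):
    """Route each layer's payload to the decoder for its band."""
    routed = {"low": [], "high": [], "wide": []}
    for layer in layers:
        routed[layer.band].append(layer.payload)
    return routed


stream = [
    Layer("low", b"\x01" * 20),   # core layer
    Layer("high", b"\x02" * 6),
    Layer("wide", b"\x03" * 10),  # enhancement layers
    Layer("wide", b"\x04" * 10),
]
received = truncate(stream, budget_bytes=30)
routed = divide(received)
```

Ordering the layers from most to least essential is what lets the stream be cut at any layer boundary and still decode.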

17. A scalable speech decoding method having a mixed structure, the method comprising:
- (a) receiving a scalable bit-stream transmitted at a specific transmission rate according to a network condition, and dividing and outputting the scalable bit-stream into a low-band signal, a high-band signal, and a wide-band signal according to a frequency band used for reproduction;
  
  (b) receiving the low-band signal of the scalable bit-stream, decoding and outputting the received low-band signal, and outputting information on a pitch signal among coefficients decoded in a low-band;
  
  (c) receiving the high-band signal of the scalable bit-stream and the pitch signal information, and decoding and outputting the high-band signal by using the pitch signal information;
  
  (d) receiving and decoding the wide-band signal of the scalable bit-stream, and dividing and outputting the decoded wide-band signal into a low-band signal and a high-band signal according to a specific frequency; and
  
  (e) outputting a wide-band synthetic signal of a combined band by receiving a first synthetic signal, which is generated when a signal output in (b) is combined with a low-band signal output in (d), and a second synthetic signal which is generated when a signal output in (c) is combined with a high-band signal output in (d).
- View Dependent Claims (18, 19, 20, 21, 22, 23, 24)
- - 18. The method of claim 17, wherein the wide-band synthetic signal comprises a low-band output having one or more layers of low-band signal, and a wide-band output having one or more layers of high-band signal and wide-band signal.
  - 19. The method of claim 17, wherein, in (b), an input bit-stream is decoded by using a code excited linear prediction (CELP) method.
  - 20. The method of claim 17, wherein, in (c), a harmonic position is obtained by using the low-band pitch signal, and the received bit-stream is decoded by using harmonic information associated with the obtained harmonic position.
  - 21. A computer-readable medium comprising computer readable instructions implementing the method of claim 17.
  - 22. A computer-readable medium comprising computer readable instructions implementing the method of claim 18.
  - 23. A computer-readable medium comprising computer readable instructions implementing the method of claim 19.
  - 24. A computer-readable medium comprising computer readable instructions implementing the method of claim 20.
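
Step (e) of claim 17 forms each synthetic signal by adding the wide-band decoder's contribution for a band to that band's own decoder output, then merges the two bands. A minimal sketch under stated assumptions: the sample values are made up, the function names are hypothetical, and the Haar-style synthesis in `combine_bands` is an assumed stand-in for whatever band combiner an implementation would use.

```python
def add_enhancement(core, enhancement=None):
    """Add the wide-band decoder's part for one band to the core decoder output."""
    if enhancement is None:  # enhancement layer was not received
        return list(core)
    if len(core) != len(enhancement):
        raise ValueError("core and enhancement must have the same length")
    return [c + e for c, e in zip(core, enhancement)]


def combine_bands(low, high):
    """Merge two half-rate bands into one full-rate signal (Haar synthesis stand-in)."""
    if len(low) != len(high):
        raise ValueError("bands must have the same length")
    out = []
    for lo, hi in zip(low, high):
        out.extend((lo + hi, lo - hi))
    return out


low_core, high_core = [2.0, 4.0], [1.0, 0.0]  # outputs of steps (b) and (c)
low_enh, high_enh = [0.5, -0.5], [0.0, 0.5]   # low and high parts from step (d)

first_synthetic = add_enhancement(low_core, low_enh)
second_synthetic = add_enhancement(high_core, high_enh)
wide_band = combine_bands(first_synthetic, second_synthetic)

# With the wide-band layer missing, the same path still yields a coarser output.
core_only = combine_bands(add_enhancement(low_core), add_enhancement(high_core))
```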

29. A scalable speech coding apparatus having a mixed structure, the apparatus comprising:
- a band divider dividing a speech input signal into a low-band signal and a high-band signal according to a specific frequency, and outputting the low-band signal and the high-band signal;
  
  a low-band coder outputting a low-band first index by coding a low-band signal, outputting information required for coding a high-band signal, and transmitting an uncoded first error signal to a wide-band coder;
  
  a high-band coder outputting a high-band second index obtained when the high-band signal is coded by using outputted information received from the low-band coder, and transmitting an uncoded second error signal to the wide-band coder;
  
  a wide-band coder quantizing coefficients of the first and second error signals using a modified discrete cosine transform (MDCT) method through time-frequency mapping, and outputting a wide-band third index; and
  
  a bit-stream generator outputting a scalable bit-stream composed of the low-band first index received from the low-band coder, the high-band second index received from the high-band coder, and the wide-band third index received from the wide-band coder.
- View Dependent Claims (30)
- - 30. A computer-readable medium comprising computer readable instructions implementing the method of claim 29.

31. A scalable speech decoding method having a mixed structure for decoding a scalable bit-stream, the method comprising:
- (a) receiving a low-band signal of the scalable bit-stream, decoding and outputting the received low-band signal, and outputting information on a pitch signal among coefficients decoded in a low-band;
  
  (b) receiving a high-band signal of the scalable bit-stream and the pitch signal information, and decoding and outputting the high-band signal by using the pitch signal information;
  
  (c) receiving and decoding a wide-band signal of the scalable bit-stream, and dividing and outputting the decoded wide-band signal into a low-band signal and a high-band signal according to a specific frequency; and
  
  (d) outputting a wide-band synthetic signal of a combined band by receiving a first synthetic signal, which is generated when a signal output in (a) is combined with a low-band signal output in (c), and a second synthetic signal which is generated when a signal output in (b) is combined with a high-band signal output in (c).
- View Dependent Claims (32)
- - 32. A computer-readable medium comprising computer readable instructions implementing the method of claim 31.

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: Taori, Rakesh; Lee, Kangeun; Sung, Hosang; Kim, Sangwook

Granted Patent: US 8,271,267 B2
Field of Search
US Class Current: 704/229
CPC Class Codes

G10L 19/0212   using orthogonal transforma...

G10L 19/24   Variable rate codecs, e.g. ...

G10L 25/18   the extracted parameters be...
