Scalable speech coding/decoding apparatus, method, and medium having mixed structure

US 8,271,267 B2
Filed: 07/21/2006
Issued: 09/18/2012
Est. Priority Date: 07/22/2005
Status: Expired due to Fees


Abstract

Provided are a scalable wide-band speech coding/decoding apparatus, method, and medium. A wide-band speech input signal is first divided into a low-band signal and a high-band signal. The low-band signal is coded using a code excited linear prediction (CELP) method, and the high-band signal is coded using a harmonic method. An error signal, representing the difference between the signals input to the low-band and high-band coders and the synthetic signals those coders produce, is then coded using a modified discrete cosine transform (MDCT) method. The coded signals are multiplexed and output. Accordingly, high-quality speech can be achieved for all layers.
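
The layering idea in the abstract can be sketched in a few lines. This is a minimal illustration and not the patented coder: a coarse uniform quantizer stands in for the CELP and harmonic core coders, a finer quantizer stands in for the MDCT stage, and every name and step size (`encode_layers`, `decode_layers`, `core_step`, `enh_step`) is an assumption made for the example.

```python
def quantize(samples, step):
    """Uniform quantizer: map each sample to an integer index."""
    return [round(s / step) for s in samples]


def dequantize(indices, step):
    """Map integer indices back to sample values."""
    return [i * step for i in indices]


def encode_layers(signal, core_step=0.5, enh_step=0.05):
    """Code a core layer, then code the core layer's error as a second layer."""
    core_idx = quantize(signal, core_step)
    synthetic = dequantize(core_idx, core_step)
    # Error signal: input minus the signal synthesized from the core indices.
    error = [s - y for s, y in zip(signal, synthetic)]
    enh_idx = quantize(error, enh_step)
    return core_idx, enh_idx


def decode_layers(core_idx, enh_idx=None, core_step=0.5, enh_step=0.05):
    """Decode the core layer; add the error layer only if it was received."""
    out = dequantize(core_idx, core_step)
    if enh_idx is not None:
        refinement = dequantize(enh_idx, enh_step)
        out = [y + e for y, e in zip(out, refinement)]
    return out


def max_abs_error(a, b):
    """Largest absolute sample difference between two signals."""
    return max(abs(x - y) for x, y in zip(a, b))


signal = [0.12, -0.83, 0.40, 0.95, -0.31, 0.07]
core_idx, enh_idx = encode_layers(signal)
core_only = decode_layers(core_idx)        # lower layer only
both = decode_layers(core_idx, enh_idx)    # lower layer plus error layer
```

A decoder that receives only the core layer still produces output, and adding the error layer reduces the reconstruction error. That is the sense in which such a bit-stream is scalable.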

32 Claims

1. A scalable speech coding apparatus having a mixed structure, the apparatus comprising:
- a band divider to divide a speech input signal into a low-band signal and a high-band signal according to a specific frequency, and to output the low-band signal and the high-band signal;
  
  a low-band coder to output a low-band first index by coding the low-band signal, to transmit information required for coding the high-band signal to a high-band coder, and to transmit a first error signal obtained from the low-band signal and a signal generated during coding the low-band signal;
  
  a high-band coder to output a high-band second index obtained when the high-band signal is coded by using information received from the low-band coder, and to transmit a second error signal obtained from the high-band signal and a signal generated during coding the high-band signal;
  
  a wide-band coder to obtain a wide-band third index from the first and second error signals using a modified discrete cosine transform (MDCT); and
  
  a bit-stream generator to output a scalable bit-stream composed of the low-band first index received from the low-band coder, the high-band second index received from the high-band coder, and the wide-band third index received from the wide-band coder.
- - 2. The apparatus of claim 1, wherein the bit-stream is combined with narrow-band information composed of one or more layers obtained by using the low-band first index, and wide-band information composed of one or more layers obtained by using the high-band second index and the wide-band third index.
  - 3. The apparatus of claim 1, wherein:
    - the first error signal is an expression error signal which represents a difference between a low-band signal input to the low-band coder and a first synthetic signal synthesized using an excited signal generated from the low-band coder; and
      
      the second error signal is an expression error signal which represents a difference between a high-band signal input to the high-band coder and a second synthetic signal synthesized using an excited signal generated by the high-band coder using harmonic synthesis.
  - 4. The apparatus of claim 1, wherein the low-band coder generates the low-band first index which is obtained by multiplexing a low-band signal input to the low-band coder using a code excited linear prediction (CELP) method.
  - 5. The apparatus of claim 1, wherein the low-band coder has a CELP structure in which a high-band signal received using the CELP method is filtered, and an excited signal of the filtered high-band signal is generated by searching for a fixed codebook and an adaptive codebook.
  - 6. The apparatus of claim 1, wherein:
    - the information required for coding the high-band signal comprises information on low-band pitch delay and information on a low-band excited signal energy; and
      
      the high-band coder uses a harmonic coding method so as to generate the high-band second index obtained by multiplexing a first parameter obtained by quantizing a linear prediction coding coefficient, a second parameter which determines a harmonic component to be coded by using the information on pitch delay received from the low-band coder and which is obtained by quantizing a harmonic phase based on the determined result, and a third parameter obtained by quantizing a high-band effective power by using the information on low-band excited signal energy received from the low-band coder.
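
Claim 1 names the MDCT as the transform applied to the error signals. As a hedged illustration of that transform alone, the sketch below implements a textbook windowed MDCT and its inverse with 50% overlap-add. The sine window, the frame size, the 2/N inverse scaling, and the handling of the error as one flat list are assumptions of this example; the claims do not specify them, and no quantization of the coefficients is shown.

```python
import math


def sine_window(n_half):
    """Sine window of length 2*n_half (satisfies the Princen-Bradley condition)."""
    length = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame):
    """Forward MDCT: 2N windowed samples in, N coefficients out."""
    n_half = len(frame) // 2
    n0 = 0.5 + n_half / 2
    return [
        sum(frame[n] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients in, 2N time-aliased samples out."""
    n_half = len(coeffs)
    n0 = 0.5 + n_half / 2
    scale = 2.0 / n_half
    return [
        scale * sum(coeffs[k] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
                    for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analyze_synthesize(signal, n_half):
    """Windowed MDCT then IMDCT over 50%-overlapped frames, with overlap-add.

    len(signal) is assumed to be a multiple of n_half.
    """
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        frame = [padded[start + n] * window[n] for n in range(2 * n_half)]
        block = imdct(mdct(frame))
        for n in range(2 * n_half):
            out[start + n] += block[n] * window[n]  # aliasing cancels here
    return out[n_half:n_half + len(signal)]


# Hypothetical error samples; a real coder would quantize the coefficients.
error = [0.3, -0.1, 0.25, 0.0, -0.4, 0.15, 0.05, -0.2]
coeffs = mdct([s * w for s, w in zip(error, sine_window(4))])
rebuilt = analyze_synthesize(error, 4)
```

Each frame of 2N samples yields only N coefficients, yet the overlap-add of neighbouring frames cancels the time-domain aliasing, so the unquantized round trip returns the input up to floating-point error.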

7. A scalable speech coding method having a mixed structure, the method comprising:
- (a) dividing a speech input signal into a low-band signal and a high-band signal according to a specific frequency, and outputting the low-band signal and the high-band signal;
  
  (b) generating and outputting a low-band first index by coding the output low-band signal, and outputting specific information required for coding the high-band signal and a first error signal obtained from the low-band signal;
  
  (c) coding the output high-band signal by using the specific information, and outputting a high-band second index and a second error signal obtained from the high-band signal;
  
  (d) obtaining a wide-band third index from the first and second error signals using a modified discrete cosine transform (MDCT); and
  
  (e) outputting a scalable bit-stream composed of the low-band first index, the high-band second index, and the wide-band third index.
- - 8. The method of claim 7, wherein the bit-stream is combined with narrow-band information composed of one or more layers obtained by using the low-band first index, and wide-band information composed of one or more layers obtained by using the high-band second index and the wide-band third index.
  - 9. The method of claim 7, wherein:
    - the first error signal is an expression error signal which represents a difference between a low-band signal input to the low-band coder generating the first index, and a first synthetic signal synthesized by using an excited signal generated from the low-band coder; and
      
      the second error signal is an expression error signal which represents a difference between a high-band signal input to the high-band coder generating the second index, and a second synthetic signal synthesized by using an excited signal generated by the high-band coder using harmonic synthesis.
  - 10. The method of claim 7, wherein, in (b), the first index is generated by multiplexing a low-band signal input to the low-band coder using a code excited linear prediction (CELP) method.
  - 11. The method of claim 7, wherein:
    - the specific information comprises information on low-band pitch delay and information on a low-band excited signal energy; and
      
      the low-band coder uses a harmonic coding method so as to generate the high-band second index obtained by multiplexing a first parameter obtained by quantizing a linear prediction coding coefficient, a second parameter obtained by quantizing a harmonic phase based on the determined result, and a third parameter obtained by quantizing a high-band effective power using the information on low-band excited signal energy received from the low-band coder.
  - 12. A non-transitory computer-readable medium comprising computer readable instructions implementing the method of claim 7.
  - 25. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 8.
  - 26. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 9.
  - 27. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 10.
  - 28. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 11.
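
Step (e) of claim 7 outputs a scalable bit-stream built from the three indices. One way to picture this is a layered container in which later layers can be dropped without breaking earlier ones. The byte layout below (a 1-byte layer id, a 2-byte length, then the payload) is entirely hypothetical, as are the function names; the patent does not define a wire format.

```python
import struct

# Core layer first, refinement layers after it (assumed ordering).
LAYER_ORDER = ("low", "high", "wide")
HEADER = ">BH"  # 1-byte layer id, 2-byte big-endian payload length


def pack_bitstream(indices):
    """Concatenate the layers as [id][length][payload] records."""
    out = bytearray()
    for layer_id, name in enumerate(LAYER_ORDER):
        payload = indices[name]
        out += struct.pack(HEADER, layer_id, len(payload)) + payload
    return bytes(out)


def truncate_to_budget(stream, max_bytes):
    """Keep only the whole leading layers that fit within max_bytes."""
    pos = 0
    while pos + 3 <= len(stream):
        _, length = struct.unpack_from(HEADER, stream, pos)
        end = pos + 3 + length
        if end > max_bytes:
            break
        pos = end
    return stream[:pos]


def unpack_bitstream(stream):
    """Split a (possibly truncated) stream back into its named layers."""
    layers = {}
    pos = 0
    while pos + 3 <= len(stream):
        layer_id, length = struct.unpack_from(HEADER, stream, pos)
        layers[LAYER_ORDER[layer_id]] = stream[pos + 3:pos + 3 + length]
        pos += 3 + length
    return layers


indices = {
    "low": b"\x01\x02\x03\x04",   # stand-in for the low-band first index
    "high": b"\x05\x06",          # stand-in for the high-band second index
    "wide": b"\x07\x08\x09",      # stand-in for the wide-band third index
}
full = pack_bitstream(indices)
two_layers = unpack_bitstream(truncate_to_budget(full, 12))
one_layer = unpack_bitstream(truncate_to_budget(full, 8))
```

Placing the low-band layer first means that a sender with a small byte budget can cut the stream at a layer boundary and the receiver still gets a decodable narrow-band layer.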

13. A scalable speech decoding apparatus having a mixed structure, the apparatus comprising:
- a bit-stream divider to receive a scalable bit-stream transmitted at a specific transmission rate according to a network condition, and to generate a low-band signal, a high-band signal, and a wide-band signal by dividing the scalable bit-stream according to a frequency band used in reproduction;
  
  a low-band decoder to receive the low-band signal into which the scalable bitstream is divided by the bit-stream divider, to decode and output the received low-band signal, and to transmit specific information required for decoding a high-band signal among coefficients decoded in a low-band;
  
  a high-band decoder to decode and output the high-band signal into which the scalable bit-stream is divided by the bitstream divider, using the specific information;
  
  a wide-band decoder to decode the wide-band signal into which the scalable bitstream is divided by the bit-stream divider, and to divide and output the decoded wide-band signal into a low-band signal and a high-band signal according to a specific frequency; and
  
  a band combiner to output a wide-band synthetic signal of a combined band using a signal output from the low-band decoder, a signal output from the high-band decoder, the low-band signal output from the wide-band decoder, and the high-band signal output from the wide-band decoder.
- - 14. The apparatus of claim 13, wherein the wide-band synthetic signal comprises a low-band output having one or more layers of low-band signal, and a wide-band output having one or more layers of high-band signal and wide-band signal.
  - 15. The apparatus of claim 13, wherein the low-band decoder decodes an input bit-stream using a code excited linear prediction (CELP) method.
  - 16. The apparatus of claim 13, wherein:
    - the specific information comprises a low-band pitch signal; and
      
      the high-band decoder obtains a harmonic position by using the low-band pitch signal, and decodes the received bit-stream by using harmonic information associated with the obtained harmonic position.

17. A scalable speech decoding method having a mixed structure, the method comprising:
- (a) receiving a scalable bit-stream transmitted at a specific transmission rate according to a network condition, and dividing and outputting the scalable bit-stream into a low-band signal, a high-band signal, and a wide-band signal according to a frequency band used for reproduction;
  
  (b) receiving the low-band signal of the scalable bitstream, decoding and outputting the received low-band signal, and outputting information on a pitch signal among coefficients decoded in a low-band;
  
  (c) receiving the high-band signal of the scalable bitstream and the pitch signal information, and decoding and outputting the high-band signal by using the pitch signal information;
  
  (d) receiving and decoding the wide-band signal of the scalable bitstream, and dividing and outputting the decoded wide-band signal into a low-band signal and a high-band signal according to a specific frequency; and
  
  (e) outputting a wide-band synthetic signal of a combined band by using a signal output in (b), a signal output in (c), a low-band signal output in (d), and a high-band signal output in (d).
- - 18. The method of claim 17, wherein the wide-band synthetic signal comprises a low-band output having one or more layers of low-band signal, and a wide-band output having one or more layers of high-band signal and wide-band signal.
  - 19. The method of claim 17, wherein, in (b), an input bit-stream is decoded by using a code excited linear prediction (CELP) method.
  - 20. The method of claim 17, wherein, in (c), a harmonic position is obtained by using the low-band pitch signal, and the received bit-stream is decoded by using harmonic information associated with the obtained harmonic position.
  - 21. A non-transitory computer-readable medium comprising computer readable instructions implementing the method of claim 17.
  - 22. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 18.
  - 23. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 19.
  - 24. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 20.
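
Step (e) of claim 17 combines four signals: the low-band and high-band decoder outputs, plus the low-band and high-band parts of the decoded wide-band signal. The sketch below shows one possible reading, in which the wide-band parts act as additive refinements before the bands are merged. The two-channel Haar split is a stand-in for whatever filter bank a real coder would use, and all names are hypothetical.

```python
def split_bands(signal):
    """Two-channel Haar split: pairwise averages (low) and half-differences (high)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def combine_bands(low, high):
    """Inverse of split_bands: rebuild each sample pair from its two band values."""
    out = []
    for lo, hi in zip(low, high):
        out.extend((lo + hi, lo - hi))
    return out


def band_combiner(low_dec, high_dec, wide_low=None, wide_high=None):
    """Add the wide-band layer's per-band refinements, if present, then merge."""
    if wide_low is not None:
        low_dec = [a + b for a, b in zip(low_dec, wide_low)]
    if wide_high is not None:
        high_dec = [a + b for a, b in zip(high_dec, wide_high)]
    return combine_bands(low_dec, high_dec)


signal = [1.0, 3.0, -2.0, 4.0, 0.5, 0.5, -1.0, 2.0]
low, high = split_bands(signal)

# Truncation to whole numbers stands in for lossy low-band and high-band decoders.
low_dec = [float(int(v)) for v in low]
high_dec = [float(int(v)) for v in high]

# What the wide-band layer would need to carry to repair each band.
wide_low = [a - b for a, b in zip(low, low_dec)]
wide_high = [a - b for a, b in zip(high, high_dec)]

coarse = band_combiner(low_dec, high_dec)
refined = band_combiner(low_dec, high_dec, wide_low, wide_high)
```

With the wide-band parts omitted the combiner still returns a full-band signal, only a coarser one, which matches the layered outputs described in claim 18.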

29. A scalable speech coding method having a mixed structure, the method comprising:
- dividing a speech input signal into a low-band signal and a high-band signal according to a specific frequency, and outputting the low-band signal and the high-band signal;
  
  outputting a low-band first index by coding a low-band signal, outputting information required for coding a high-band signal, and outputting a first error signal obtained from the low-band signal;
  
  outputting a high-band second index obtained when the high-band signal is coded by using the information required for coding a high-band signal, and outputting a second error signal obtained from the high-band signal;
  
  obtaining a wide-band third index from the first and second error signals using a modified discrete cosine transform (MDCT); and
  
  outputting a scalable bit-stream composed of the low-band first index, the high-band second index, and the wide-band third index.
- - 30. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 29.

31. A scalable speech decoding method having a mixed structure for decoding a scalable bit-stream, the method comprising:
- (a) receiving a low-band signal of the scalable bitstream, decoding and outputting the received low-band signal, and outputting information on a pitch signal among coefficients decoded in a low-band;
  
  (b) receiving a high-band signal of the scalable bitstream and the pitch signal information, and decoding and outputting the high-band signal by using the pitch signal information;
  
  (c) receiving and decoding a wide-band signal of the scalable bitstream, and dividing and outputting the decoded wide-band signal into a low-band signal and a high-band signal according to a specific frequency; and
  
  (d) outputting a wide-band synthetic signal of a combined band by using a signal output in (a), a signal output in (b), a low-band signal output in (c), and a high-band signal output in (c).
- - 32. A non-transitory computer readable medium comprising computer readable instructions implementing the method of claim 31.

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: Sung, Hosang; Kim, Sangwook; Taori, Rakesh; Lee, Kangeun
Primary Examiner(s): Opsasnick, Michael N
Assistant Examiner(s): ORTIZ SANCHEZ, MICHAEL
Application Number: US11/490,139
Publication Number: US 20070033023A1
Time in Patent Office: 2,251 Days
Field of Search: 704/201, 704/219
US Class Current: 704/201
CPC Class Codes:
G10L 19/0212   using orthogonal transforma...
G10L 19/24   Variable rate codecs, e.g. ...
G10L 25/18   the extracted parameters be...
