CELP-based speech coding for fine grain scalability by altering sub-frame pitch-pulse

US 6,996,522 B2
Filed: 09/13/2001
Issued: 02/07/2006
Est. Priority Date: 03/13/2001
Status: Active Grant

Abstract

Methods and systems for providing a CELP-based speech coding with fine grain scalability include a parameter encoder that generates a basic bit-stream from LPC coefficients for a frame, pitch-related information for all the sub-frames obtained by searching an adaptive codebook, and first pulse-related information for even sub-frames obtained by searching a fixed codebook. The parameter encoder also generates enhancement bits, which are preceded by the basic bit-stream, from second pulse-related information for odd sub-frames. The quality of synthesized speech is improved on a basis of one additional odd sub-frame pulse, as more of the second pulse-related information in the enhancement bits is received by a decoder.
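As a rough illustration of the layering the abstract describes, the sketch below splits one frame's parameters into a basic layer and an enhancement layer. It is not the patent's bit-level format: the `SubFrame` and `Frame` types, the 0-based sub-frame numbering (so "even" means indices 0, 2, ...), and the representation of a pulse as a `(position, sign)` tuple are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pulse = Tuple[int, int]  # (position, sign): hypothetical pulse representation


@dataclass
class SubFrame:
    pitch_lag: int        # pitch-related information (adaptive codebook)
    pulses: List[Pulse]   # pulse-related information (fixed codebook)


@dataclass
class Frame:
    lpc: List[float]            # LPC coefficients for the whole frame
    subframes: List[SubFrame]


def split_layers(frame: Frame):
    """Return (basic, enhancement) for one frame.

    Assumes sub-frames are numbered from 0, so indices 0, 2, ... are "even".
    """
    basic = {
        "lpc": frame.lpc,
        # Pitch information is kept for every sub-frame.
        "pitch": [sf.pitch_lag for sf in frame.subframes],
        # Fixed-codebook pulses are kept only for even sub-frames.
        "pulses": {i: sf.pulses
                   for i, sf in enumerate(frame.subframes) if i % 2 == 0},
    }
    # Odd sub-frame pulses go to the enhancement layer one whole pulse at a
    # time, so a decoder can use any prefix of the list.
    enhancement = [
        (i, pulse)
        for i, sf in enumerate(frame.subframes) if i % 2 == 1
        for pulse in sf.pulses
    ]
    return basic, enhancement
```

Because the enhancement list holds one complete pulse per entry, dropping entries from its end removes refinement pulses without affecting the basic layer.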

18 Claims

1. A method of encoding a speech signal in a code excited linear prediction (CELP)-based speech processing system that includes an adaptive codebook and a fixed codebook, wherein the speech signal is divided into frames and each frame is further divided into sequential sub-frames, the method comprising:
- generating linear prediction coding (LPC) coefficients for a frame;
  
  generating pitch-related information by using the adaptive codebook, for the sequential sub-frames of the frame;
  
  generating fixed-code pulse information by using the fixed codebook, for a plurality of selected sub-frames of the frame;
  
  generating a first bit-stream corresponding to the frame for the LPC coefficients, the pitch-related information, and the fixed-code pulse information for the plurality of selected sub-frames;
  
  generating fixed-code pulse information by using the fixed codebook, for unselected sub-frames; and
  
  separately generating a second bit-stream corresponding to speech enhancement of the frame from the fixed-code pulse information for the unselected sub-frames.
- - 2. The method of claim 1, wherein the first bit-stream provides a minimum quality when synthesized into speech, and the second bit-stream provides improved quality of the synthesized speech.
  - 3. The method of claim 2, wherein the selected sub-frames are even sub-frames of the frame, and the unselected sub-frames are odd sub-frames of the frame.
  - 4. The method of claim 1, further comprising placing the second bit-stream after the first bit-stream.
  - 5. The method of claim 4, wherein the generating of fixed-code pulse information for the unselected sub-frames includes generating information for a plurality of pulses, and in the second bit-stream, placing all information for one pulse before information of another pulse.
  - 6. The method of claim 1, further comprising:
    - using the pulse-related information in addition to the pitch-related information for a selected sub-frame to generate pitch-related information and fixed-code pulse information for a succeeding sub-frame; and
      
      using the pitch-related information without the pulse-related information for an unselected sub-frame to generate pitch-related information and fixed-code pulse information for a succeeding sub-frame.
  - 7. The method of claim 1, further comprising:
    - searching the adaptive codebook and the fixed codebook to minimize a difference between a synthesized speech and a target signal to generate the pitch-related information and the fixed-code pulse information; and
      
      linearly attenuating a magnitude of samples in the target signal for an unselected sub-frame, the number of samples corresponding to the order of an LP-synthesis filter.
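The attenuation step of claim 7 can be pictured with a small sketch. The claim fixes only that the attenuation is linear and that the number of attenuated samples equals the order of the LP-synthesis filter; which end of the sub-frame is attenuated and the exact ramp are not stated in the claim, so the trailing-sample ramp below is an assumption, as is the function name.

```python
from typing import List


def attenuate_target_tail(target: List[float], lp_order: int) -> List[float]:
    """Linearly scale down the last `lp_order` samples of a target signal.

    Assumption: the ramp acts on the trailing samples and falls from just
    under 1 to just above 0; the claim does not specify either detail.
    """
    n = len(target)
    if lp_order <= 0 or lp_order > n:
        raise ValueError("lp_order must be between 1 and len(target)")
    out = list(target)
    start = n - lp_order
    for k in range(lp_order):
        # Gain steps down evenly: lp_order/(lp_order+1), ..., 1/(lp_order+1).
        gain = (lp_order - k) / (lp_order + 1)
        out[start + k] = target[start + k] * gain
    return out
```

For example, with a hypothetical filter order of 3 and a six-sample target of ones, the last three samples are scaled by 3/4, 2/4 and 1/4 while the first three are left unchanged.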

8. A method of synthesizing speech in a code excited linear prediction (CELP)-based speech processing system that includes an adaptive codebook and a fixed codebook, wherein a speech signal is divided into frames and each frame is further divided into sub-frames, the method comprising:
- receiving a basic bit-stream which includes linear prediction coding (LPC) coefficients for a frame, pitch-related information for all sub-frames of the frame, and first pulse-related information for a plurality of selected sub-frames of the frame;
  
  receiving enhancement bits which include second pulse-related information for unselected sub-frames of the frame;
  
  generating an excitation by referring to the adaptive codebook based on the pitch-related information included in the basic bit-stream and by referring to the fixed codebook based on the first pulse-related information included in the basic bit-stream;
  
  generating an excitation by referring to the adaptive codebook based on the pitch-related information included in the basic bit-stream and by referring to the fixed codebook based on the part or the whole of the second pulse-related information included in the enhancement bits; and
  
  outputting synthesized speech according to the excitations and the LPC coefficients.
- - 9. The method of claim 8, wherein the plurality of selected sub-frames are even sub-frames of the frame, and the unselected sub-frames are odd sub-frames of the frame.
  - 10. The method of claim 8, wherein the second pulse-related information includes information for a plurality of pulses, and quality of the synthesized speech is improved each time information for one pulse is added to the enhancement bits received.
  - 11. The method of claim 8, further comprising:
    - feeding back the excitation generated from the first pulse-related information in addition to the pitch-related information, for generating an excitation for a succeeding sub-frame; and
      
      feeding back another excitation generated from the pitch-related information without the second pulse-related information, for generating an excitation for a succeeding sub-frame.
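The feedback rule in claims 8 and 11 can be sketched as follows. This is a toy model, not the patented decoder: the adaptive codebook is a plain list of past excitation samples, all gains are fixed at 1, pulses are `(position, sign)` pairs, and sub-frames are numbered from 0. All of these choices, and the function names, are assumptions.

```python
from typing import List, Sequence, Tuple

Pulse = Tuple[int, int]  # (position, sign): hypothetical pulse representation


def adaptive_vector(history: List[float], lag: int, length: int) -> List[float]:
    """Repeat the last `lag` samples of past excitation to fill a sub-frame."""
    if lag <= 0 or lag > len(history):
        raise ValueError("lag must be between 1 and len(history)")
    segment = history[-lag:]
    return [segment[i % lag] for i in range(length)]


def fixed_vector(pulses: Sequence[Pulse], length: int) -> List[float]:
    """Place unit pulses with the given signs at the given positions."""
    vec = [0.0] * length
    for pos, sign in pulses:
        vec[pos] += float(sign)
    return vec


def decode_subframe(history: List[float], index: int, lag: int,
                    pulses: Sequence[Pulse], length: int) -> List[float]:
    """Return the output excitation and update `history` in place.

    Even sub-frames (basic layer) feed back pitch + pulse excitation.
    Odd sub-frames feed back the pitch excitation only, so the history does
    not depend on how many enhancement pulses were received.
    """
    pitch = adaptive_vector(history, lag, length)
    excitation = [p + f for p, f in zip(pitch, fixed_vector(pulses, length))]
    history.extend(excitation if index % 2 == 0 else pitch)
    return excitation
```

In this model, two decoders that receive different amounts of enhancement data for an odd sub-frame produce different output excitations but end up with identical adaptive-codebook histories, which keeps later sub-frames in step with the encoder.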

12. A speech processing system based on code excited linear prediction (CELP) for encoding a speech signal, wherein the speech signal is divided into frames and each frame is further divided into sub-frames, the system comprising:
- a generator of linear prediction coding (LPC) coefficients for a frame;
  
  a first portion including an adaptive codebook for generating pitch-related information for each sub-frame of the frame;
  
  a second portion including a fixed codebook for generating fixed-code pulse information for each sub-frame of the frame, the pulse-related information including first fixed-code pulse information for a first kind of sub-frame and second fixed-code pulse information for a second kind of sub-frame; and
  
  a parameter encoder for generating a basic bit-stream from the LPC coefficients, the pitch-related information, and the first fixed-code pulse information, and for generating enhancement bits from the second pulse-related information.
- - 13. The system according to claim 12, further comprising a transmitter for transmitting the basic bit-stream and a part of the enhancement bits onto a channel, the part being determined based on traffic of the channel.
  - 14. The system according to claim 12, wherein the pitch-related information is reused in the first portion for a succeeding sub-frame, the first fixed-code pulse information being reused in addition to the pitch-related information, the second fixed-code pulse information not being reused.
  - 15. The system according to claim 12, further comprising:
    - an analysis-by-synthesis loop including a synthesizer for searching the adaptive codebook and the fixed codebook to minimize a difference between a synthesized speech and a target signal; and
      
      a target signal processor for linearly attenuating a magnitude of samples in the target signal provided to the analysis-by-synthesis loop for the second kind of sub-frame, the number of samples corresponding to the order of an LP-synthesis filter.
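Claim 13 leaves open how channel traffic determines the transmitted part of the enhancement bits. One simple possibility, given here only as an assumption, is to send the basic bit-stream in full and then as many whole enhancement pulses as a per-frame bit budget allows. The function name, the budget parameter, the fixed bits-per-pulse figure, and the use of `'0'`/`'1'` strings for bits are all hypothetical.

```python
def select_payload(basic_bits: str, enhancement_bits: str,
                   budget_bits: int, bits_per_pulse: int) -> str:
    """Return the bits to transmit for one frame under a bit budget.

    The basic bit-stream is always sent whole; the enhancement bits are cut
    at a pulse boundary so no partial pulse is transmitted (an assumption,
    not something the claim requires).
    """
    if bits_per_pulse <= 0:
        raise ValueError("bits_per_pulse must be positive")
    if budget_bits < len(basic_bits):
        raise ValueError("budget cannot carry the basic bit-stream")
    spare = budget_bits - len(basic_bits)
    whole_pulses = min(spare, len(enhancement_bits)) // bits_per_pulse
    return basic_bits + enhancement_bits[: whole_pulses * bits_per_pulse]
```

A busy channel would map to a small `budget_bits`, yielding only the basic bit-stream, while an idle channel would carry every enhancement pulse.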

16. A speech processing system based on code excited linear prediction (CELP) for synthesizing speech, wherein a speech signal is divided into frames and each frame is further divided into sub-frames, the system comprising:
- a parameter decoder for extracting linear prediction coding (LPC) coefficients for a frame, pitch-related information for all the sub-frames of the frame, and first pulse-related information for a plurality of selected sub-frames of the frame, from a basic bit-stream received, and for extracting a second pulse-related information for unselected sub-frames of the frame from enhancement bits received;
  
  a first portion including an adaptive codebook for generating an excitation based on the pitch-related information;
  
  a second portion including a fixed codebook for generating an excitation based on the first pulse-related information or based on the second pulse-related information; and
  
  a synthesizer for outputting synthesized speech according to the excitations and the LPC coefficients.
- - 17. The system according to claim 16, wherein the second pulse-related information includes information for a plurality of pulses, and the parameter decoder extracts, from the enhancement bits received, information for each pulse and provides the second portion with the information for each pulse.
  - 18. The system according to claim 16, wherein:
    - the excitation generated from the pitch-related information is fed back to the first portion for a succeeding sub-frame, the excitation generated from the first pulse-related information being fed back in addition to the excitation from the pitch-related information, and the excitation generated from the second pulse-related information not being fed back.

Current Assignee: Industrial Technology Research Institute
Original Assignee: Industrial Technology Research Institute
Inventors: Chen, Fang-Chu
Primary Examiner(s): Chawan, Vijay B.
Application Number: US09/950,633
Publication Number: US 20020133335A1
Time in Patent Office: 1,608 Days
Field of Search: 704/219, 704/264, 704/238, 704/270, 704/267, 704/500, 704/220, 704/222, 704/200.1, 704/203, 704/207, 704/212, 704/223, 704/236, 704/229, 704/208, 704/214, 382/238
US Class Current: 704/219
CPC Class Codes: G10L 19/10 the excitation function bei...
