Low bit-rate speech coder using adaptive open-loop subframe pitch lag estimation and vector quantization
Abstract
A pitch lag coding device and method using interframe correlation inherent in pitch lag values to reduce coding bit requirements. A pitch lag value is extracted for a given speech frame, and then refined for each subframe. For every speech frame having N samples of speech, LPC analysis and vector quantization are performed for the whole coding frame. The LPC residual obtained for each frame is then processed such that pitch lag values for all subframes within the coding frame are analyzed concurrently. The remaining coding parameters, i.e., the codebook search, gain parameters, and excitation signal, are then analyzed sequentially according to their respective subframes.
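As an illustration of quantizing the subframe pitch lags jointly rather than one at a time, the following is a minimal sketch of a nearest-neighbor vector quantizer. The codebook contents, the three-subframe frame layout, the squared-error distance, and the function name `quantize_lag_vector` are all assumptions made for the example; the abstract does not specify them.

```python
def quantize_lag_vector(lags, codebook):
    """Return (index, codevector) of the codebook entry nearest to `lags`.

    Distance is squared Euclidean error (an assumed criterion).
    """
    best_index = min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(lags, codebook[i])),
    )
    return best_index, codebook[best_index]


# Hypothetical 2-bit codebook for a frame of three subframes (lags in samples).
CODEBOOK = [
    (40, 40, 40),
    (40, 42, 44),
    (60, 60, 60),
    (60, 58, 56),
]

# Unquantized lags estimated for the three subframes of one frame.
index, quantized = quantize_lag_vector((41.0, 42.5, 43.0), CODEBOOK)
```

Only `index` would need to be transmitted for the whole frame, which is where the bit saving over coding each subframe lag separately comes from.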
16 Claims
1. A system for coding speech, the speech being represented as plural speech samples segregated into a frame, the frame being formed of a plurality of subframes, wherein linear predictive coding (LPC) analysis and quantization of the speech samples in the frame are performed to determine an LPC residual signal, the system comprising:
lag means for estimating an unquantized pitch lag value within a predetermined minimum-allowed pitch lag and a predetermined maximum-allowed pitch lag for each subframe within the frame;
means for obtaining a pitch lag vector comprising the unquantized pitch lag values for each subframe within the frame;
a vector quantizer for quantizing the pitch lag vector to generate a quantized pitch lag vector;
means for determining a pitch contribution vector for a current subframe, the pitch contribution vector being adapted to the quantized pitch lag vector;
codebook means for generating an excitation signal representative of the speech samples of the current subframe; and
means for applying the excitation signal of each current subframe to subsequent subframes to provide coded speech for the frame.
Dependent claims: 2, 3, 4
means for estimating an open-loop pitch lag value based on the LPC residual signal for the frame of speech;
means for generating an excitation vector representing speech samples of a first current subframe within the frame, including:
means for constructing an LPC residual signal vector, at least one filter for filtering the signal vector to produce a target signal, and means for considering a pitch lag value within the predetermined minimum and maximum-allowed pitch lags, such that the excitation vector is obtained according to the past LPC residual signal and the considered pitch lag value; and
a perceptual filter for filtering the excitation vector to obtain a pitch prediction vector, wherein the unquantized pitch lag value is estimated according to the pitch prediction vector and the target signal.
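The search described in the preceding elements can be sketched roughly as follows: for each candidate lag in the allowed range, an excitation vector is read from the past residual, filtered to give a pitch prediction vector, and compared with the target. The claim does not state the comparison criterion, so the normalized cross-correlation used here is an assumption, as are the FIR stand-in for the perceptual filter and all function names.

```python
def past_excitation(history, lag, n):
    """Build n samples by reading `lag` samples back into `history`,
    repeating that segment periodically when lag < n."""
    start = len(history) - lag
    return [history[start + (i % lag)] for i in range(n)]


def fir_filter(x, h):
    """Zero-state FIR filtering of x by impulse response h (output length len(x))."""
    return [
        sum(h[k] * x[i - k] for k in range(min(i + 1, len(h))))
        for i in range(len(x))
    ]


def estimate_lag(history, target, h, lag_min, lag_max):
    """Return the lag whose filtered prediction best matches `target`.

    Score is corr^2 / energy with corr > 0 required (assumed criterion).
    Ties keep the smaller lag.
    """
    best_lag, best_score = None, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        pred = fir_filter(past_excitation(history, lag, len(target)), h)
        energy = sum(p * p for p in pred)
        corr = sum(p * t for p, t in zip(pred, target))
        if energy == 0.0 or corr <= 0.0:
            continue
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy data: an impulse train with period 7 stands in for the LPC residual.
history = [1.0 if i % 7 == 0 else 0.0 for i in range(70)]
h = [1.0, 0.5]  # hypothetical filter impulse response
current = [1.0 if (70 + i) % 7 == 0 else 0.0 for i in range(10)]
target = fir_filter(current, h)

lag = estimate_lag(history, target, h, lag_min=5, lag_max=20)
```

In this toy case lags 7 and 14 score equally, and the strict comparison keeps the smaller one. A practical coder would need an explicit policy for such pitch-multiple ambiguities.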
3. The system of claim 1, wherein the codebook means comprises a codebook having plural codevectors individually representative of characteristics of the speech, each codevector having an associated gain, further wherein the codevector which best represents the speech samples in the current subframe is selected to generate the excitation signal.
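A codebook search of the kind recited above might look like the sketch below. The claim does not say how "best represents" is measured, so the least-squares criterion with a per-codevector optimal gain is an assumption, and the codebook values and the name `search_codebook` are hypothetical.

```python
def search_codebook(target, codebook):
    """Return (index, gain, error) for the codevector that, scaled by its
    least-squares optimal gain, is closest to `target`."""
    best = None
    for index, cv in enumerate(codebook):
        energy = sum(c * c for c in cv)
        if energy == 0.0:
            continue  # an all-zero codevector cannot be scaled to fit
        corr = sum(t * c for t, c in zip(target, cv))
        gain = corr / energy  # minimizes sum((t - gain * c)^2)
        error = sum((t - gain * c) ** 2 for t, c in zip(target, cv))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Hypothetical 4-entry codebook for a 4-sample subframe.
CODEBOOK = [
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0, 0.0),
    (1.0, -1.0, 1.0, -1.0),
]

index, gain, error = search_codebook((2.0, 2.0, 0.0, 0.0), CODEBOOK)
```

The selected index and a quantized version of the gain are what would be coded for the subframe.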
4. The system of claim 3, further comprising:
means for transmitting the coded speech;
a decoder for receiving and processing the coded speech, the decoder including:
means for retrieving the vector quantized pitch lag, the pitch prediction coefficient, and the codevector and gain; and
means for reverse quantizing the retrieved vector quantized pitch lag, the pitch prediction coefficient, and the codevector and gain to produce synthesized speech.
5. A system for coding speech, the speech being represented as plural speech samples segregated into a frame, the frame being formed of a plurality of subframes, wherein linear predictive coding (LPC) analysis and quantization of the speech samples in the frame are performed to determine an LPC residual signal r(n), the system comprising:
means for estimating an open-loop pitch lag value Lagop based on the LPC residual signal for the frame of speech;
means for generating a pitch prediction vector RLag representing speech samples of a first subframe within the frame, including:
means for constructing an LPC residual signal vector
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13
14. A method of coding input speech using pitch lag information, the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples, wherein the current LPC residual sample is determined in the time domain according to a linear combination of past LPC residual samples, further wherein the input speech has a pitch lag which falls within a minimum and maximum range of pitch lag values, the method comprising the steps of:
processing the input speech;
segregating N samples of the input speech into a frame, dividing the frame into a plurality of subframes, determining the LPC residual signal for each frame;
lag means for estimating an unquantized pitch lag value within the minimum and maximum range of pitch lags for each subframe within the frame based upon the LPC residual signal for the frame;
obtaining a pitch lag vector comprising the unquantized pitch lag values for each subframe within the frame;
generating a quantized pitch lag vector;
determining a pitch contribution vector for a current subframe, the pitch contribution vector being adapted to the quantized pitch lag vector;
generating an excitation signal representative of the speech samples of the current subframe; and
applying the excitation signal of each current subframe to subsequent subframes to provide coded speech for the frame.
Dependent claims: 15, 16
estimating an open-loop pitch lag value based on the LPC residual signal for the frame of speech;
generating an excitation vector representing speech samples of a first current subframe within the frame, including:
constructing an LPC residual signal vector, filtering the signal vector to produce a target signal, and considering a pitch lag value within the predetermined minimum and maximum pitch lag range, such that the excitation vector is obtained according to a previous LPC residual signal and the considered pitch lag value; and
filtering the excitation vector to obtain a pitch prediction vector, wherein the unquantized pitch lag value is estimated according to the pitch prediction vector and the target signal.
16. The method of claim 14, further comprising:
transmitting the coded speech;
decoding the coded speech, including the steps of:
receiving and processing the coded speech, retrieving the vector quantized pitch lag and the pitch prediction coefficient, and reverse quantizing the retrieved vector quantized pitch lag and the pitch prediction coefficient to produce synthesized speech.
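The decoding steps can be illustrated with a minimal sketch that reconstructs only the excitation for one frame: the lag index is looked up in the lag codebook, and each subframe's excitation is formed from past excitation at the decoded lag plus a scaled codevector, then appended to the history so that later subframes can draw on it. The parameter layout, the two small codebooks, and the name `decode_frame` are assumptions for the example, and the LPC synthesis filtering that would turn excitation into speech is omitted.

```python
def decode_frame(history, lag_index, subframe_params, lag_codebook, fixed_codebook, n):
    """Rebuild the excitation of one frame, n samples per subframe.

    subframe_params holds one (pitch_gain, codevector_index, codevector_gain)
    tuple per subframe (an assumed layout).
    """
    lags = lag_codebook[lag_index]  # reverse quantization by table lookup
    history = list(history)         # work on a copy of the past excitation
    frame = []
    for lag, (pitch_gain, cv_index, cv_gain) in zip(lags, subframe_params):
        start = len(history) - lag
        adaptive = [history[start + (i % lag)] for i in range(n)]
        fixed = fixed_codebook[cv_index]
        excitation = [pitch_gain * a + cv_gain * f for a, f in zip(adaptive, fixed)]
        history.extend(excitation)  # current subframe feeds subsequent ones
        frame.extend(excitation)
    return frame


# Hypothetical tables: two lag vectors and two codevectors, two subframes of 2 samples.
LAG_CODEBOOK = [(2, 2), (4, 4)]
FIXED_CODEBOOK = [(1.0, 0.0), (0.0, 1.0)]

frame = decode_frame(
    history=[0.0, 0.0, 0.0, 1.0],
    lag_index=1,
    subframe_params=[(0.5, 0, 1.0), (1.0, 1, 2.0)],
    lag_codebook=LAG_CODEBOOK,
    fixed_codebook=FIXED_CODEBOOK,
    n=2,
)
```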
Specification