×

Vector adaptive predictive coder for speech and audio

  • US 4,969,192 A
  • Filed: 04/06/1987
  • Issued: 11/06/1990
  • Est. Priority Date: 04/06/1987
  • Status: Expired due to Term
First Claim
Patent Images

1. An improvement in the method for compressing digitally encoded input speech or audio vectors at a transmitter by using a scaling unit controlled by a quantized residual gain factor QG, a synthesis filter controlled by a set of quantized linear protective coefficient parameters QLPC, a pitch predictor controlled by pitch and pitch predictor parameters QP and QPP, a weighting filter controlled by a set of perceptual weighting parameters W, and a permanent indexed codebook containing a predetermined number M of codebook vectors, each having an assigned codebook index, to find an index which identifies the best match between an input speech or audio vector sn that is to be coded and a synthesized vector sn generated from a stored vector in said indexed codebook, wherein each of said digitally encoded input vectors consists of a predetermined number K of digitally coded samples, comprising the steps ofbuffering and grouping said input speech or audio vectors into frames of vectors with a predetermined number N of vectors in each frame,performing an initial analysis for each successive frame, said analysis including the computation of a residual gain factor G, a set of perceptual weighting parameters W, a pitch parameter P, a pitch predictor parameter PP, and a set of said linear predictive coefficient parameters LPC, and the computation of quantized values QG, QP, QPP and QLPC of parameters G, P, PP and LPC using one or more indexed quantizing tables for the computation of each quantized parameter or set of parametersfor each frame transmitting indices of said quantized parameters QG, QP, QPP and QLPC determined in the initial analysis step as side information about vectors analyzed for later use in looking up in one or more identical tables said quantized parameters QG, QP QPP and QLPC while reconstructing speech and audio vectors from encoded vectors in a frame, where each index for a quantized parameter points to a location in one or more of said identical tables where said quantized parameter may be found,computing a zero-state response vector from the vector output of a zero-input response filter comprising a scaling unit, synthesis filter and weighting filter identical in operation to said scaling unit, synthesis filter and weighting filter used for encoding said input vectors, said zero-state response vector being computed for each vector in said permanent codebook by first setting to zero the initial condition of said zero-input response filter so that the response computed is not influenced by a preceding one of said codebook vectors processed by said zero-input response filter, and the using said quanitized values of said residual gain factor, set of linear predictive coefficient parameters, and said set of perceptual weighting parameters computed in said initial analysis step by processing each vector in said permanent codebook through said zero-input response filter to compute a zero-state response vector, and storing each zero-state response vector computed in a zero-state response codebook at or together with an index corresponding to the index of said vector in said permanent codebook used for this zero-state response computation step, andafter thus performing an initial analysis of and computing a zero-state response codebook for each successive frame of input speech or audio vectors, encode each input vector sn of a frame in sequence by transmitting the codebook index of the vector in said permanent codebook which corresponds to the index of a zero-state response vector in said zero-state response codebook that best matches a vector vn obtained from an input vector sn bysubtracting a long term pitch prediction vector sn from the input vector sn to produce a difference vector dn and filtering said difference vector dn by said perceptual weighting filter to produce a final input vector fn, where said long term pitch prediction sn is computed by taking a vector from said permanent codebook at the address specified by the preceding particular index transmitted as a compressed vector code and performing gain scaling of this vector using said quantized gain factor QG, then synthesis filtering the vector obtained from said scaling using said quantized values QLPC of said set of linear predictive coefficient parameters to obtain a vector dn and from vector dn producing a long term pitch predicted vector sn of the next input vector sn through a pitch synthesis filter using said quantized values of pitch predictor parameters QP and QPP, said long term prediction vector sn being a prediction of the next input vector sn, andproducing said vector vn by subtracting from said final input vector fn the vector output of said zero-input response filter generated in response to a permanent codebook vector at the codebook address of the last transmitted index code, said vector output being generated by processing through said zero input response filter, said permanent codebook vector located at said last transmitted index code where the output of said zero input response filter is discarded while said permanent codebook vector located at said last transmitted index code is being processed sample by sample in sequence into said zero input response filter until all samples of said codebook vector have been entered, and where the input of said zero input response filter is interrupted after all samples of said codebook vector have been entered and then the desired vector output from said zero-input response filter is processed out sample by sample for subtraction from said final vector vn, andfor each input vector sn in a frame, finding the vector stored in said zero-state response codebook which best matches the vector vn, thereby finding the best match of a codebook vector with an input vector, using an estimate vector sn produced from the best match codebook vector found for the preceding input vector,having found the best match of said vector vn with a zero-state response vector in said zero-state response codebook for an input speech or audio vector sn, transmit the zero-state response codebook index of the current best-match zero-state response vector as a compressed vector code of the current input vector, and also use said index of the current best-match zero-state response vector to select a vector from said permanent codebook for computing said long term pitch predicted input vector sn to be subtracted from the next input vector sn of the frame.

View all claims
  • 2 Assignments
Timeline View
Assignment View
    ×
    ×