Prototype waveform phase modeling for a frequency domain interpolative speech codec system
First Claim
1. A frequency domain interpolative CODEC system for low bit rate coding of speech, comprising:
a linear prediction (LP) front end adapted to process an input signal providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal;
an open loop pitch estimator adapted to process said LP residual signal, a pitch quantizer, and a pitch interpolator and provide a pitch contour within the predetermined intervals; and
a signal processor responsive to said LP residual signal and the pitch contour and adapted to perform the following;
provide a voicing measure, said voicing measure characterizing a degree of voicing of said input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals;
extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals;
normalize the PW by a gain value of said PW;
encode a magnitude of said PW; and
reconstruct a nonstationarity component of a PW phase at a decoder every subinterval using only a received PW magnitude, a stationary component of said PW, said voicing measure, a PW subband nonstationarity measure and a pitch frequency contour information;
wherein a ratio is computed comparing the ratio of the energy of the nonstationarity component of the PW to that of the stationary component of the PW which is averaged over five PW subbands.
Abstract
A system and method are provided that employ a frequency domain interpolative CODEC system for low bit rate coding of speech. The system comprises a linear prediction (LP) front end adapted to process an input signal and provide LP parameters, which are quantized and encoded over predetermined intervals and used to compute an LP residual signal. Also provided are an open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator, which together provide a pitch contour within the predetermined intervals. Also provided is a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following: provide a voicing measure, where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals; extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and separate stationary and nonstationary components of the PW using a low complexity alignment process and a filtering process that introduce no delay. The ratio of the energy of the nonstationary component of the PW to that of the stationary component of the PW is averaged across five subbands to compute the nonstationarity measure as a frequency dependent vector entity. A measure of the degree of voicing of the residual is also computed using open loop pitch gain, pitch variance, relative signal power, PW correlation and PW nonstationarity in low frequency subbands. The nonstationarity measure and voicing measure are encoded using a 6-bit spectrally weighted vector quantization scheme using a codebook partitioned based on a voiced/unvoiced decision.
At the decoder, a stationary component of PW is reconstructed as a weighted combination of the previous PW phase vector, a random phase perturbation and a fixed phase vector obtained from a voiced pitch pulse.
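The subband nonstationarity measure described in the abstract can be illustrated with a minimal sketch. It assumes the PW has already been split into stationary and nonstationary harmonic coefficients, and it takes one possible reading of the abstract: within each subband, the nonstationary energy is divided by the stationary energy. The function name, the band edges in `BAND_EDGES_HZ`, and the small `eps` guard are hypothetical choices for illustration and do not come from the patent text.

```python
# Hypothetical subband edges in Hz. The source only states that five
# subbands are used; these particular edges are an assumption.
BAND_EDGES_HZ = [0.0, 400.0, 800.0, 1600.0, 2400.0, 4000.0]


def subband_nonstationarity(stationary, nonstationary, pitch_hz,
                            edges=BAND_EDGES_HZ, eps=1e-12):
    """Return one nonstationary/stationary energy ratio per subband.

    stationary, nonstationary: sequences of complex harmonic coefficients,
    where index k holds harmonic k + 1, located at (k + 1) * pitch_hz.
    """
    if len(stationary) != len(nonstationary):
        raise ValueError("component vectors must have the same length")

    n_bands = len(edges) - 1
    ns_energy = [0.0] * n_bands  # nonstationary energy per subband
    st_energy = [0.0] * n_bands  # stationary energy per subband

    for k, (s, n) in enumerate(zip(stationary, nonstationary)):
        freq = (k + 1) * pitch_hz
        for b in range(n_bands):
            if edges[b] <= freq < edges[b + 1]:
                ns_energy[b] += abs(n) ** 2
                st_energy[b] += abs(s) ** 2
                break  # each harmonic belongs to at most one subband

    # eps avoids division by zero when a subband contains no harmonics.
    return [ns_energy[b] / (st_energy[b] + eps) for b in range(n_bands)]


# Usage: 19 harmonics of a 200 Hz pitch, with the nonstationary part at
# half the amplitude of the stationary part in every harmonic.
measure = subband_nonstationarity([1 + 0j] * 19, [0.5 + 0j] * 19, 200.0)
print(measure)
```

The result is a five-element vector, which matches the abstract's description of the measure as a frequency dependent vector entity. A low ratio in a subband would indicate strongly periodic (voiced) content there, and a high ratio would indicate noise-like content.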
21 Claims
said PW subband nonstationarity measure.
9. A system as recited in claim 8, wherein a rate of randomization of a random phase perturbation of said PW is controlled by a pitch frequency contour.
10. A system as recited in claim 9, wherein a range of said random phase perturbation is controlled by said received voicing measure and said PW subband nonstationarity measure.
11. A system as recited in claim 10, wherein said reconstructed stationary component of said PW magnitude and PW phase model is further processed every subinterval.
12. A system as recited in claim 11, wherein said further processing further comprises:
low pass filtering said reconstructed stationary component to reduce excessive variations and to extract a stationary component of the PW; and
preserving the PW magnitude after said filtering process.
13. A frequency domain interpolative CODEC system for low bit rate coding of speech, comprising:
a linear prediction (LP) front end adapted to process an input signal providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal;
an open loop pitch estimator adapted to process said LP residual signal, a pitch quantizer, and a pitch interpolator and provide a pitch contour within the predetermined intervals;
a signal processor responsive to said LP residual signal and the pitch contour and adapted to perform the following;
provide a voicing measure, said voicing measure characterizing a degree of voicing of said input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals;
extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals;
normalize the PW by a gain value of said PW;
encode a magnitude of said PW; and
reconstruct a nonstationarity component of a PW phase at a decoder every subinterval using only a received PW magnitude, a stationary component of said PW, said voicing measure, a PW subband nonstationarity measure and a pitch frequency contour information;
wherein a ratio is computed comparing the ratio of the energy of the nonstationarity component of the PW to that of the stationary component of the PW which is averaged over five PW subbands.
Specification