Voicing measure as an estimate of signal periodicity for a frequency domain interpolative speech codec system
Abstract
A system determines a voicing measure that indicates the degree of signal periodicity, and uses that measure both to quantize the spectral magnitude of the slowly evolving waveform (SEW) and to model the phase spectra of the SEW and the rapidly evolving waveform (REW).
12 Claims
-
1. A frequency domain interpolative coding system for low bit-rate coding of speech signals, comprising:
-
a linear prediction (LP) front end, responsive to an input signal, providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual;
an open loop pitch estimator, responsive to said LP residual signal, a pitch quantizer, and a pitch interpolator yielding a pitch contour within the predetermined interval;
a voice activity detector (VAD) mechanism responsive to said LP parameters and open loop pitch, generating a VAD flag for every predetermined interval;
a signal processor responsive to said LP residual signal and the pitch contour for extracting a prototype waveform (PW) for a number of equal sub-intervals within the predetermined interval; and
said signal processor computing a PW gain for generating a normalized PW for each sub-interval and a PW gain vector for the predetermined interval;
a separation of the normalized PW into a slowly evolving waveform (SEW) component and a rapidly evolving waveform (REW) component using a low-pass filter along every pitch harmonic track;
a representation of one or more of the components of the normalized PW in spectral magnitude-phase form; and
a characterization of the degree of periodicity of the input signal by a voicing measure, derived from certain parameters that are correlated to signal periodicity and computed from the input signal, PW, SEW and REW over the predetermined interval.
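As an illustration of the gain normalization and SEW/REW separation recited in claim 1, the following is a minimal sketch and not the patented implementation. It assumes each prototype waveform is held as a list of complex harmonic coefficients, that the number of harmonics is the same in every sub-interval (in practice it varies with pitch), that the PW gain is the RMS magnitude across harmonics, and that the low-pass filter is a short symmetric FIR kernel. The names `normalize_pw`, `split_sew_rew` and `taps` are hypothetical.

```python
import math

def normalize_pw(pw):
    """Return (gain, normalized PW) for one sub-interval.

    pw is a list of complex harmonic coefficients. The gain is taken to be
    the RMS magnitude across harmonics (an assumption of this sketch).
    """
    gain = math.sqrt(sum(abs(c) ** 2 for c in pw) / len(pw))
    if gain == 0.0:
        return 0.0, list(pw)
    return gain, [c / gain for c in pw]

def split_sew_rew(pw_frames, taps):
    """Low-pass filter every harmonic track over time to obtain the SEW.

    pw_frames[m][k] is harmonic k of the normalized PW at sub-interval m.
    taps is a symmetric, odd-length FIR low-pass kernel that sums to 1.
    Edges are handled by repeating the first/last frame (an assumption).
    The REW is the remainder: REW = PW - SEW.
    """
    n_frames = len(pw_frames)
    n_harm = len(pw_frames[0])
    half = len(taps) // 2
    sew = [[0j] * n_harm for _ in range(n_frames)]
    rew = [[0j] * n_harm for _ in range(n_frames)]
    for k in range(n_harm):            # one track per pitch harmonic
        for m in range(n_frames):      # filter along the time axis
            acc = 0j
            for j, h in enumerate(taps):
                idx = min(max(m + j - half, 0), n_frames - 1)
                acc += h * pw_frames[idx][k]
            sew[m][k] = acc
            rew[m][k] = pw_frames[m][k] - acc
    return sew, rew

# Harmonic 0 is constant over time (fully "slow"); harmonic 1 alternates in sign.
frames = [[1 + 0j, 1j], [1 + 0j, -1j], [1 + 0j, 1j], [1 + 0j, -1j]]
sew, rew = split_sew_rew(frames, [0.25, 0.5, 0.25])
```

In this toy input the constant harmonic ends up entirely in the SEW, while the alternating harmonic is largely rejected by the low-pass kernel and lands in the REW.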
a neural network configured to determine the voicing measure with its input as the set of parameters which exhibit correlation to the degree of periodicity of the input signal.
-
-
5. A system as recited in claim 4, wherein a set of the neural network input parameters for voicing measure determination comprises the SEW variance, root-mean square value of SEW, and open loop pitch gain.
-
6. A system as recited in claim 4, wherein an auxiliary set of the neural network input parameters comprises a relative power level of the input signal, root-mean-square value of the REW component, a measure of peakiness of the prediction residual over a pitch cycle and the normalized autocorrelation coefficient of the input signal at unit lag.
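The parameter set named in claims 5 and 6 can be pictured as a feature vector fed to a small feedforward network. The sketch below is illustrative only: the claims do not give the network topology, the weights, or the polarity of the output, so the single hidden layer, the sigmoid activations, and every function and argument name here are assumptions. Peakiness is computed as the ratio of RMS to mean absolute value, a common definition that the claims do not spell out.

```python
import math

def peakiness(residual):
    """Ratio of RMS to mean absolute value; larger for pulse-like signals."""
    rms = math.sqrt(sum(x * x for x in residual) / len(residual))
    mean_abs = sum(abs(x) for x in residual) / len(residual)
    return rms / mean_abs if mean_abs > 0.0 else 0.0

def unit_lag_autocorr(signal):
    """Normalized autocorrelation coefficient of the signal at lag 1."""
    energy = sum(x * x for x in signal)
    if energy == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(signal[1:], signal[:-1])) / energy

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def voicing_measure(features, hidden_w, hidden_b, out_w, out_b):
    """Hypothetical one-hidden-layer network with an output in (0, 1).

    features: e.g. [SEW variance, SEW RMS, open loop pitch gain,
                    relative power level, REW RMS, peakiness,
                    unit-lag autocorrelation]
    hidden_w: one weight row per hidden unit; weights would come from training.
    """
    hidden = [sigmoid(sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

# Two of the auxiliary inputs computed on toy signals.
pulse_peakiness = peakiness([0.0, 0.0, 0.0, 4.0])      # single pulse per cycle
flat_autocorr = unit_lag_autocorr([1.0, 1.0, 1.0, 1.0])
```

A pulse-like residual scores a higher peakiness than a flat one, and a smooth signal has a unit-lag autocorrelation close to 1, which is why such parameters correlate with periodicity.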
-
7. A system as recited in claim 1, wherein said signal processor performs an error concealment procedure for the voicing measure to increase the robustness of the speech codec in the presence of transmission errors by computing a VAD likelihood measure based on previously received VAD flags, comprising:
-
a state machine relying on the correlation between the voicing measure and the VAD likelihood measure; and
a second state machine relying on the correlation between the root-mean-square value of SEW in a predetermined low frequency band and the voicing measure.
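One way to picture the VAD likelihood measure of claim 7 and the consistency check behind the first state machine is sketched below. This is a hedged illustration and not the claimed procedure: the history length, the threshold, the assumption that a larger voicing measure means a more periodic signal, and the names `VadLikelihood` and `conceal_voicing` are all hypothetical. The second state machine, which relies on the low-band SEW root-mean-square value, is not sketched.

```python
from collections import deque

class VadLikelihood:
    """Count of active-speech flags among the last `history` received frames."""

    def __init__(self, history=3):
        self.flags = deque([0] * history, maxlen=history)

    def update(self, vad_flag):
        self.flags.append(1 if vad_flag else 0)
        return sum(self.flags)

def conceal_voicing(voicing, vad_likelihood, periodic_threshold=0.5):
    """Hypothetical consistency check between voicing and VAD likelihood.

    Assumes voicing lies in [0, 1] with larger values meaning more periodic.
    If no recent frame was flagged as active speech, a strongly periodic
    voicing value is treated as a likely transmission error and limited.
    """
    if vad_likelihood == 0 and voicing > periodic_threshold:
        return periodic_threshold
    return voicing

# Track the likelihood over a run of received VAD flags.
tracker = VadLikelihood(history=3)
likelihoods = [tracker.update(flag) for flag in [1, 1, 0, 0, 0]]
```

The check only intervenes when the two quantities disagree, so error-free frames pass through unchanged.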
-
-
8. A frequency domain interpolative coding system for low bit-rate coding of speech signals, comprising:
-
a linear prediction (LP) front end responsive to an input signal, providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal;
an open loop pitch estimator responsive to said LP residual signal, a pitch quantizer, and a pitch interpolator yielding a pitch contour within the predetermined interval;
a signal processor responsive to said LP residual signal and the pitch contour for extracting a prototype waveform (PW) for a number of equal sub-intervals within the predetermined interval; and
said signal processor computing a PW gain for generating a normalized PW for each sub-interval and a PW gain vector for the predetermined interval;
a separation of the normalized PW into a slowly evolving waveform (SEW) component and a rapidly evolving waveform (REW) component using a low pass filter along every pitch harmonic track;
a characterization of the degree of periodicity of the input signal by a voicing measure, derived from certain parameters that are correlated to signal periodicity and computed from the input signal, PW, SEW and REW over the predetermined interval;
a representation of the SEW component in spectral magnitude-phase form and transmission of only the spectral magnitude information of the SEW component; and
a reconstruction of the SEW and REW phase components at the decoder using the received SEW and REW magnitude components, the voicing measure, and pitch frequency contour information.
-
Specification