Frequency domain interpolative speech codec system

US 6,418,408 B1
Filed: 04/04/2000
Issued: 07/09/2002
Est. Priority Date: 04/05/1999
Status: Expired due to Term

Abstract

Encoding of prototype waveform components, applicable to GeoMobile and Telephony Earth Station (TES), provides improved voice quality and enables a dual-channel mode of operation that permits more users to communicate over the same physical channel. A prototype waveform (PW) gain is vector quantized using a vector quantizer (VQ) whose codebook is explicitly populated with representative steady state and transient vectors of PW gain, so that abrupt variations in speech level during onsets and other non-stationary events are tracked while the accuracy of the speech level is maintained during stationary conditions. The rapidly evolving waveform (REW) and slowly evolving waveform (SEW) component vectors are converted to magnitude-phase form. The variable dimension SEW magnitude vector is quantized using a hierarchical approach, i.e., a fixed dimension SEW mean vector computed by sub-band averaging of the SEW magnitude spectrum, and only the REW magnitude is explicitly encoded. The REW magnitude vector sequence is normalized to unity RMS value, resulting in a REW magnitude shape vector and a REW gain vector. The normalized REW magnitude vectors are modeled by a multi-band sub-band model which converts the variable dimension REW magnitude shape vectors to fixed dimension, e.g., six dimensional, REW sub-band vectors. The sub-band vectors are averaged over time, resulting in a single average REW sub-band vector for each frame. At the decoder, the full-dimension REW magnitude shape vector is obtained from the REW sub-band vector by a piecewise-constant construction. The REW phase vector is regenerated at the decoder based on the received REW gain vector and the voicing measure, which determines a weighted mixture of the SEW component and random noise that is passed through a high pass filter to generate the REW component. The high pass filter poles are adjusted based on the voicing measure to control the REW component characteristics. At the output of the filter, the magnitude of the REW component is scaled to match the received REW magnitude vector.
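
The REW magnitude path described in the abstract (unit-RMS normalization, sub-band averaging, per-frame time averaging, and piecewise-constant reconstruction at the decoder) can be sketched roughly as follows. This is an illustration only, not the patented implementation: the uniform band split, the use of a plain arithmetic mean within each band, and all function names are assumptions made for the example. Only the six-band dimension comes from the abstract.

```python
import math

NUM_BANDS = 6  # "six dimensional" per the abstract


def band_edges(num_harmonics, num_bands=NUM_BANDS):
    """Contiguous band boundaries over harmonic indices (uniform split is assumed)."""
    return [(b * num_harmonics) // num_bands for b in range(num_bands + 1)]


def unit_rms(vec):
    """Return (shape, gain): shape has RMS 1, gain is the original RMS value."""
    gain = math.sqrt(sum(v * v for v in vec) / len(vec))
    if gain == 0.0:
        return list(vec), 0.0
    return [v / gain for v in vec], gain


def to_subbands(shape, num_bands=NUM_BANDS):
    """Map a variable-dimension shape vector to a fixed-dimension sub-band vector."""
    edges = band_edges(len(shape), num_bands)
    return [sum(shape[lo:hi]) / (hi - lo) for lo, hi in zip(edges, edges[1:])]


def frame_average(subband_vectors):
    """Average the sub-band vectors of one frame over time (element-wise mean)."""
    n = len(subband_vectors)
    return [sum(col) / n for col in zip(*subband_vectors)]


def piecewise_constant(subband, num_harmonics):
    """Decoder side: expand a sub-band vector back to full harmonic dimension."""
    edges = band_edges(num_harmonics, len(subband))
    out = []
    for value, lo, hi in zip(subband, edges, edges[1:]):
        out.extend([value] * (hi - lo))
    return out


# Example: one frame whose sub-intervals have different harmonic counts
# (the dimension varies with pitch). The magnitudes are made-up numbers.
frame = [[0.3 + 0.01 * k for k in range(18)], [0.5 - 0.01 * k for k in range(24)]]
shapes, gains = zip(*(unit_rms(v) for v in frame))
avg_subband = frame_average([to_subbands(s) for s in shapes])
decoded_shape = piecewise_constant(avg_subband, 24)
print(len(avg_subband), len(decoded_shape))
```

The sketch assumes at least six harmonics per vector so that every band is non-empty. The REW gain values would be quantized and transmitted separately.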

13 Claims

1. A frequency domain interpolative coding system for low bit-rate coding of speech signals, comprising:
  a linear prediction (LP) front end responsive to an input signal providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal;
  
  an open loop pitch estimator responsive to said LP residual signal, a pitch quantizer, and a pitch interpolator yielding a pitch contour within the predetermined interval;
  
  a signal processor responsive to said LP residual signal and the pitch contour for extracting a prototype waveform (PW) for a number of equal subintervals within the predetermined interval;
  
  said signal processor computing a PW gain for generating a normalized PW for each sub-interval and a PW gain vector for the predetermined interval;
  
  a low pass filter and a decimator for the PW gain sequence, yielding a decimated PW gain vector;
  
  vector quantizer (VQ) operating on the decimated PW gain vector using a codebook comprising a section representative of steady state gain inputs and a section representative of transient gain inputs.
2. A system as recited in claim 1, wherein said signal processor provides error concealment for the PW gain in the coding system by decaying an average measure of PW gain obtained from two or more predetermined intervals, and increasing the rate of decay with the number of erased frames.
4. A system as recited in claim 2, comprising:
5. A system as recited in claim 4, comprising:
  a low-pass filter for extracting a slowly evolving waveform (SEW) from the prototype waveform along each pitch harmonic track;
  
  a high-pass filter for extracting a rapidly evolving waveform (REW) from the prototype waveform along each pitch harmonic track; and
  
  vector quantizer for quantizing the SEW spectral magnitude vector using a mean-gain-shape method.
6. A system as recited in claim 5, comprising a vector quantizer for quantizing the REW spectral magnitude vector using a gain sub-band averaged shape method.
7. A system as recited in claim 6, wherein the SEW phase component is reconstructed at the decoder for every sub-interval of a predetermined interval based on the received voicing measure, pitch contour, SEW and REW magnitudes.
8. A system as recited in claim 7, wherein the REW phase component is reconstructed at the decoder for every sub-interval as the phase of the complex output of an adaptive filter, driven by a weighted combination of the complex SEW signal and a complex random noise process with the same energy as the SEW.
9. A system as recited in claim 8, wherein said decoder generates an excitation signal derived by conversion to time-domain of the gain scaled sum of the reconstructed SEW and REW components;
  and wherein said signal processor reconstructs speech as the output of the adaptively bandwidth broadened LP synthesis filter, driven by said excitation signal, further comprising a filter for postfiltering the reconstructed speech using a global pole-zero postfilter, whose parameters are derived from adaptively bandwidth broadened LP synthesis filter parameters.
10. A system as recited in claim 9, wherein said decoder generates an error concealment mechanism for the line spectral frequency (LSF) parameters based on replacing the errored parameters by ones generated using a higher value for the fixed prediction coefficient in the predictive inverse-VQ;
  and provides an error recovery mechanism whereby the LSF parameters of the previous frame are also replaced by an average of the parameters of the current frame and parameters from two frames ago, so that the LSF parameters evolve smoothly.
11. A system as recited in claim 10, wherein said decoder generates an error concealment mechanism for the open loop pitch parameter based on repetition of the pitch value of the previous frame;
  and provides an error recovery mechanism based on either repetition or averaging to obtain the pitch value of the previous frame, depending on the number of bad frames that have elapsed.
12. A system as recited in claim 11, wherein said decoder generates an error concealment for the PW gain in the coding system by decaying an average measure of PW gain obtained from two or more predetermined intervals, and increasing the rate of decay with the number of erased frames;
  and provides an error recovery mechanism.
13. A system as recited in claim 12, wherein said decoder provides an error concealment mechanism for the VAD likelihood measure by setting the VAD flag for the most recently received frames to indicate active speech, thereby reducing the degree of adaptive bandwidth broadening.
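
Two of the concealment ideas recited in the dependent claims lend themselves to a small sketch: the PW gain decay of claims 2 and 12, and the LSF recovery averaging of claim 10. The code below is a hedged illustration only. The linear-domain scalar gain per frame, the decay schedule (`base_decay`, `decay_step`, `min_decay`), and the function names are all assumptions. The claims state only that an average gain over two or more intervals is decayed faster as more frames are erased, and that the previous frame's LSFs are replaced by the average of the current frame and the frame two frames ago.

```python
def conceal_pw_gain(recent_gains, erased_count,
                    base_decay=0.9, decay_step=0.1, min_decay=0.3):
    """Substitute gain for the erased_count-th consecutive erased frame.

    recent_gains: gains from two or more previously received intervals.
    The per-frame decay factor shrinks as more frames are erased, so the
    rate of decay increases with the erasure count (assumed schedule).
    """
    if len(recent_gains) < 2:
        raise ValueError("need gains from at least two intervals")
    gain = sum(recent_gains) / len(recent_gains)
    for k in range(erased_count):
        gain *= max(min_decay, base_decay - decay_step * k)
    return gain


def recover_previous_lsf(lsf_two_frames_ago, lsf_current):
    """After an erasure, rebuild the previous frame's LSFs as the average
    of the frame before it and the newly received current frame."""
    if len(lsf_two_frames_ago) != len(lsf_current):
        raise ValueError("LSF vectors must have the same order")
    return [0.5 * (a + b) for a, b in zip(lsf_two_frames_ago, lsf_current)]


# Example with made-up values: three consecutive erased frames.
history = [1.0, 3.0]
for n in (1, 2, 3):
    print(n, conceal_pw_gain(history, n))
```

Because the average of two ascending LSF vectors is itself ascending, the recovered vector keeps the ordering that LSF-based synthesis filters rely on.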

3. A frequency domain interpolative coding system for low bit-rate coding of speech signals, comprising:
  a linear prediction (LP) front end responsive to an input signal providing parameters which are quantized using a backward adaptive predictive multi-stage VQ for each predetermined interval and used to compute a LP residual signal;
  
  an open loop pitch estimator responsive to said LP residual signal, a pitch quantizer, and a pitch interpolator yielding a pitch contour within the predetermined interval;
  
  a signal processor responsive to said LP residual signal and the pitch contour for extracting a prototype waveform (PW) for a number of equal sub-intervals within the predetermined interval;
  
  signal processor computing a PW gain for generating a normalized PW for each sub-interval and a PW gain vector for the predetermined interval;
  
  a low pass filter and a decimator for the PW gain sequence, yielding a decimated PW gain vector;
  
  a vector quantizer (VQ) operating on the decimated PW gain vector using a codebook comprising a section representative of steady state gain inputs and a section representative of transient gain inputs.
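
The gain path shared by the two independent claims (a PW gain per sub-interval, low-pass filtering, decimation, and a VQ search over a codebook with steady state and transient sections) might look like the sketch below. None of the numeric choices come from the patent: the RMS gain definition, the 3-tap smoothing filter, the decimation factor of 2, the squared-error search, and the tiny codebook are assumptions chosen to keep the example short.

```python
import math


def pw_gain(pw):
    """Gain of one prototype waveform, taken here as its RMS value (assumed)."""
    return math.sqrt(sum(x * x for x in pw) / len(pw))


def normalize_pw(pw):
    """Return (normalized PW, gain) for one sub-interval."""
    g = pw_gain(pw)
    return ([x / g for x in pw] if g > 0.0 else list(pw)), g


def lowpass(seq, taps=(0.25, 0.5, 0.25)):
    """Symmetric FIR smoothing with edge replication (filter is an assumption)."""
    n, half = len(seq), len(taps) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j, t in enumerate(taps):
            k = min(max(i + j - half, 0), n - 1)  # clamp index at the edges
            acc += t * seq[k]
        out.append(acc)
    return out


def decimate(seq, factor=2):
    """Keep every factor-th sample, ending on the last sub-interval."""
    return seq[factor - 1::factor]


def quantize_gain(vec, codebook, num_steady):
    """Nearest-neighbour search over one codebook whose first num_steady
    entries form the steady state section and the rest the transient section."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vec, codebook[i]))
    best = min(range(len(codebook)), key=dist)
    return best, ("steady" if best < num_steady else "transient")


# Hypothetical 4-dimensional codebook: two steady entries, two transient.
CODEBOOK = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0],
            [0.1, 0.5, 1.5, 2.0], [2.0, 1.5, 0.5, 0.1]]

# Gain sequence over 8 sub-intervals of one frame (a speech onset).
onset = [0.1, 0.1, 0.2, 0.6, 1.2, 1.8, 2.0, 2.0]
print(quantize_gain(decimate(lowpass(onset)), CODEBOOK, num_steady=2))
```

Keeping transient shapes in their own section means an onset is not forced onto a flat steady state codevector, which is the motivation the abstract gives for the split codebook.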

Current Assignee: Hughes Network Systems LLC (Echostar Corporation)
Original Assignee: Hughes Electronics Corporation (Rtx Corporation)
Inventors: Swaminathan, Kumar; Zakaria, Gaguk; Udaya Bhaskar, Bangalore R.; Nandkumar, Srinivas
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): McFadden, Susan Iris
Application Number: US09/542,792
Time in Patent Office: 826 Days
Field of Search: 704/219, 704/220, 704/221, 704/222, 704/223, 704/224, 704/225
US Class Current: 704/219
CPC Class Codes:
G10L 19/005   Correction of errors induce...
G10L 19/02   using spectral analysis, e....
G10L 19/0204   using subband decomposition
G10L 19/04   using predictive techniques
G10L 19/083   the excitation function bei...
G10L 19/09   Long term prediction, i.e. ...
G10L 19/18   Vocoders using multiple modes
G10L 2019/0012   Smoothing of parameters of ...
G10L 2025/783   based on threshold decision
G10L 25/27   characterised by the analys...
G10L 25/30   using neural networks
G10L 25/78   Detection of presence or ab...
G10L 25/90   Pitch determination of spee...
