Method and apparatus for speech compression using multi-mode code excited linear predictive coding

US 5,602,961 A
Filed: 05/31/1994
Issued: 02/11/1997
Est. Priority Date: 05/31/1994
Status: Expired due to Term

Abstract

An apparatus and method of coding speech. The apparatus includes a first circuit coupled to receive a first signal, the first signal corresponding to the speech signal. The first circuit is for generating a first set of parameters corresponding to the first frame. The apparatus includes a second circuit coupled to receive a second signal and the first set of parameters, the second signal corresponding to the speech signal; the second circuit is for generating a third signal. The apparatus further includes a pulse train analyzer, coupled to the second circuit, for generating a third match value, a third set of parameters, and a third excitation value. The apparatus further includes a fourth circuit, coupled to the second circuit, for generating a fourth match value, a fourth set of parameters, and a fourth excitation value. The apparatus further includes a fifth circuit, coupled to the third circuit and the fourth circuit, for selecting a mode corresponding to a match value. The apparatus further includes a sixth circuit, coupled to the fifth circuit, for selecting a selected set of parameters and a selected excitation corresponding to the mode. The apparatus further includes a seventh circuit, coupled to the first circuit and the sixth circuit, for generating an encoded signal responsive to the selected set of parameters and the mode.

26 Claims

1. An apparatus for processing an input signal, said input signal including a frame, said apparatus comprising:
- a first circuit coupled to receive a first signal, said first signal corresponding to said input signal, said first circuit for generating a first set of parameters corresponding to said frame;
  
  a second circuit coupled to receive said first signal and said first set of parameters, said second circuit for generating a second signal;
  
  a pulse train analyzer, coupled to said second circuit, said pulse train analyzer for generating a first match value, a second set of parameters, and a first excitation value;
  
  a fourth circuit, coupled to said second circuit, said fourth circuit for generating a second match value, a third set of parameters, and a second excitation value, said fourth circuit including an adaptive codebook and an adaptive codebook analyzer, said adaptive codebook being coupled to said adaptive codebook analyzer;
  
  a fifth circuit, coupled to said pulse train analyzer and said fourth circuit, for determining a set of admissible excitation search modes based upon a prior excitation search mode, and said fifth circuit further for selecting an excitation search mode from said set of admissible excitation search modes;
  
  a sixth circuit, coupled to said fifth circuit, for selecting a selected set of parameters and a selected excitation corresponding to said excitation search mode; and
  
  a seventh circuit, coupled to said first circuit and said sixth circuit, for generating an encoded signal responsive to said selected set of parameters and said excitation search mode.
- - 2. The apparatus of claim 1 further comprising:
    - an eighth circuit, coupled to said second circuit, said eighth circuit for generating a third match value, a fourth set of parameters, and a third excitation value, and wherein said fifth circuit is coupled to said eighth circuit.
  - 3. The apparatus of claim 2 wherein said eighth circuit further includes a stochastic codebook analyzer for generating said fourth set of parameters.
  - 4. The apparatus of claim 2 wherein said eighth circuit includes a trellis codebook analyzer for generating said fourth set of parameters.
  - 5. The apparatus of claim 2 wherein said first set of parameters includes linear prediction coefficients (LPCs) corresponding to said frame, and wherein said second circuit is coupled to receive said LPCs and is for performing ringing removal and perceptual weighting of said first signal to generate said second signal.
  - 6. The apparatus of claim 3 wherein each of said second, third, and fourth set of parameters includes an index parameter and a gain parameter.
  - 7. The apparatus of claim 4 wherein said frame includes a subframe, and wherein said second set of parameters corresponds to said subframe.
  - 8. The apparatus of claim 7 wherein said second set of parameters include a pitch parameter, an index parameter, and a phase parameter, and wherein the index parameter includes an index to a shape pulse.
  - 9. The apparatus of claim 7 wherein an index parameter of said third set of parameters includes an index to said adaptive codebook.
  - 10. The apparatus of claim 7 wherein said eighth circuit includes a short adaptive codebook.
  - 11. The apparatus of claim 7 wherein said fifth circuit is for weighting said first, second and third match values prior to selecting said excitation search mode.
  - 12. The apparatus of claim 11 wherein said first match value is weighted by an amount between 0.7-0.9, wherein said second match value is weighted by an amount between 1.1-1.3, and wherein said third match value is weighted by an amount between 0.8-1.0.
  - 13. The apparatus of claim 7 wherein said input signal includes a previous subframe, said previous subframe having said previous excitation search mode, and said fifth circuit is for selecting said excitation search mode responsive to said previous subframe.
  - 14. The apparatus of claim 7 wherein said input signal includes digitized speech.
  - 15. The apparatus of claim 7 further comprising a filter circuit coupled to receive said input signal and for generating said first signal.
  - 16. The apparatus of claim 7 further comprising a line spectrum pair circuit, being coupled to said first circuit and said seventh circuit, for generating line spectrum pair parameters from said first set of parameters, wherein said seventh circuit includes a multiplexing circuit, and wherein said seventh circuit is for multiplexing said line spectrum pair parameters with said selected set of parameters and said selected excitation.
  - 17. The apparatus of claim 2 wherein said fifth circuit is further configured to select said excitation search mode corresponding to one of said set of admissible excitation search modes requiring the least number of bits and complying with a predetermined error threshold.
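
The mode selection recited in claims 1, 11, 12 and 17 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the mode names, the transition table `ADMISSIBLE`, the error and bit-cost inputs, and the specific weights (picked from inside the ranges given in claim 12) are all hypothetical.

```python
# Hypothetical mode names; the claims only name a pulse train analyzer,
# a fourth circuit with an adaptive codebook, and an "eighth circuit".
PULSE, ACB, EIGHTH = "pulse", "acb", "eighth"

# Assumed transition table: which modes may follow a given prior mode.
# The claims do not specify the actual admissibility rules.
ADMISSIBLE = {
    None: (PULSE, ACB, EIGHTH),
    PULSE: (PULSE, ACB, EIGHTH),
    ACB: (PULSE, ACB, EIGHTH),
    EIGHTH: (ACB, EIGHTH),
}

# Example weights chosen inside the ranges stated in claim 12.
WEIGHTS = {PULSE: 0.8, ACB: 1.2, EIGHTH: 0.9}


def select_mode(match_values, prior_mode):
    """Pick the admissible mode with the largest weighted match value."""
    candidates = ADMISSIBLE[prior_mode]
    return max(candidates, key=lambda m: WEIGHTS[m] * match_values[m])


def select_cheapest_mode(errors, bit_costs, prior_mode, threshold):
    """Claim 17 variant: fewest bits among admissible modes within the threshold."""
    ok = [m for m in ADMISSIBLE[prior_mode] if errors[m] <= threshold]
    if not ok:
        return None  # no admissible mode meets the error threshold
    return min(ok, key=lambda m: bit_costs[m])
```

With match values of 12.0, 7.0 and 9.0 for the three modes, the weighted scores are roughly 9.6, 8.4 and 8.1, so the pulse mode wins whenever the assumed table admits it; when it does not, the adaptive codebook mode is chosen instead.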

18. A multi-mode linear predictive coder for processing digital speech signals, said digital speech signals being partitioned into frames of a first predetermined length, where each frame is partitioned into subframes of a second predetermined length, said coder comprising:
- a short-term prediction analyzer responsive to said digital speech signals, said short-term prediction analyzer for generating linear prediction parameters and line spectrum parameters;
  
  a variable rate encoder, coupled to said short-term prediction analyzer, for coding differences of said line spectrum parameters by a predetermined variable rate code;
  
  a ringing removal and perceptual weighting circuit for ringing removal and perceptual weighting said digital speech signals to produce predistorted speech vectors for successive subframes;
  
  a multi-mode excitation analyzer, coupled to said ringing removal and perceptual weighting circuit, for generating a set of excitations, a set of match values, and a set of parameters, each excitation in said set of excitations corresponding to a maximal value of a match function in said set of match values;
  
  a pause analyzer, responsive to said digital speech signals, for pause detecting and producing a pause mode signal;
  
  a comparator and controller, coupled to said multi-mode excitation analyzer and said pause analyzer, for weighting and comparing said match function values for each of a plurality of excitation search modes, and for generating a current excitation search mode corresponding to one of said plurality of excitation search modes with a maximal weighted match function value;
  
  a selector of parameters, coupled to said multi-mode excitation analyzer, for generating selected parameters from said set of parameters corresponding to said current excitation search mode; and
  
  a selector of excitations, coupled to said multi-mode excitation analyzer, for selecting a current excitation from said set of excitations corresponding to said current excitation search mode.
- - 19. The multi-mode linear predictive coder as recited in claim 18, wherein said multi-mode excitation analyzer further comprises:
    - an adaptive codebook (ACB) analyzer, coupled to said ringing removal and perceptual weighting circuit, for generating an ACB excitation, an ACB match function and ACB parameters for each subframe in said frame;
      
      a pulse train analyzer, coupled to said ringing removal and perceptual weighting circuit, for generating a pulse excitation, a pulse match function and pulse parameters;
      
      a shortened adaptive codebook (SACB) analyzer, coupled to said ringing removal and perceptual weighting circuit, for generating a SACB codebook excitation and SACB parameters; and
      
      a stochastic analyzer, coupled to said ringing removal and perceptual weighting circuit, said stochastic analyzer for generating a stochastic gain, a stochastic codeword index, a stochastic excitation, and a stochastic match function, said stochastic excitation corresponding to said SACB excitation.
  - 20. The multi-mode linear predictive coder of claim 19 wherein said stochastic analyzer is a trellis analyzer, and wherein said stochastic gain is a trellis gain, said stochastic codeword index is a trellis codeword index, said stochastic excitation is a trellis excitation, and said stochastic match function is a trellis match function.
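
The per-subframe control flow of claim 18 (partition a frame into subframes, detect pauses, otherwise pick the mode with the maximal weighted match value) might look like the sketch below. The energy-based pause test, the per-subframe granularity of that test, the `"pause"` label, and the caller-supplied `analyze` function are assumptions made for illustration; the claim does not say how pauses are detected.

```python
def split_subframes(frame, subframe_len):
    """Partition one frame into consecutive subframes of equal length."""
    if len(frame) % subframe_len != 0:
        raise ValueError("frame length must be a multiple of subframe length")
    return [frame[i:i + subframe_len] for i in range(0, len(frame), subframe_len)]


def is_pause(subframe, energy_threshold):
    """Assumed pause test: mean sample energy below a fixed threshold."""
    energy = sum(x * x for x in subframe) / len(subframe)
    return energy < energy_threshold


def choose_modes(frame, subframe_len, analyze, weights, energy_threshold):
    """Return one mode name per subframe.

    `analyze(subframe)` is a hypothetical stand-in for the multi-mode
    excitation analyzer; it must return a dict of mode name -> match value.
    """
    modes = []
    for sub in split_subframes(frame, subframe_len):
        if is_pause(sub, energy_threshold):
            modes.append("pause")  # pause mode bypasses the comparison
            continue
        match = analyze(sub)
        # Comparator: maximal weighted match function value wins.
        modes.append(max(match, key=lambda m: weights[m] * match[m]))
    return modes
```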

21. A method of selecting encoding parameters, said method for use in a speech synthesizer to improve the subjective speech quality, said method comprising the steps of:
- constructing a pulse based upon the time inversion of a pulse response of a response filter;
  
  generating an excitation vector in the form of multiple pitch spaced pulses using a set of pitch values, a set of phase values, and said pulse, said set of pitch values and said set of phase values derived from a perceptually weighted speech signal;
  
  computing energy values and correlation values, said energy values determined using a filtered vector, said correlation values representing the correlation between said filtered vector and said perceptually weighted speech signal, said filtered vector corresponding to said excitation vector; and
  
  selecting the pulse excitation from said excitation vector corresponding to correlation values and energy values that maximize a pulse mode match function.
- - 22. The method of claim 21 wherein said method further comprises the step of receiving a set of linear prediction coefficients (LPCs), said LPCs defining a linear prediction (LP) analysis filter of order m, and said step of constructing a pulse uses the following equations:
    - $A(z) = 1 - a_1 z^{-1} - a_2 z^{-2} - \cdots - a_m z^{-m}$;
      
      $U(z) = (1 - \delta z^{-1}) / A(\alpha z)$;
      
      $V_{0,n-1}(z) = z^{n-1} U_{0,n-1}(z^{-1})$;
      
      $W(z) = \left(V_{n-m,n-1}(z) + z^{-n} U_{0,d}(z)\right) A(\beta z)$; and
      
      $V_{n,M-1}(z) = W_{n,M-1}(z)$;
      
      where $X_{i,j}(z)$ represents the polynomial $X_{i,j}(z) = x_i z^{-i} + x_{i+1} z^{-(i+1)} + \cdots + x_j z^{-j}$, $j > i$, where $A(z)$ denotes the Z-transform for the LP analysis filter, where $a_i$ represents one linear prediction coefficient of said set of LPCs, where samples of said pulse are represented by $V_i(z)$, where $n < M$, where $\alpha$ and $\delta$ are empirically chosen constants, $0 \le \alpha, \delta \le 1$, where $\beta$ is an empirically chosen constant, $0 \le \beta \le 1$, and where $d$, $d \ge 0$, is a fixed constant.
  - 23. The method of claim 22 wherein α is in the range 0.9 to 0.98, δ is in the range 0.55 to 0.75, and β is in the range 0.6 to 0.8.
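
The first steps of the pulse construction in claims 21 and 22 can be illustrated with a short sketch. It rests on two assumptions that the claim text does not settle: that A(αz) denotes the usual bandwidth-expanded filter with coefficients a_i·α^i, and that "time inversion" means reversing the first n samples of the impulse response of U(z). The later extension through W(z) is not attempted here, and all function names are hypothetical.

```python
def expand_bandwidth(lpc, factor):
    """Assumed reading of A(factor*z): scale a_i by factor**i for i = 1..m."""
    return [a * factor ** (i + 1) for i, a in enumerate(lpc)]


def impulse_response(lpc, delta, n):
    """First n samples of U(z) = (1 - delta*z^-1) / A(z),
    with A(z) = 1 - a_1 z^-1 - ... - a_m z^-m and lpc = [a_1, ..., a_m]."""
    u = []
    for k in range(n):
        # Numerator 1 - delta*z^-1 acts as the input sequence [1, -delta, 0, ...].
        x = 1.0 if k == 0 else (-delta if k == 1 else 0.0)
        # Recursion u_k = x_k + sum_i a_i * u_{k-i} from U(z) * A(z) = numerator.
        acc = x + sum(a * u[k - 1 - i] for i, a in enumerate(lpc) if k - 1 - i >= 0)
        u.append(acc)
    return u


def pulse_shape(lpc, alpha, delta, n):
    """Time-reversed first n samples of the response of U(z) (assumed reading)."""
    u = impulse_response(expand_bandwidth(lpc, alpha), delta, n)
    return u[::-1]
```

For example, `pulse_shape([0.5], 1.0, 0.0, 3)` yields the reversed decay `[0.25, 0.5, 1.0]` of a one-pole filter.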

24. A pulse train analyzer for use in a speech synthesizer comprising:
- a pulse generator coupled to receive a set of pitch values, a set of phase values, and a set of linear prediction coefficients (LPCs), said set of pitch values and said set of phase values derived from a perceptually weighted speech signal, said set of LPCs derived from an input speech signal, said pulse generator producing an excitation vector based upon said set of pitch values, said set of phase values, and said set of LPCs;
  
  a correlation circuit coupled to said pulse generator and further coupled to receive said perceptually weighted speech signal, said correlation circuit using a pulse mode match function to determine a set of match values, said set of match values based upon said excitation vector and said perceptually weighted speech signal; and
  
  a pulse train selector coupled to receive said set of match values, said pulse train selector selecting the excitation from said excitation vector that corresponds to the maximal value in said set of match values as a selected pulse excitation.
- - 25. The pulse train analyzer of claim 24 said correlation circuit further comprising:
    - a response filter coupled to said pulse generator producing a pulse response corresponding to said excitation vector;
      
      a correlator coupled to receive said perceptually weighted speech signal and coupled to said response filter, said correlator computing correlation values between said pulse response and said perceptually weighted speech signal;
      
      an energy calculator coupled to said response filter computing energy values using said pulse response; and
      
      a match function calculator coupled to said correlator and said energy calculator to produce said set of match values using said pulse mode match function, said set of match values based upon applying said pulse mode match function to said correlation values and said energy values.
  - 26. The pulse train analyzer of claim 25 said pulse generator further comprising:
    - a pulse train generator coupled to receive said set of pitch values and said set of phase values, said set of pitch values and said set of phase values derived from said perceptually weighted speech signal, said pulse train generator producing said excitation vector in the form of multiple pitch spaced pulses based upon said set of pitch values, said set of phase values, and a pulse; and
      
      a pulse shape generator coupled to said pulse train generator, said pulse shape generator producing a pulse using a formula corresponding to the time inversion of the pulse response.
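
The search described in claims 24 to 26 (generate pitch-spaced pulse trains, filter them, compute correlation and energy, keep the candidate with the maximal match value) can be sketched as below. The match function used here, squared correlation divided by energy, is the form commonly used in CELP-style searches and is an assumption; the claims do not define the pulse mode match function. The FIR response `h` stands in for the response filter, and all names are hypothetical.

```python
def pulse_train(pulse, pitch, phase, length):
    """Place copies of `pulse` every `pitch` samples, starting at `phase`."""
    out = [0.0] * length
    for start in range(phase, length, pitch):
        for k, p in enumerate(pulse):
            if start + k < length:
                out[start + k] += p
    return out


def convolve(x, h):
    """Zero-state response of FIR filter `h` to `x`, truncated to len(x)."""
    return [sum(h[j] * x[i - j] for j in range(min(i + 1, len(h))))
            for i in range(len(x))]


def search_pulse_train(target, pulse, h, pitches, phases):
    """Return (match, pitch, phase, excitation) maximizing corr**2 / energy."""
    best = None
    for pitch in pitches:
        for phase in phases:
            if phase >= pitch:
                continue  # a phase beyond one pitch period is redundant
            exc = pulse_train(pulse, pitch, phase, len(target))
            y = convolve(exc, h)  # filtered vector
            corr = sum(a * b for a, b in zip(y, target))
            energy = sum(a * a for a in y)
            if energy == 0.0:
                continue
            match = corr * corr / energy  # assumed match function
            if best is None or match > best[0]:
                best = (match, pitch, phase, exc)
    return best
```

As a quick check, if the target is itself a unit pulse train with pitch 4 and phase 1 and `h = [1.0]`, the search over pitches 3 to 5 and phases 0 to 2 recovers exactly that pitch and phase.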

Current Assignee: XVD Technology Holdings Ltd.
Original Assignee: Alaris, Inc.; GT Technologies
Inventors: Krachkovsky, Victor Y.; Kolesnik, Victor D.; Kudryashov, Boris D.; Bocharova, Irina E.; Kovalov, Sergei I.; Ovsjannikov, Eugeny P.; Trofimov, Andrey N.; Trojanovsky, Boris K.
Primary Examiner(s): Tung, Kee M.
Application Number: US08/251,471
Time in Patent Office: 987 Days
Field of Search: 381/29, 381/30, 381/36-38, 381/51, 395/2.28-2.39, 395/2.67, 395/2.71-2.74
US Class Current: 704/223
CPC Class Codes:
G10L 19/12   the excitation function bei...
G10L 19/18   Vocoders using multiple modes
G10L 2019/0013   Codebook search algorithms
G10L 2025/935   Mixed voiced class; Transit...
G10L 25/24   the extracted parameters be...
