Code excited linear predictive vocoder

US 4,899,385 A
Filed: 06/26/1987
Issued: 02/06/1990
Est. Priority Date: 06/26/1987
Status: Expired due to Term

Abstract

Apparatus for encoding speech using a code excited linear predictive (CELP) encoder using a recursive computational unit. In response to a target excitation vector that models a present frame of speech, the computational unit utilizes a finite impulse response linear predictive coding (LPC) filter and an overlapping codebook to determine a candidate excitation vector from the codebook that matches the target excitation vector after searching the entire codebook for the best match. For each candidate excitation vector accessed from the overlapping codebook, only one sample of the accessed vector and one sample of the previously accessed vector must have arithmetic operations performed on them to evaluate the new vector rather than all of the samples as is normal for CELP methods. For increased performance, a stochastically excited linear predictive (SELP) encoder is used in series with the adaptive CELP encoder. The SELP encoder is responsive to the difference between the target excitation vector and the best matched candidate excitation vector to search its own overlapping codebook in a recursive manner to determine a candidate excitation vector that provides the best match. Both of the best matched candidate vectors are used in speech synthesis.
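
The overlapping codebook described in the abstract can be pictured as a single run of samples from which each candidate vector is read as a sliding window. The following is a minimal sketch of that layout only, not the patented implementation; the function name `overlapping_candidates`, the sample values, and the window length are assumptions made for illustration.

```python
def overlapping_candidates(table, n):
    """Yield (location, candidate) pairs from one overlapping table.

    Candidate k is table[k:k + n], so adjacent candidates share n - 1
    samples: only the first sample of the previous window and the last
    sample of the present window differ.
    """
    for k in range(len(table) - n + 1):
        yield k, table[k:k + n]


# Hypothetical excitation samples, chosen only for illustration.
table = [3, -1, 4, 1, -5, 9, 2, -6]
candidates = list(overlapping_candidates(table, n=5))
```

In this layout a table of M samples holds M - n + 1 candidates, and the window offset k can serve as the "location" that is communicated to the decoder.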

18 Claims

1. A method of encoding speech using a plurality of candidate sets of excitation information stored in a table where said speech comprises frames of speech each frame having a plurality of samples, comprising the steps of:
- storing said candidate sets of excitation information in a table in an overlapping manner whereby each candidate set differs from a previous candidate set by only a first and a second subset of excitation information where said first subset of excitation information comprises sequential samples from the beginning of each candidate set and said second subset of excitation information comprises sequential samples from the end of each candidate set;
  
  forming a target set of excitation information in response to a present one of said frames of speech;
  
  determining a set of filter coefficients in response to said present one of said frames of speech;
  
  calculating information to model a finite impulse response filter from said set of filter coefficients;
  
  recursively calculating an error value for each present one of said plurality of candidate sets of excitation information in response to the finite impulse response filter information and each of said candidate sets of excitation information and said target set of excitation information by removing a portion of the error value of said error value of said previous candidate set of excitation information contributed by said first subset of said excitation information of said previous candidate set of excitation information from said error value for said previous candidate set of excitation information to form a temporary error value and adding in a portion of error value of each present one of said candidate sets of excitation information contributed by said second subset of excitation information of each present one of said candidate sets of excitation information to said temporary error value to form an error value for each present one of said candidate sets of excitation information; and
  
  selecting one of said candidate sets of excitation information whose calculated error value is the smallest;
  
  determining a location in said table of said selected one of said candidate sets of excitation information;
  
  communicating said set of filter coefficients and information representing said location of said selected one of said candidate sets of excitation information.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1 further comprises the steps of:
    - recursively calculating another error value for each of another plurality of candidate sets of excitation information stored in another table in response to the finite impulse response filter information and each of said candidate sets of said other table and said target set of excitation information and said selected set of excitation information from said table;
      
      selecting one of said other plurality of said candidate sets of excitation information from said other table whose other error value is the smallest; and
      
      determining a location in said other table of said selected one of said other plurality of said candidate sets of excitation information;
      
      further communicating information representing said location in said other table of said selected one of said candidate sets of excitation information in said other table.
  - 3. The method of claim 2 wherein said step of recursively calculating said other error value for each of said other plurality of candidate sets of excitation information comprises the step of subtracting said selected candidate set of excitation information from said target set of excitation information to form another target set of excitation information for use in calculating said other error value for each of said candidate sets of said other table.
  - 4. The method of claim 3 wherein each of said candidate sets of excitation information comprises a plurality of samples and said first subset is the first sample of said previous candidate set of excitation information and said second subset is the last sample of each of said candidate sets of excitation information.
  - 5. The method of claim 4 wherein said step of storing further comprises arranging said candidate sets of excitation information in said table in chronological order;
    - said method further comprises the step of adding said selected candidate set of excitation information from said table and said selected candidate set of excitation information from said other table to form a synthesis set of excitation information for said present frame; and
      
      updating said table with said synthesis set of excitation information by replacing the oldest candidate set of excitation information in said table.
  - 6. The method of claim 3 wherein said step of forming said target set of excitation information comprises the steps of adding said selected candidate set of excitation information from said table to said selected candidate set of excitation information from said other table to form a synthesis set of excitation information;
    - filtering in response to the filter coefficients for said previous frame said synthesis set of excitation information from said previous frame;
      
      zero-input response filtering in response to said filter coefficients for said previous frame the filtered synthesis set of excitation information to produce a ringing set of information;
      
      subtracting said ringing set of information from said present one of said frames of said speech for each of said candidate sets of excitation information to generate an intermediate set of information; and
      
      whitening filtering based on the filter coefficients for said present frame said intermediate set of information to form said target set of excitation information.
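
The target formation recited in claim 6 (subtract the ringing left over from the previous frame, then whiten) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the LPC polynomial is assumed to be A(z) = 1 + a[0]z^-1 + ... + a[p-1]z^-p, the list `history` is assumed to hold the last p outputs of the previous frame's filtered synthesis excitation, and all function names and numeric values are hypothetical.

```python
def synthesize(excitation, a, history):
    """All-pole filter 1/A(z): y[n] = x[n] - sum_k a[k] * y[n - 1 - k].

    `history` holds the p most recent past outputs, newest last.
    Returns (output samples, updated history).
    """
    p = len(a)
    past = list(history)[-p:]
    out = []
    for x in excitation:
        y = x - sum(a[k] * past[-1 - k] for k in range(p))
        out.append(y)
        past = (past + [y])[-p:]
    return out, past


def whiten(signal, a):
    """FIR filter A(z) with zero initial state."""
    p = len(a)
    return [
        s + sum(a[k] * signal[n - 1 - k] for k in range(p) if n - 1 - k >= 0)
        for n, s in enumerate(signal)
    ]


def target_excitation(speech, a_prev, a_present, history):
    """Subtract the ringing vector from the frame, then whiten the result."""
    # Ringing: zero-input response of the previous frame's filter.
    ringing, _ = synthesize([0.0] * len(speech), a_prev, history)
    intermediate = [s - r for s, r in zip(speech, ringing)]
    return whiten(intermediate, a_present)


# Hypothetical second-order coefficients (poles at 0.5 and 0.4) and state.
a = [-0.9, 0.2]
history = [0.5, -0.25]
excitation = [1.0, 0.0, -2.0, 0.5]
speech, _ = synthesize(excitation, a, history)
target = target_excitation(speech, a, a, history)
```

When the same coefficients are used for both frames, as in this example, the recovered target equals the excitation that produced the frame, because removing the zero-input response leaves only the zero-state response, which A(z) inverts exactly.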

7. A method of encoding speech for communication to a decoder for reproduction, comprising the steps of:
- grouping said speech into frames of speech each frame being represented by a speech vector with each vector having a plurality of samples with each speech vector representing a portion of said speech;
  
  calculating a set of filter coefficients in response to a present one of said speech vectors;
  
  calculating a response matrix to model a finite impulse response filter based on said filter coefficients for said present speech vector;
  
  calculating a spectral weighting matrix of a Toeplitz form by matrix operations on said response matrix;
  
  calculating a ringing vector from the previous speech vector immediately preceding said present speech vector in time and said present speech vector;
  
  calculating a target vector in response to said present speech vector and said ringing vector;
  
  calculating a cross-correlation value in response to said target vector and said spectral weighting matrix and each of a plurality of candidate excitation vectors stored in an overlapping table;
  
  recursively calculating an energy value for each of said candidate excitation vectors in response to said target vector and said spectral weighting matrix and each of said candidate excitation vectors and said ringing vector by removing a contribution of the first sample of the previous candidate excitation vector of said table from the energy value calculated for said previous candidate excitation vector to form a temporary energy value and adding a contribution of the last sample of the present candidate excitation vector of said table to the temporary energy value to form said energy value for said present candidate excitation vector;
  
  calculating an error value for each of said candidate excitation vectors in response to each of said cross-correlation and energy values for each of said candidate excitation vectors;
  
  selecting the candidate excitation vector whose calculated error value is the smallest; and
  
  determining a location in said table of said selected candidate excitation vector;
  
  communicating information defining the determined location of said selected candidate excitation vector in said table and said filter coefficients.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The method of claim 7 wherein said step of calculating said cross-correlation value for each of said candidate excitation vectors further comprises the steps of:
    - forming a temporary vector by matrix operations between said spectral weighting matrix and said target excitation vector; and
      
      forming said cross-correlation value from each of said candidate excitation vectors and said temporary vector.
  - 9. The method of claim 7 further comprises the steps of:
    - calculating another target excitation vector in response to said target excitation vector and said selected candidate vector of said table;
      
      calculating another cross-correlation value in response to said other target vector and said spectral weighting matrix and each of a plurality of other candidate vectors stored in another overlapping table;
      
      recursively calculating another energy value in response to said other target vector and said spectral weighting matrix and each of said other candidate vectors from said other table;
      
      calculating another error value for each of said other candidate excitation vectors from said other table in response to each of said other cross-correlation and energy values for each of said other candidate excitation vectors of said other table;
      
      selecting the one of said other candidate excitation vectors from said other table whose other error value is the smallest; and
      
      further communicating information defining the location in said other table of the selected other candidate excitation vector.
  - 10. The method of claim 9 wherein said step of calculating a target excitation vector further comprises the steps of:
    - subtracting said ringing vector from said speech vector to generate an intermediate vector; and
      
      whitening filtering based on said filter coefficients of said present speech vector said intermediate vector to form said target excitation vector.
  - 11. The method of claim 10 wherein said step of calculating said ringing vector comprises the steps of:
    - adding said selected candidate excitation vector of said table to said selected other candidate excitation vector from said other table to form a synthesis excitation vector;
      
      filtering based on the filter coefficients for said previous speech vector said synthesis excitation vector from said previous speech vector; and
      
      zero-input response filtering based on said filter coefficients for said previous speech vector the filtered synthesis excitation vector to produce said ringing vector.
  - 12. The method of claim 11 wherein said plurality of candidate excitation vectors are stored in said table in a chronological order and said method further comprises the step of updating said table with said synthesis excitation vector for said present speech vector by replacing the oldest one of said candidate excitation vectors in said table.
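
The recursive energy calculation of claim 7 and the temporary vector of claim 8 can be sketched as below. This is a sketch under assumptions rather than the patent's formulation: the spectral weighting matrix is taken to be symmetric Toeplitz with entries B[i][j] = b[|i - j|], and the error is taken in the common gain-optimised form -(cross-correlation)^2 / energy with the constant target term dropped, which the claims do not spell out. All names and values are hypothetical.

```python
def direct_energy(c, b):
    """Energy c^T B c computed directly, with B[i][j] = b[abs(i - j)]."""
    n = len(c)
    return sum(b[abs(i - j)] * c[i] * c[j] for i in range(n) for j in range(n))


def recursive_energies(table, n, b):
    """Energy of every overlapping candidate, updated from its predecessor."""
    energies = [direct_energy(table[:n], b)]
    for k in range(1, len(table) - n + 1):
        old, new = table[k - 1], table[k + n - 1]
        e = energies[-1]
        # Remove all terms involving the first sample of the previous candidate.
        e -= b[0] * old * old + 2 * old * sum(
            b[d] * table[k - 1 + d] for d in range(1, n))
        # Add all terms involving the last sample of the present candidate.
        e += b[0] * new * new + 2 * new * sum(
            b[d] * table[k + n - 1 - d] for d in range(1, n))
        energies.append(e)
    return energies


def search(table, target, b):
    """Return (location, error) of the candidate with the smallest error."""
    n = len(target)
    # Temporary vector B @ target, formed once and reused for every candidate.
    temp = [sum(b[abs(i - j)] * target[j] for j in range(n)) for i in range(n)]
    best_location, best_error = None, None
    for k, energy in enumerate(recursive_energies(table, n, b)):
        xcorr = sum(table[k + i] * temp[i] for i in range(n))
        # Assumed error measure: gain-optimised, constant term omitted.
        error = -(xcorr * xcorr) / energy if energy > 0 else 0.0
        if best_error is None or error < best_error:
            best_location, best_error = k, error
    return best_location, best_error


# Hypothetical table and diagonally dominant Toeplitz weights.
table = [1.0, -2.0, 0.5, 3.0, -1.0, 2.0, 0.0, -0.5]
b = [2.0, 0.8, 0.3, 0.1]
energies = recursive_energies(table, 4, b)
location, error = search(table, table[3:7], b)
```

The update is exact for a Toeplitz matrix because the terms shared by two adjacent windows depend only on the distance between samples, not on their position in the window. In this sketch each update costs on the order of n operations instead of the n squared needed to recompute the energy from scratch.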

13. Apparatus for encoding speech for communication to a decoder for reproduction and said speech comprises frames of speech each having a plurality of samples, comprising:
- means for forming a target set of excitation information in response to a present one of said frames of speech;
  
  means for determining a set of filter coefficients in response to said present one of said frames of speech;
  
  means for storing said candidate sets of excitation information in a table in an overlapping manner whereby each candidate set differs from the previous candidate set by only a first and a second subset of excitation information;
  
  means for calculating information to model a finite impulse response filter from said set of filter coefficients;
  
  means for recursively calculating an error value for each of said plurality of candidate sets of excitation information stored in said table in response to the finite impulse response filter information and each of said candidate sets of excitation information and said target set of excitation information by removing a contribution of said first subset of said excitation information from the error value for said previous candidate set of excitation information to form a temporary error value and adding in a contribution of said second subset of excitation information to said temporary error value to form said error value for said present candidate set of excitation information; and
  
  means for selecting one of said candidates of excitation information whose calculated error value is the smallest;
  
  means for determining a location in said table of said selected one of said candidates of excitation information;
  
  means for communicating said set of filter coefficients and information representing the determined location of said selected one of said candidate sets of excitation information.
- Dependent claims (14, 15, 16, 17, 18):
  - 14. The apparatus of claim 13 further comprises:
    - means for recursively calculating another error value for each of another plurality of candidate sets of excitation information stored in another table in response to the finite impulse response filter information and each of said candidate sets of said other table and said target set of excitation information and said selected set of excitation information from said table;
      
      means for selecting one of said other plurality of said candidate sets of excitation information from said other table whose other error value is the smallest; and
      
      means for determining a location in said other table of said selected one of said other plurality of said candidate sets of excitation information;
      
      said means for communicating further communicates information representing the determined location in said other table of said selected one of said candidate sets of excitation information in said other table.
  - 15. The apparatus of claim 14 wherein said means for recursively calculating said other error value comprises means for subtracting said selected candidate set of excitation information for each of said plurality of candidate sets of excitation information from said target set of excitation information to form another target set of excitation information for use in calculating said other error value for each of said candidate sets of said other table.
  - 16. The apparatus of claim 15 wherein each of said candidate sets of excitation information comprises a plurality of samples and said first subset is the first sample of said previous candidate set of excitation information and said second subset is the last sample of each of said candidate sets of excitation information.
  - 17. The apparatus of claim 16 wherein said plurality of candidate excitation vectors are stored in said table in a chronological order and the apparatus further comprises means for adding said selected candidate set of excitation information from said table and said selected candidate set of excitation information from said other table to form a synthesis set of excitation information for said present frame;
    - and means for updating said table with said synthesis set of excitation information by replacing the oldest candidate set of excitation information in said table.
  - 18. The apparatus of claim 15 wherein said means for forming said target set of excitation information comprises means for adding said selected candidate set of excitation information from said table to said selected candidate set of excitation information from said other table to form a synthesis set of excitation information;
    - means for filtering based on the filter coefficients for said previous frame said synthesis set of excitation information from said previous frame;
      
      means for zero-input response filtering based on said filter coefficients for said previous frame the filtered synthesis set of excitation information to produce a ringing set of information;
      
      means for subtracting said ringing set of information from said present one of said frames of said speech for each of said candidate sets of excitation information to generate an intermediate set of information; and
      
      means for whitening filtering based on the filter coefficients for said present frame said intermediate set of information to form said target set of excitation information.
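
The two-table arrangement of claims 14 through 17 (search the first table, subtract the selection from the target, search the other table, then sum the two selections and update the first table) can be sketched as follows. This is a simplified illustration, not the claimed apparatus: it uses a brute-force, unweighted squared error with no gains or filtering, and it assumes that "replacing the oldest candidate set" in an overlapping table means dropping the n oldest samples and appending the new ones. The names `adaptive_table` and `stochastic_table` follow the abstract's terminology; all values are hypothetical.

```python
def best_match(table, target):
    """Location of the overlapping candidate with the smallest squared error."""
    n = len(target)

    def squared_error(k):
        return sum((t - c) ** 2 for t, c in zip(target, table[k:k + n]))

    return min(range(len(table) - n + 1), key=squared_error)


def encode_frame(adaptive_table, stochastic_table, target):
    """Two-stage search followed by a chronological table update."""
    n = len(target)
    # Stage 1: best match in the first (adaptive) table.
    loc1 = best_match(adaptive_table, target)
    first = adaptive_table[loc1:loc1 + n]
    # Stage 2: the other target is what the first selection failed to match.
    other_target = [t - c for t, c in zip(target, first)]
    loc2 = best_match(stochastic_table, other_target)
    second = stochastic_table[loc2:loc2 + n]
    # Synthesis excitation is the sum of both selections.
    synthesis = [x + y for x, y in zip(first, second)]
    # Assumed update rule: discard the n oldest samples, append the newest.
    updated_table = adaptive_table[n:] + synthesis
    return loc1, loc2, synthesis, updated_table


# Hypothetical tables and target for one frame of two samples.
adaptive_table = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0]
stochastic_table = [0.5, -0.5, 0.0, 0.5, 1.0]
loc1, loc2, synthesis, updated_table = encode_frame(
    adaptive_table, stochastic_table, [2.5, 0.5])
```

Only the two locations need to be communicated alongside the filter coefficients, since a decoder holding the same tables can rebuild the synthesis excitation and apply the same table update.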

Current Assignee: American Telephone & Telegraph Company (AT&T, Inc.); Bell Telephone Laboratories, Inc. (Nokia Corporation)
Original Assignee: American Telephone & Telegraph Company (AT&T, Inc.); AT&T, Inc.
Inventors: Krasinski, Daniel J.; Kleijn, Willem B.; Ketchum, Richard H.
Primary Examiner(s): Clark, David L.
Assistant Examiner(s): Merecki, John A.
Application Number: US07/067,649
Time in Patent Office: 956 Days
Field of Search: 381/36-41, 381/29-32, 381/51, 364/513.5
US Class Current: 704/223
CPC Class Codes:
- G10L 19/12   the excitation function bei...
- G10L 2019/0002   Codebook adaptations
- G10L 2019/0004   Design or structure of the ...
- G10L 2019/0013   Codebook search algorithms
- G10L 25/06   the extracted parameters be...
