Code excited linear predictive vocoder
Abstract
Apparatus for encoding speech using a code excited linear predictive (CELP) encoder using a recursive computational unit. In response to a target excitation vector that models a present frame of speech, the computational unit utilizes a finite impulse response linear predictive coding (LPC) filter and an overlapping codebook to determine a candidate excitation vector from the codebook that matches the target excitation vector after searching the entire codebook for the best match. For each candidate excitation vector accessed from the overlapping codebook, only one sample of the accessed vector and one sample of the previously accessed vector must have arithmetic operations performed on them to evaluate the new vector rather than all of the samples as is normal for CELP methods. For increased performance, a stochastically excited linear predictive (SELP) encoder is used in series with the adaptive CELP encoder. The SELP encoder is responsive to the difference between the target excitation vector and the best matched candidate excitation vector to search its own overlapping codebook in a recursive manner to determine a candidate excitation vector that provides the best match. Both of the best matched candidate vectors are used in speech synthesis.
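The overlapping codebook described in the abstract can be pictured as one long table of samples read through a sliding window. The Python sketch below is illustrative only: the function names, the Gaussian fill, and the one-sample shift between neighbouring candidates are assumptions chosen to match the abstract's remark that only one sample of each vector needs new arithmetic. They are not details taken from the specification.

```python
import random


def make_overlapping_table(num_candidates, vector_len, seed=0):
    """Build a flat sample table holding num_candidates overlapping vectors.

    Hypothetical helper: with a shift of one sample per candidate, the table
    needs only num_candidates + vector_len - 1 samples in total.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(num_candidates + vector_len - 1)]


def candidate(table, index, vector_len):
    """Return the candidate excitation vector stored at table position index."""
    return table[index:index + vector_len]


table = make_overlapping_table(num_candidates=8, vector_len=5)
previous = candidate(table, 2, 5)
present = candidate(table, 3, 5)

# The present candidate is the previous one with its first sample dropped
# and one new sample appended at the end.
shares_all_but_ends = previous[1:] == present[:-1]
```

Storing the candidates this way means a location in the table is enough to identify a candidate, which is the information the encoder communicates to the decoder.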
18 Claims
1. A method of encoding speech using a plurality of candidate sets of excitation information stored in a table where said speech comprises frames of speech each frame having a plurality of samples, comprising the steps of:
storing said candidate sets of excitation information in a table in an overlapping manner whereby each candidate set differs from a previous candidate set by only a first and a second subset of excitation information where said first subset of excitation information comprises sequential samples from the beginning of each candidate set and said second subset of excitation information comprises sequential samples from the end of each candidate set;
forming a target set of excitation information in response to a present one of said frames of speech;
determining a set of filter coefficients in response to said present one of said frames of speech;
calculating information to model a finite impulse response filter from said set of filter coefficients;
recursively calculating an error value for each present one of said plurality of candidate sets of excitation information in response to the finite impulse response filter information and each of said candidate sets of excitation information and said target set of excitation information by removing a portion of the error value of said error value of said previous candidate set of excitation information contributed by said first subset of said excitation information of said previous candidate set of excitation information from said error value for said previous candidate set of excitation information to form a temporary error value and adding in a portion of error value of each present one of said candidate sets of excitation information contributed by said second subset of excitation information of each present one of said candidate sets of excitation information to said temporary error value to form an error value for each present one of said candidate sets of excitation information; and
selecting one of said candidate sets of excitation information whose calculated error value is the smallest;
determining a location in said table of said selected one of said candidate sets of excitation information;
communicating said set of filter coefficients and information representing said location of said selected one of said candidate sets of excitation information.
Dependent claims: 2, 3, 4, 5, 6
7. A method of encoding speech for communication to a decoder for reproduction, comprising the steps of:
grouping said speech into frames of speech each frame being represented by a speech vector with each vector having a plurality of samples with each speech vector representing a portion of said speech;
calculating a set of filter coefficients in response to a present one of said speech vectors;
calculating a response matrix to model a finite impulse response filter based on said filter coefficients for said present speech vector;
calculating a spectral weighting matrix of a Toeplitz form by matrix operations on said response matrix;
calculating a ringing vector from the previous speech vector immediately preceding said present speech vector in time and said present speech vector;
calculating a target vector in response to said present speech vector and said ringing vector;
calculating a cross-correlation value in response to said target vector and said spectral weighting matrix and each of a plurality of candidate excitation vectors stored in an overlapping table;
recursively calculating an energy value for each of said candidate excitation vectors in response to said target vector and said spectral weighting matrix and each of said candidate excitation vectors and said ringing vector by removing a contribution of the first sample of the previous candidate excitation vector of said table from the energy value calculated for said previous candidate excitation vector to form a temporary energy value and adding a contribution of the last sample of the present candidate excitation vector of said table to the temporary energy value to form said energy value for said present candidate excitation vector;
calculating an error value for each of said candidate excitation vectors in response to each of said cross-correlation and energy values for each of said candidate excitation vectors;
selecting the candidate excitation vector whose calculated error value is the smallest; and
determining a location in said table of said selected candidate excitation vector;
communicating information defining the determined location of said selected candidate excitation vector in said table and said filter coefficients.
Dependent claims: 8, 9, 10, 11, 12
13. Apparatus for encoding speech for communication to a decoder for reproduction and said speech comprises frames of speech each having a plurality of samples, comprising:
means for forming a target set of excitation information in response to a present one of said frames of speech;
means for determining a set of filter coefficients in response to said present one of said frames of speech;
means for storing said candidate sets of excitation information in a table in an overlapping manner whereby each candidate set differs from the previous candidate set by only a first and a second subset of excitation information;
means for calculating information to model a finite impulse response filter from said set of filter coefficients;
means for recursively calculating an error value for each of said plurality of candidate sets of excitation information stored in said table in response to the finite impulse response filter information and each of said candidate sets of excitation information and said target set of excitation information by removing a contribution of said first subset of said excitation information from the error value for said previous candidate set of excitation information to form a temporary error value and adding in a contribution of said second subset of excitation information to said temporary error value to form said error value for said present candidate set of excitation information; and
means for selecting one of said candidates of excitation information whose calculated error value is the smallest;
means for determining a location in said table of said selected one of said candidates of excitation information;
means for communicating said set of filter coefficients and information representing the determined location of said selected one of said candidate sets of excitation information.
Dependent claims: 14, 15, 16, 17, 18
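The recursive energy step recited in claim 7 can be illustrated with a short Python sketch. It rests on several assumptions not stated in the claims: the Toeplitz spectral weighting matrix is symmetric and given by its first row `w`, neighbouring candidates are shifted by one sample, the target has already been multiplied by the weighting matrix (`weighted_target`), and the error is taken as the negative of squared cross-correlation over energy, a common CELP formulation. All function and variable names are hypothetical.

```python
def direct_energy(vec, w):
    """Energy vec^T W vec for a symmetric Toeplitz W with first row w."""
    n = len(vec)
    return sum(w[abs(j - k)] * vec[j] * vec[k]
               for j in range(n) for k in range(n))


def recursive_energies(table, n, w):
    """Energy of every length-n candidate in an overlapping table.

    Only the first candidate is evaluated in full. Each later energy is the
    previous one minus the terms involving the dropped first sample, plus
    the terms involving the newly appended last sample.
    """
    energies = [direct_energy(table[:n], w)]
    for i in range(len(table) - n):
        first, last = table[i], table[i + n]
        removed = w[0] * first * first + 2.0 * sum(
            w[d] * first * table[i + d] for d in range(1, n))
        added = w[0] * last * last + 2.0 * sum(
            w[d] * last * table[i + n - d] for d in range(1, n))
        energies.append(energies[-1] - removed + added)
    return energies


def search(table, n, w, weighted_target):
    """Return (location, error) of the candidate with the smallest error."""
    best_index, best_error = None, float("inf")
    for i, energy in enumerate(recursive_energies(table, n, w)):
        if energy <= 0.0:
            continue  # skip degenerate (all-zero) candidates
        corr = sum(weighted_target[j] * table[i + j] for j in range(n))
        error = -(corr * corr) / energy  # assumed error measure
        if error < best_error:
            best_index, best_error = i, error
    return best_index, best_error
```

The recursion works because a Toeplitz matrix weights each pair of samples by their distance alone, so shifting the window leaves the contribution of every retained pair unchanged. Each update costs on the order of the vector length instead of its square.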
Specification