Encoding of periodic speech using prototype waveforms
Abstract
A method and apparatus for coding a quasi-periodic speech signal. The speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter. The residual signal is encoded by extracting a prototype period from a current frame of the residual signal. A first set of parameters is calculated which describes how to modify a previous prototype period to approximate the current prototype period. One or more codevectors are selected which, when summed, approximate the error between the current prototype period and the modified previous prototype. A multi-stage codebook is used to encode this error signal. A second set of parameters describes these selected codevectors. The decoder reconstructs a current prototype period based on the first and second sets of parameters and the previous reconstructed prototype period. The residual signal is then interpolated over the region between the current and previous reconstructed prototype periods, and the decoder synthesizes output speech based on the interpolated residual signal.
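As a rough illustration of the first step in the abstract, the sketch below computes an LPC residual and splits it into frames. The function names, the coefficient convention A(z) = 1 - sum of a[k] z^-(k+1), and the zero initial state are assumptions made for this example; the abstract does not specify them.

```python
def lpc_residual(speech, a):
    """Filter speech with an assumed analysis filter A(z) = 1 - sum_k a[k] z^-(k+1).

    Samples before n = 0 are treated as zero (an assumption of this sketch).
    """
    residual = []
    for n, sample in enumerate(speech):
        # Linear prediction of the current sample from the previous ones.
        prediction = sum(
            a[k] * speech[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0
        )
        residual.append(sample - prediction)
    return residual


def split_frames(signal, frame_len):
    """Divide a signal into consecutive frames; the last frame may be shorter."""
    return [signal[i:i + frame_len] for i in range(0, len(signal), frame_len)]
```

For a signal that the predictor models perfectly, such as a decaying exponential with a single matching coefficient, the residual collapses to an impulse followed by zeros.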
27 Claims
-
1. A method for coding and decoding a quasi-periodic speech signal that is transmitted from a transmission source to a receiver, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising the steps of:
-
extracting a current prototype from a current frame of the residual signal;
calculating a first set of parameters which describe how to modify a previous prototype such that said modified previous prototype approximates said current prototype;
selecting one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters;
transmitting said first set of parameters and said second set of parameters to the receiver;
forming a reconstructed current prototype at the receiver based on said first set of parameters, said second set of parameters, and a reconstructed previous prototype;
interpolating over the region between said reconstructed current prototype and said reconstructed previous prototype to form an interpolated residual signal; and
synthesizing an output speech signal based on said interpolated residual signal.
Dependent claims: 2, 3, 4
-
-
5. A method for coding a quasi-periodic speech signal, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising the steps of:
-
extracting a current prototype from a current frame of the residual signal;
calculating a first set of parameters which describe how to modify a previous prototype such that said modified previous prototype approximates said current prototype;
selecting one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters;
reconstructing a current prototype based on said first and second set of parameters;
interpolating the residual signal over the region between said current reconstructed prototype and a previous reconstructed prototype; and
synthesizing an output speech signal based on said interpolated residual signal, wherein said step of calculating a first set of parameters comprises the steps of:
(i) circularly filtering said current prototype, forming a target signal;
(ii) extracting said previous prototype;
(iii) warping said previous prototype such that the length of said previous prototype is equal to the length of said current prototype;
(iv) circularly filtering said warped previous prototype; and
(v) calculating an optimum rotation and a first optimum gain, wherein said filtered warped previous prototype rotated by said optimum rotation and scaled by said first optimum gain best approximates said target signal.
Dependent claims: 6, 7, 8, 9, 10, 11
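Step (v) of claim 5 can be sketched as an exhaustive search over circular rotations. This is a minimal illustration, assuming "best approximates" means minimum squared error with an unconstrained gain; the claim does not fix the error criterion, and the names `rotate` and `best_rotation_and_gain` are hypothetical.

```python
def rotate(x, r):
    """Circularly shift the list x to the right by r samples."""
    r %= len(x)
    return x[-r:] + x[:-r] if r else list(x)


def best_rotation_and_gain(target, candidate):
    """Return (rotation, gain) minimizing the squared error between
    target and gain * rotate(candidate, rotation).

    For a fixed rotation the least-squares gain is corr / energy, and the
    remaining error shrinks as |corr| grows, so the search keeps the
    rotation with the largest absolute correlation.
    """
    energy = sum(v * v for v in candidate)  # unchanged by rotation
    if energy == 0.0:
        return 0, 0.0
    best_r, best_corr = 0, 0.0
    for r in range(len(candidate)):
        corr = sum(t * c for t, c in zip(target, rotate(candidate, r)))
        if abs(corr) > abs(best_corr):
            best_r, best_corr = r, corr
    return best_r, best_corr / energy
```

Here `target` stands for the circularly filtered current prototype and `candidate` for the circularly filtered, warped previous prototype, both of equal length.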
(i) updating said target signal by subtracting said filtered warped previous prototype rotated by said optimum rotation and scaled by said first optimum gain;
(ii) partitioning said first codebook into a plurality of regions, wherein each of said regions forms a codevector;
(iii) circularly filtering each of said codevectors;
(iv) selecting one of said filtered codevectors which most closely approximates said updated target signal, wherein said particular codevector is described by an optimum index;
(v) calculating a second optimum gain based on the correlation between said updated target signal and said selected filtered codevector;
(vi) updating said target signal by subtracting said selected filtered codevector scaled by said second optimum gain; and
(vii) repeating steps (iv)-(vi) for each of said stages in said first codebook, wherein said second set of parameters comprises said optimum index and said second optimum gain for each of said stages.
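Steps (iv) through (vii) above describe a greedy multi-stage search, sketched below. It assumes the codebook has already been partitioned and circularly filtered and is supplied as one list of codevectors per stage, and that "most closely approximates" means minimum squared error after optimal scaling. The data layout and function names are illustrative assumptions, not taken from the claims.

```python
def search_stage(target, codevectors):
    """Pick the codevector that best matches target after optimal scaling.

    Returns (index, gain), where gain = correlation / energy.
    """
    best_index, best_gain, best_score = 0, 0.0, -1.0
    for index, cv in enumerate(codevectors):
        energy = sum(v * v for v in cv)
        if energy == 0.0:
            continue  # an all-zero codevector cannot reduce the error
        corr = sum(t * v for t, v in zip(target, cv))
        score = corr * corr / energy  # error reduction achieved by this entry
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


def multistage_search(target, stages):
    """Run one search per stage, subtracting each scaled winner from the target.

    Returns the (index, gain) pairs and the final remaining target.
    """
    target = list(target)
    params = []
    for codevectors in stages:
        index, gain = search_stage(target, codevectors)
        params.append((index, gain))
        target = [t - gain * v for t, v in zip(target, codevectors[index])]
    return params, target
```

The returned pairs play the role of the second set of parameters: one optimum index and one second optimum gain per stage.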
-
-
9. The method of claim 8, wherein said step of reconstructing a current prototype comprises the steps of:
-
(i) warping a previous reconstructed prototype such that the length of said previous reconstructed prototype is equal to the length of said current reconstructed prototype;
(ii) rotating said warped previous reconstructed prototype by said optimum rotation and scaling by said first optimum gain, thereby forming said current reconstructed prototype;
(iii) retrieving a second codevector from a second codebook, wherein said second codevector is identified by said optimum index, and wherein said second codebook comprises a number of stages equal to said first codebook;
(iv) scaling said second codevector by said second optimum gain;
(v) adding said scaled second codevector to said current reconstructed prototype; and
(vi) repeating steps (iii)-(v) for each of said stages in said second codebook.
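The reconstruction steps of claim 9 can be sketched as follows. The claim does not say how warping is performed, so this example assumes circular linear-interpolation resampling; the `warp`, `rotate`, and `reconstruct_prototype` names and the per-stage codebook layout are likewise hypothetical.

```python
def warp(prototype, new_len):
    """Resample one period to new_len samples (assumed: circular linear interpolation)."""
    old_len = len(prototype)
    out = []
    for n in range(new_len):
        pos = n * old_len / new_len
        i = int(pos)
        frac = pos - i
        # Wrap around at the end because the prototype represents one period.
        out.append((1.0 - frac) * prototype[i % old_len]
                   + frac * prototype[(i + 1) % old_len])
    return out


def rotate(x, r):
    """Circularly shift the list x to the right by r samples."""
    r %= len(x)
    return x[-r:] + x[:-r] if r else list(x)


def reconstruct_prototype(prev_recon, cur_len, rotation, gain, stage_params, codebook):
    """Warp, rotate, and scale the previous prototype, then add one scaled
    codevector per stage, selected by the (index, gain) pairs in stage_params."""
    warped = warp(prev_recon, cur_len)
    current = [gain * v for v in rotate(warped, rotation)]
    for stage, (index, stage_gain) in enumerate(stage_params):
        codevector = codebook[stage][index]
        current = [c + stage_gain * v for c, v in zip(current, codevector)]
    return current
```

Because the decoder uses only transmitted parameters and its own previous reconstruction, an encoder that runs the same routine stays in step with it.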
-
-
10. The method of claim 9, wherein said step of interpolating the residual signal comprises the steps of:
-
(i) calculating an optimal alignment between said warped previous reconstructed prototype and said current reconstructed prototype;
(ii) calculating an average lag between said warped previous reconstructed prototype and said current reconstructed prototype based on said optimal alignment; and
(iii) interpolating said warped previous reconstructed prototype and said current reconstructed prototype, thereby forming the residual signal over the region between said warped previous reconstructed prototype and said current reconstructed prototype, wherein said interpolated residual signal has said average lag.
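A simplified sketch of steps (i) and (iii) of claim 10 is given below. It assumes the optimal alignment is the circular shift that maximizes the correlation between the two prototypes, and it cross-fades linearly between them. The average-lag computation of step (ii) is deliberately omitted: this sketch keeps the period fixed at the common prototype length, which is a simplification and not what the claim describes. All names are hypothetical.

```python
def rotate(x, r):
    """Circularly shift the list x to the right by r samples."""
    r %= len(x)
    return x[-r:] + x[:-r] if r else list(x)


def best_alignment(prev, cur):
    """Return the circular shift of prev that maximizes its correlation with cur."""
    best_shift, best_corr = 0, float("-inf")
    for shift in range(len(prev)):
        corr = sum(p * c for p, c in zip(rotate(prev, shift), cur))
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift


def interpolate_region(prev, cur, num_samples):
    """Cross-fade from the aligned previous prototype to the current one.

    Both prototypes must have equal length (prev is assumed already warped).
    The period is held constant here; a fuller version would vary the phase
    step so that the output has the average lag.
    """
    length = len(cur)
    aligned = rotate(prev, best_alignment(prev, cur))
    out = []
    for n in range(num_samples):
        weight = n / num_samples  # 0 at the start of the region, approaching 1
        phase = n % length
        out.append((1.0 - weight) * aligned[phase] + weight * cur[phase])
    return out
```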
-
-
11. The method of claim 10, wherein said step of synthesizing an output speech signal comprises the step of filtering said interpolated residual signal with an LPC synthesis filter.
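The LPC synthesis filter named in claim 11 is the inverse of the analysis filter. A minimal sketch, assuming the all-pole form 1/A(z) with A(z) = 1 - sum of a[k] z^-(k+1) and a zero initial state (conventions chosen for this example, not stated in the claim):

```python
def lpc_synthesis(residual, a):
    """Filter the residual with 1/A(z), where A(z) = 1 - sum_k a[k] z^-(k+1).

    Past output samples before n = 0 are treated as zero.
    """
    out = []
    for n, excitation in enumerate(residual):
        # Prediction from already-synthesized output samples.
        prediction = sum(
            a[k] * out[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0
        )
        out.append(excitation + prediction)
    return out
```

With a single coefficient of 0.5, a unit impulse in the residual produces the decaying sequence 1, 0.5, 0.25, and so on.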
-
12. A method for coding and decoding a quasi-periodic speech signal that is transmitted from a transmission source to a receiver, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising the steps of:
-
extracting a current prototype from a current frame of the residual signal;
calculating a first set of parameters which describe how to modify a previous prototype such that said modified previous prototype approximates said current prototype;
selecting one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters;
transmitting said first set of parameters and said second set of parameters to the receiver;
forming a reconstructed current prototype based on said first set of parameters, said second set of parameters and a reconstructed previous prototype;
filtering said reconstructed current prototype with an LPC synthesis filter;
filtering said previous reconstructed prototype with said LPC synthesis filter;
interpolating over the region between said filtered reconstructed current prototype and said filtered reconstructed previous prototype, thereby forming an output speech signal.
-
-
13. A system for coding and decoding a quasi-periodic speech signal that is transmitted from a transmission source to a receiver, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising:
-
means for extracting a current prototype from a current frame of the residual signal;
means for calculating a first set of parameters which describe how to modify a previous prototype such that said modified previous prototype approximates said current prototype;
means for selecting one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters;
means for transmitting said first set of parameters and said second set of parameters to the receiver;
means for forming a reconstructed current prototype based on said first set of parameters, said second set of parameters, and a reconstructed previous prototype;
means for interpolating over the region between said reconstructed current prototype and said reconstructed previous prototype to form an interpolated residual signal; and
means for synthesizing an output speech signal based on said interpolated residual signal.
Dependent claims: 14, 15, 16
-
-
17. A system for coding a quasi-periodic speech signal, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising:
-
means for extracting a current prototype from a current frame of the residual signal;
means for calculating a first set of parameters which describe how to modify a previous prototype such that said modified previous prototype approximates said current prototype;
means for selecting one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters;
means for reconstructing a current reconstructed prototype based on said first and second set of parameters;
means for interpolating the residual signal over the region between said current reconstructed prototype and a previous reconstructed prototype;
means for synthesizing an output speech signal based on said interpolated residual signal, wherein said means for calculating a first set of parameters comprises:
a first circular LPC synthesis filter, coupled to receive said current prototype and to output a target signal;
means for extracting said previous prototype from a previous frame;
a warping filter, coupled to receive said previous prototype, wherein said warping filter outputs a warped previous prototype having a length equal to the length of said current prototype;
a second circular LPC synthesis filter, coupled to receive said warped previous prototype, wherein said second circular LPC synthesis filter outputs a filtered warped previous prototype; and
means for calculating an optimum rotation and a first optimum gain, wherein said filtered warped previous prototype rotated by said optimum rotation and scaled by said first optimum gain best approximates said target signal.
Dependent claims: 18, 19, 20, 21, 22, 23
means for updating said target signal by subtracting said filtered warped previous prototype rotated by said optimum rotation and scaled by said first optimum gain;
means for partitioning said first codebook into a plurality of regions, wherein each of said regions forms a codevector;
a third circular LPC synthesis filter coupled to receive said codevectors, wherein said third circular LPC synthesis filter outputs filtered codevectors;
means for calculating an optimum index and a second optimum gain for each stage in said first codebook, comprising:
means for selecting one of said filtered codevectors, wherein said selected filtered codevector most closely approximates said target signal and is described by an optimum index, means for calculating a second optimum gain based on the correlation between said target signal and said selected filtered codevector, and means for updating said target signal by subtracting said selected filtered codevector scaled by said second optimum gain;
wherein said second set of parameters comprises said optimum index and said second optimum gain for each of said stages.
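The claims refer to circular LPC synthesis filters without defining them here. One common reading, assumed in this sketch, is that the filter output is the steady-state response to the periodic extension of the prototype, so the output is itself one period of a periodic signal. For a stable all-pole filter this can be computed by dividing the DFT of the prototype by A(z) evaluated at the DFT frequencies. The function name and coefficient convention are hypothetical.

```python
import cmath


def circular_lpc_synthesis(prototype, a):
    """Steady-state periodic response of 1/A(z), A(z) = 1 - sum_m a[m] z^-(m+1).

    Assumes a stable filter, so A(z) has no zeros on the unit circle.
    Uses a direct O(L^2) DFT for clarity rather than speed.
    """
    length = len(prototype)
    spectrum = []
    for k in range(length):
        w = -2j * cmath.pi * k / length
        x_k = sum(x * cmath.exp(w * n) for n, x in enumerate(prototype))
        a_k = 1.0 - sum(c * cmath.exp(w * (m + 1)) for m, c in enumerate(a))
        spectrum.append(x_k / a_k)
    # Inverse DFT; the imaginary part is rounding noise for real input.
    return [
        sum(y_k * cmath.exp(2j * cmath.pi * k * n / length)
            for k, y_k in enumerate(spectrum)).real / length
        for n in range(length)
    ]
```

The result satisfies the filter recursion with wrap-around indexing: each output sample equals the input sample plus the weighted previous output samples taken modulo the prototype length.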
-
-
21. The system of claim 20, wherein said means for reconstructing a current prototype comprises:
-
a second warping filter, coupled to receive a previous reconstructed prototype, wherein said second warping filter outputs a warped previous reconstructed prototype having a length equal to the length of said current reconstructed prototype;
means for rotating said warped previous reconstructed prototype by said optimum rotation and scaling by said first optimum gain, thereby forming said current reconstructed prototype; and
means for decoding said second set of parameters, wherein a second codevector is decoded for each stage in a second codebook having a number of stages equal to said first codebook, comprising:
means for retrieving said second codevector from said second codebook, wherein said second codevector is identified by said optimum index, means for scaling said second codevector by said second optimum gain, and means for adding said scaled second codevector to said current reconstructed prototype.
-
-
22. The system of claim 21, wherein said means for interpolating the residual signal comprises:
-
means for calculating an optimal alignment between said warped previous reconstructed prototype and said current reconstructed prototype;
means for calculating an average lag between said warped previous reconstructed prototype and said current reconstructed prototype based on said optimal alignment; and
means for interpolating said warped previous reconstructed prototype and said current reconstructed prototype, thereby forming the residual signal over the region between said warped previous reconstructed prototype and said current reconstructed prototype, wherein said interpolated residual signal has said average lag.
-
-
23. The system of claim 22, wherein said means for synthesizing an output speech signal comprises an LPC synthesis filter.
-
24. A system for coding and decoding a quasi-periodic speech signal that is transmitted from a transmission source to a receiver, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising:
-
means for extracting a current prototype from a current frame of the residual signal;
means for calculating a first set of parameters which describe how to modify a previous prototype such that said modified previous prototype approximates said current prototype;
means for selecting one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters;
means for transmitting said first set of parameters and said second set of parameters to the receiver;
means for forming a reconstructed current prototype based on said first set of parameters, said second set of parameters, and a reconstructed previous prototype;
a first LPC synthesis filter, coupled to receive said reconstructed current prototype, wherein said first LPC synthesis filter outputs a filtered reconstructed current prototype;
a second LPC synthesis filter, coupled to receive a reconstructed previous prototype, wherein said second LPC synthesis filter outputs a filtered reconstructed previous prototype; and
means for interpolating over the region between said filtered reconstructed current prototype and said filtered reconstructed previous prototype, thereby forming an output speech signal.
-
-
25. A method for reducing the transmission bit rate of a speech signal, comprising:
-
extracting a current prototype waveform from a current frame of the speech signal;
comparing the current prototype waveform to a past prototype waveform from a past frame of the speech signal, wherein a set of rotational parameters is determined that modifies the past prototype waveform to approximate the current prototype waveform and a set of difference parameters is determined that describes the difference between the modified past prototype waveform and the current prototype waveform;
transmitting the set of rotational parameters and the set of difference parameters instead of the current prototype waveform to a receiver; and
reconstructing the current prototype waveform from the received set of rotational parameters, the set of difference parameters, and a previously reconstructed past prototype waveform.
-
-
26. An apparatus for decoding a quasi-periodic speech signal that was transmitted from a transmission source to a receiver, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, the apparatus comprising:
-
a decoder for forming a reconstructed current prototype based on a first set of parameters, a second set of parameters, and a reconstructed previous prototype, wherein the first set of parameters describe how to modify a previous prototype such that said modified previous prototype approximates a current prototype, and the second set of parameters describe one or more codevectors from a first codebook, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype; and
a period interpolator for interpolating over the region between said reconstructed current prototype and said reconstructed previous prototype to form an interpolated residual signal and for synthesizing an output speech signal based on said interpolated residual signal.
-
-
27. An apparatus for coding a quasi-periodic speech signal, wherein the speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter, and wherein the residual signal is divided into frames of data, comprising:
-
an extraction module for extracting a current prototype from a current frame of the residual signal and a previous prototype from a previous frame;
a first circular LPC synthesis filter, coupled to receive said current prototype and to output a target signal;
a warping filter, coupled to receive said previous prototype, wherein said warping filter outputs a warped previous prototype having a length equal to the length of said current prototype;
a second circular LPC synthesis filter, coupled to receive said warped previous prototype, wherein said second circular LPC synthesis filter outputs a filtered warped previous prototype;
a rotational correlator for calculating an optimum rotation and a first optimum gain, wherein said filtered warped previous prototype rotated by said optimum rotation and scaled by said first optimum gain best approximates said target signal; and
a multi-stage codebook for generating one or more codevectors, wherein said codevectors when summed approximate the difference between said current prototype and said modified previous prototype, and wherein said codevectors are described by a second set of parameters.
-
Specification