Waveform interpolation speech coding using splines
Abstract
A low-complexity method and apparatus for performing waveform interpolation in a low bit-rate WI speech decoder, wherein interpolation between received waveforms is performed with use of spline coefficients generated based thereupon. Specifically, two signals are received from a WI encoder, each comprising a set of frequency domain parameters representing a speech signal segment of a corresponding pitch period. Then, spline coefficients are generated from each of the received signals, wherein each set of spline coefficients comprises a spline representation of a time domain transformation of the corresponding set of frequency domain parameters. Finally, the decoder interpolates between the spline representations to generate interpolated time domain data which is used to synthesize a reconstructed speech signal. In certain embodiments of the present invention, the time scale of at least one of the spline representations is modified to enable the interpolation therebetween. Also, in accordance with one illustrative embodiment of the present invention, a cubic spline representation is used, while in accordance with another illustrative embodiment, a novel variant of a cardinal spline representation is advantageously employed.
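The decoder flow summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function names (`cycle_from_harmonics`, `periodic_cubic`, `eval_spline`, `synthesize`) are hypothetical, the frequency domain parameters are assumed to be complex harmonic amplitudes with no DC term, a simple periodic Catmull-Rom cubic stands in for the cubic and cardinal spline constructions of the specification, and the linear cross-fade with a linearly varying pitch period is an assumed interpolation rule.

```python
import cmath
import math

def cycle_from_harmonics(coeffs, n):
    """Time domain transformation: sample one pitch cycle at n uniform points.
    coeffs[k] is the assumed complex amplitude of harmonic k + 1 (no DC term)."""
    cycle = []
    for i in range(n):
        phase = 2.0 * math.pi * i / n
        cycle.append(sum((c * cmath.exp(1j * (k + 1) * phase)).real
                         for k, c in enumerate(coeffs)))
    return cycle

def periodic_cubic(samples):
    """Per-interval cubic coefficients (a, b, c, d) of a periodic Catmull-Rom
    spline. Stand-in spline, not the construction used in the source."""
    n = len(samples)
    segments = []
    for i in range(n):
        p0, p1 = samples[i], samples[(i + 1) % n]
        m0 = 0.5 * (p1 - samples[i - 1])          # tangent at node i (wraps around)
        m1 = 0.5 * (samples[(i + 2) % n] - p0)    # tangent at node i + 1
        segments.append((p0, m0,
                         3.0 * (p1 - p0) - 2.0 * m0 - m1,
                         2.0 * (p0 - p1) + m0 + m1))
    return segments

def eval_spline(segments, phase):
    """Evaluate at a normalized phase (one cycle = 1.0). Working in phase
    rather than samples is what puts both cycles on a common time scale."""
    n = len(segments)
    x = (phase % 1.0) * n
    i = int(x)
    u = x - i
    a, b, c, d = segments[i % n]
    return a + u * (b + u * (c + u * d))

def synthesize(seg1, pitch1, seg2, pitch2, n_out):
    """Cross-fade from the first spline to the second over n_out samples while
    the pitch period moves linearly from pitch1 to pitch2 (assumed rule)."""
    out, phase = [], 0.0
    for t in range(n_out):
        w = t / n_out
        out.append((1.0 - w) * eval_spline(seg1, phase)
                   + w * eval_spline(seg2, phase))
        phase += 1.0 / ((1.0 - w) * pitch1 + w * pitch2)
    return out

# Usage: two received cycles with different pitch periods (values are made up).
first = periodic_cubic(cycle_from_harmonics([1.0, 0.5j], 40))
second = periodic_cubic(cycle_from_harmonics([0.8, 0.2], 50))
speech = synthesize(first, 40.0, second, 50.0, 160)
```

Because each spline is evaluated at a continuous phase, the two cycles never need to be resampled to a common length before they are blended, which is one plausible reading of the low-complexity claim in the abstract.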
20 Claims
1. A method of synthesizing a reconstructed speech signal based on encoded signals communicated via a communications channel, the method comprising the steps of:
- receiving at least two communicated signals, including a first communicated signal comprising a first set of frequency domain parameters representing a first speech signal segment of a length equal to a first pitch-period and a second communicated signal comprising a second set of frequency domain parameters representing a second speech signal segment of a length equal to a second pitch-period;
- generating at least two sets of spline coefficients, including a first set of spline coefficients which comprises a spline representation of a time domain transformation of the first set of frequency domain parameters and a second set of spline coefficients which comprises a spline representation of a time domain transformation of the second set of frequency domain parameters, wherein the spline representations are based on cardinal spline representations;
- synthesizing the reconstructed signal by interpolating between the spline representation of the time domain transformation of the first set of frequency domain parameters and the spline representation of the time domain transformation of the second set of frequency domain parameters.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
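To make the "cardinal spline representations" limitation more concrete, the following sketch computes the coefficients of a standard periodic cardinal cubic B-spline for one pitch cycle and evaluates it at arbitrary positions. It assumes the textbook cardinal cubic B-spline, not the novel variant mentioned in the abstract, and the names `bspline3`, `cardinal_coeffs`, and `eval_cardinal` are hypothetical. The coefficients are found with a direct O(N^2) DFT purely for brevity.

```python
import cmath
import math

def bspline3(x):
    """Centered cubic B-spline basis function; zero for |x| >= 2."""
    x = abs(x)
    if x < 1.0:
        return 2.0 / 3.0 - x * x + 0.5 * x * x * x
    if x < 2.0:
        return (2.0 - x) ** 3 / 6.0
    return 0.0

def cardinal_coeffs(samples):
    """Coefficients c with (c[k-1] + 4*c[k] + c[k+1]) / 6 == samples[k],
    indices taken modulo N (one periodic pitch cycle)."""
    n = len(samples)
    # Forward DFT of the time domain cycle.
    spec = [sum(samples[j] * cmath.exp(-2j * math.pi * m * j / n)
                for j in range(n)) for m in range(n)]
    # Divide by the DFT of the sampled basis [1/6, 4/6, 1/6]; it is >= 1/3.
    spec = [s / ((4.0 + 2.0 * math.cos(2.0 * math.pi * m / n)) / 6.0)
            for m, s in enumerate(spec)]
    # Inverse DFT; the result is real for real input.
    return [sum(spec[m] * cmath.exp(2j * math.pi * m * j / n)
                for m in range(n)).real / n for j in range(n)]

def eval_cardinal(coeffs, x):
    """Evaluate the periodic spline at real-valued position x (in samples)."""
    n = len(coeffs)
    i = math.floor(x)
    # Only the four basis functions centered at i-1 .. i+2 overlap x.
    return sum(coeffs[k % n] * bspline3(x - k) for k in range(i - 1, i + 3))

# Usage: represent a made-up 8-sample cycle, then read it out on a finer grid.
cycle = [math.sin(2.0 * math.pi * j / 8) for j in range(8)]
coeffs = cardinal_coeffs(cycle)
resampled = [eval_cardinal(coeffs, 8.0 * i / 20) for i in range(20)]
```

Evaluating the same coefficients at positions scaled by a different cycle length, as in the last line, is one simple way to modify the time scale of a spline representation before interpolating between two cycles.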
11. A speech decoder which synthesizes a reconstructed speech signal based on encoded signals communicated via a communications channel, the decoder comprising:
- a signal receiver which receives at least two communicated signals, including a first communicated signal comprising a first set of frequency domain parameters representing a first speech signal segment of a length equal to a first pitch-period and a second communicated signal comprising a second set of frequency domain parameters representing a second speech signal segment of a length equal to a second pitch-period;
- a spline coefficient generator which generates at least two sets of spline coefficients, including a first set of spline coefficients which comprises a spline representation of a time domain transformation of the first set of frequency domain parameters and a second set of spline coefficients which comprises a spline representation of a time domain transformation of the second set of frequency domain parameters, wherein the spline representations are based on cardinal spline representations;
- a signal synthesizer which synthesizes the reconstructed signal by interpolating between the spline representation of the time domain transformation of the first set of frequency domain parameters and the spline representation of the time domain transformation of the second set of frequency domain parameters.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification