Waveform interpolation speech coding using splines

US 5,903,866 A
Filed: 03/10/1997
Issued: 05/11/1999
Est. Priority Date: 03/10/1997
Status: Expired due to Term

Abstract

A low-complexity method and apparatus for performing waveform interpolation in a low bit-rate WI speech decoder, wherein interpolation between received waveforms is performed with use of spline coefficients generated based thereupon. Specifically, two signals are received from a WI encoder, each comprising a set of frequency domain parameters representing a speech signal segment of a corresponding pitch period. Then, spline coefficients are generated from each of the received signals, wherein each set of spline coefficients comprises a spline representation of a time domain transformation of the corresponding set of frequency domain parameters. Finally, the decoder interpolates between the spline representations to generate interpolated time domain data which is used to synthesize a reconstructed speech signal. In certain embodiments of the present invention, the time scale of at least one of the spline representations is modified to enable the interpolation therebetween. Also, in accordance with one illustrative embodiment of the present invention, a cubic spline representation is used, while in accordance with another illustrative embodiment, a novel variant of a cardinal spline representation is advantageously employed.
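
The interpolation step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: it uses a periodic Catmull-Rom spline, one common member of the cardinal-spline family whose coefficients are simply the waveform samples, as a stand-in for the variant the patent refers to. The function names `catmull_rom_periodic` and `interpolate_cycles`, the normalized `phase` argument, and the linear blend `weight` are all assumptions introduced for this example.

```python
import math


def catmull_rom_periodic(samples, phase):
    """Evaluate a periodic Catmull-Rom spline through `samples`.

    `phase` is a normalized position within one pitch cycle (1.0 = one
    full cycle). Normalizing the time scale this way lets cycles with
    different numbers of samples be evaluated at the same point.
    """
    n = len(samples)
    pos = (phase % 1.0) * n
    base = math.floor(pos)
    i = int(base) % n
    t = pos - base
    # Four neighbouring samples, wrapped around the cycle boundary.
    p0 = samples[(i - 1) % n]
    p1 = samples[i]
    p2 = samples[(i + 1) % n]
    p3 = samples[(i + 2) % n]
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t ** 3
    )


def interpolate_cycles(cycle_a, cycle_b, phase, weight):
    """Blend two spline-represented cycles at a common normalized phase.

    `weight` runs from 0.0 (only cycle_a) to 1.0 (only cycle_b); the
    linear cross-fade is an assumption of this sketch.
    """
    a = catmull_rom_periodic(cycle_a, phase)
    b = catmull_rom_periodic(cycle_b, phase)
    return (1.0 - weight) * a + weight * b


first = [0.0, 1.0, 0.0, -1.0]                        # 4-sample cycle
second = [0.0, 0.0, 3.0, 0.0, 0.0, 0.0, -3.0, 0.0]   # 8-sample cycle
print(interpolate_cycles(first, second, 0.25, 0.5))  # prints 2.0
```

Because the spline passes through its own samples, evaluating at a sample position returns that sample exactly, which is why the midpoint blend of 1.0 and 3.0 above comes out as 2.0.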

20 Claims

1. A method of synthesizing a reconstructed speech signal based on encoded signals communicated via a communications channel, the method comprising the steps of:
- receiving at least two communicated signals, including a first communicated signal comprising a first set of frequency domain parameters representing a first speech signal segment of a length equal to a first pitch-period and a second communicated signal comprising a second set of frequency domain parameters representing a second speech signal segment of a length equal to a second pitch-period;
- generating at least two sets of spline coefficients, including a first set of spline coefficients which comprises a spline representation of a time domain transformation of the first set of frequency domain parameters and a second set of spline coefficients which comprises a spline representation of a time domain transformation of the second set of frequency domain parameters, wherein the spline representations are based on cardinal spline representations;
- synthesizing the reconstructed signal by interpolating between the spline representation of the time domain transformation of the first set of frequency domain parameters and the spline representation of the time domain transformation of the second set of frequency domain parameters.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method of claim 1 wherein the spline representations have a finite support basis function.
  - 3. The method of claim 2 wherein the spline representations comprise samples of the time domain transformation corresponding thereto.
  - 4. The method of claim 1 wherein the first pitch period and the second pitch period are unequal and wherein the step of synthesizing the reconstructed signal comprises the step of modifying the time scale of at least the spline representation of the time domain transformation of the second set of frequency domain parameters.
  - 5. The method of claim 1 further comprising the step of performing an inverse transform on the first and second sets of frequency domain parameters to produce corresponding first and second sets of time domain parameters, and wherein the generating step is based on said first and second sets of time domain parameters.
  - 6. The method of claim 5 further comprising the step of zero-padding the first and second sets of frequency domain parameters to a fixed radix-2 size prior to the step of performing said inverse transform.
  - 7. The method of claim 6 wherein said inverse transform comprises an IFFT.
  - 8. The method of claim 1 wherein the step of synthesizing the reconstructed signal comprises the steps of:
    - generating a set of interpolated spline coefficients which comprises a spline representation of a continuous time domain signal; and
    - generating the reconstructed signal based on the set of interpolated spline coefficients.
  - 9. The method of claim 8 wherein the reconstructed signal is generated by sampling the continuous time domain signal at a non-uniform rate.
  - 10. The method of claim 9 wherein the non-uniform rate is determined based on the first and second pitch periods.
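
Claims 5 to 7 describe zero-padding the frequency domain parameters to a fixed radix-2 size and applying an IFFT. A minimal sketch of that step is shown below under several assumptions that do not come from the claims: the parameters are taken to be complex amplitudes of harmonics 1..K of one pitch cycle, the DC term is taken to be zero, and the scaling is chosen so that an amplitude of 1 yields a unit cosine. The names `ifft_radix2`, `harmonics_to_cycle`, and the default `size=64` are hypothetical.

```python
import cmath
import math


def ifft_radix2(spectrum):
    """Unnormalized recursive radix-2 inverse DFT.

    len(spectrum) must be a power of two; the caller applies the 1/N scale.
    """
    n = len(spectrum)
    if n == 1:
        return list(spectrum)
    even = ifft_radix2(spectrum[0::2])
    odd = ifft_radix2(spectrum[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def harmonics_to_cycle(harmonics, size=64):
    """Zero-pad harmonic amplitudes to `size` bins and return one real cycle.

    harmonics[k - 1] is the assumed complex amplitude of harmonic k, so the
    result is sum_k Re(A_k * exp(2j*pi*k*n/size)) for n = 0..size-1.
    """
    count = len(harmonics)
    if size & (size - 1) != 0 or size <= 2 * count:
        raise ValueError("size must be a power of two larger than 2 * len(harmonics)")
    spectrum = [0j] * size  # unused bins stay zero: this is the zero-padding
    for k, amplitude in enumerate(harmonics, start=1):
        spectrum[k] = amplitude * size / 2
        spectrum[size - k] = amplitude.conjugate() * size / 2  # Hermitian mirror
    return [value.real / size for value in ifft_radix2(spectrum)]


cycle = harmonics_to_cycle([1 + 0j], size=8)
print([round(x, 6) for x in cycle])
```

With a fixed output size, every pitch cycle yields the same number of time-domain samples regardless of how many harmonics were received, which keeps the later spline stage uniform.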

11. A speech decoder which synthesizes a reconstructed speech signal based on encoded signals communicated via a communications channel, the decoder comprising:
- a signal receiver which receives at least two communicated signals, including a first communicated signal comprising a first set of frequency domain parameters representing a first speech signal segment of a length equal to a first pitch-period and a second communicated signal comprising a second set of frequency domain parameters representing a second speech signal segment of a length equal to a second pitch-period;
- a spline coefficient generator which generates at least two sets of spline coefficients, including a first set of spline coefficients which comprises a spline representation of a time domain transformation of the first set of frequency domain parameters and a second set of spline coefficients which comprises a spline representation of a time domain transformation of the second set of frequency domain parameters, wherein the spline representations are based on cardinal spline representations;
- a signal synthesizer which synthesizes the reconstructed signal by interpolating between the spline representation of the time domain transformation of the first set of frequency domain parameters and the spline representation of the time domain transformation of the second set of frequency domain parameters.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19, 20):
  - 12. The decoder of claim 11 wherein the spline representations have a finite support basis function.
  - 13. The decoder of claim 12 wherein the spline representations comprise samples of the time domain transformation corresponding thereto.
  - 14. The decoder of claim 11 wherein the first pitch period and the second pitch period are unequal and wherein the signal synthesizer comprises means for modifying the time scale of at least the spline representation of the time domain transformation of the second set of frequency domain parameters.
  - 15. The decoder of claim 11 further comprising an inverse transform performed on the first and second sets of frequency domain parameters to produce corresponding first and second sets of time domain parameters, and wherein the spline coefficient generator is based on said first and second sets of time domain parameters.
  - 16. The decoder of claim 15 further comprising means for zero-padding the first and second sets of frequency domain parameters to a fixed radix-2 size for use by said inverse transform.
  - 17. The decoder of claim 16 wherein said inverse transform comprises an IFFT.
  - 18. The decoder of claim 11 wherein the signal synthesizer comprises:
    - means for generating a set of interpolated spline coefficients which comprises a spline representation of a continuous time domain signal; and
    - means for generating the reconstructed signal based on the set of interpolated spline coefficients.
  - 19. The decoder of claim 18 wherein the reconstructed signal is generated by sampling the continuous time domain signal at a non-uniform rate.
  - 20. The decoder of claim 19 wherein the non-uniform rate is determined based on the first and second pitch periods.
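
Claims 19 and 20 (like claims 9 and 10) recite sampling the continuous time domain signal at a non-uniform rate determined by the two pitch periods. One plausible reading, sketched below, is that the pitch period varies across the output interval and each output sample advances the position within the normalized cycle by the reciprocal of the current period. The linear pitch contour, the function name `phase_track`, and its parameters are assumptions of this example and are not stated in the claims.

```python
def phase_track(period_a, period_b, n_out, start_phase=0.0):
    """Return the normalized cycle phase for each of `n_out` output samples.

    The pitch period (in samples) is assumed to move linearly from
    `period_a` to `period_b` over the interval. Each output sample advances
    the phase by 1 / period, so the sampling positions on the normalized
    cycle are unevenly spaced whenever the two periods differ.
    """
    phases = []
    phase = start_phase
    for n in range(n_out):
        phases.append(phase % 1.0)
        frac = n / n_out
        period = (1.0 - frac) * period_a + frac * period_b
        phase += 1.0 / period
    return phases


track = phase_track(40.0, 80.0, 10)
steps = [b - a for a, b in zip(track, track[1:])]
print(steps[0] > steps[-1])  # prints True: steps shrink as the period grows
```

The resulting phases are the points at which a continuous, spline-represented cycle would be evaluated to produce uniformly spaced output samples.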

Current Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Shoham, Yair
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Sax, Robert Louis

Application Number: US08/814,075
Time in Patent Office: 792 Days
Field of Search: 704/265
US Class Current: 704/265
CPC Class Codes: G10L 19/097 using prototype waveform de...
