Multimode speech encoder and decoder apparatuses
Abstract
The present invention relates to a low bit rate speech coding apparatus which performs coding on a speech signal for transmission, for example, in a mobile communication system. Excitation information is coded in multimode using both static and dynamic characteristics of quantized vocal tract parameters. Decoding includes postprocessing in multimode, thereby improving the quality of both unvoiced speech regions and stationary noise regions of the transmitted speech signal.
28 Claims
1. A multimode speech coding apparatus comprising:
first coding means for coding an LSP parameter indicative of vocal tract information contained in a speech signal;
second coding means for coding at least one type of parameter indicative of vocal tract information contained in the speech signal with a plurality of modes;
dynamic characteristic extracting means for extracting a dynamic characteristic of a quantized LSP parameter coded in said first coding means, said quantized LSP parameter being indicative of a spectral characteristic of a speech;
mode switching means for switching a coding mode of said second coding means based on said dynamic characteristic; and
synthesis means for synthesizing an input speech signal incorporating and using a plurality of types of parameter information coded in said first coding means and said second coding means, wherein said second coding means comprises coding means for coding an excitation vector with a plurality of coding modes, said mode switching means switches said coding mode of said second coding means using said quantized LSP parameter indicative of a spectral characteristic of a speech, whereby information concerning said coding mode is not explicitly included in the synthesized input speech signal.
means for calculating a difference between frames of said quantized LSP parameter;
means for calculating an average quantized LSP parameter in a frame in which said quantized LSP parameter is stationary; and
means for calculating a distance between said average quantized LSP parameter and a current quantized LSP parameter.
5. A multimode speech decoding apparatus comprising:
first decoding means for decoding a quantized LSP parameter indicative of vocal tract information contained in a speech signal;
second decoding means for decoding at least one type of parameter indicative of vocal tract information contained in the speech signal with a plurality of decoding modes;
mode switching means for switching a decoding mode of said second decoding means based on a dynamic characteristic of the LSP parameter decoded in said first decoding means;
synthesis means for decoding the speech signal using a plurality of types of parameter information decoded in said first decoding means and said second decoding means; and
postprocessing means for performing postprocessing on the decoded speech signal based on the decoding mode, wherein said second decoding means comprises decoding means for decoding an excitation vector with a plurality of decoding modes, and said mode switching means switches the decoding mode of said second decoding means using the quantized LSP parameter indicative of a spectral characteristic of a speech included in the speech signal.
judging means for judging whether or not a region is a speech interval using the decoded LSP parameter;
FFT processing means for performing Fast Fourier Transform processing on a signal;
spectral phase randomizing means for randomizing a spectral phase obtained by said Fast Fourier Transform processing corresponding to a judged result by said judging means;
spectral amplitude smoothing means for smoothing a spectral amplitude obtained by said Fast Fourier Transform processing corresponding to the judged result; and
IFFT processing means for performing Inverse Fast Fourier Transform processing on the spectral phase randomized by said spectral phase randomizing means and the spectral amplitude smoothed by said spectral amplitude smoothing means.
10. A quantized-LSP-parameter dynamic characteristic extractor comprising:
means for calculating an evolution of a quantized LSP parameter between frames;
means for calculating an average quantized LSP parameter in a frame in which the quantized LSP parameter is stationary; and
means for calculating an evolution between said average quantized LSP parameter and a current quantized LSP parameter.
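The three means recited in claim 10 can be sketched as a small stateful extractor. This is an illustrative sketch only, not the patented implementation: the class name `LspDynamicExtractor`, the use of squared Euclidean distance as the "evolution" measure, the leaky running average, and the values of `stationary_threshold` and `smoothing` are all assumptions that do not appear in the claims.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two LSP vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class LspDynamicExtractor:
    """Tracks frame-to-frame and long-term change of a quantized LSP vector."""

    def __init__(self, stationary_threshold=1e-3, smoothing=0.9):
        self.stationary_threshold = stationary_threshold  # hypothetical value
        self.smoothing = smoothing                        # hypothetical value
        self.prev_lsp = None  # quantized LSP of the previous frame
        self.avg_lsp = None   # average LSP over stationary frames

    def update(self, lsp):
        """Return (frame_delta, avg_distance) for the current frame."""
        lsp = list(lsp)
        # 1) Evolution of the quantized LSP between frames.
        if self.prev_lsp is None:
            frame_delta = 0.0
        else:
            frame_delta = squared_distance(lsp, self.prev_lsp)
        # 2) Average quantized LSP, updated only when the frame is stationary.
        if self.avg_lsp is None:
            self.avg_lsp = lsp[:]
        elif frame_delta < self.stationary_threshold:
            step = 1.0 - self.smoothing
            self.avg_lsp = [a + step * (x - a)
                            for a, x in zip(self.avg_lsp, lsp)]
        # 3) Evolution between the average and the current quantized LSP.
        avg_distance = squared_distance(lsp, self.avg_lsp)
        self.prev_lsp = lsp
        return frame_delta, avg_distance
```

Because the extractor reads only quantized LSP values, an encoder and a decoder running the same code on the same bitstream would obtain identical features, which is what allows the mode to be derived at both ends rather than transmitted.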
11. A quantized-LSP-parameter static characteristic extractor comprising:
means for calculating linear prediction residual power using a quantized LSP parameter; and
means for calculating a region between neighboring orders of the quantized LSP parameter.
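A rough sketch of the two static features in claim 11 follows. It rests on several assumptions not stated in the claims: the LSPs are given as ascending angular frequencies in (0, π) with an even order, the "region between neighboring orders" is read as the spacing between adjacent LSP values, and the residual power is the normalized prediction-error power obtained from reflection coefficients. The function names are hypothetical.

```python
import math


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def lsp_to_lpc(lsp):
    """Convert LSP frequencies to A(z) = 1 + a1 z^-1 + ... + ap z^-p."""
    p_poly, q_poly = [1.0, 1.0], [1.0, -1.0]  # (1 + z^-1) and (1 - z^-1)
    for i, w in enumerate(lsp):
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        if i % 2 == 0:   # 1st, 3rd, ... LSP: symmetric polynomial P(z)
            p_poly = poly_mul(p_poly, factor)
        else:            # 2nd, 4th, ... LSP: antisymmetric polynomial Q(z)
            q_poly = poly_mul(q_poly, factor)
    # A(z) = (P(z) + Q(z)) / 2; the z^-(p+1) terms cancel, so drop the last one.
    return [(x + y) / 2.0 for x, y in zip(p_poly, q_poly)][:-1]


def residual_power(lpc):
    """Normalized LP residual power: product of (1 - k_m^2) over all orders."""
    a = list(lpc[1:])
    power = 1.0
    for m in range(len(a), 0, -1):
        k = a[m - 1]  # reflection coefficient of order m
        power *= 1.0 - k * k
        # Step-down recursion to the order m-1 predictor.
        a = [(a[j] - k * a[m - 2 - j]) / (1.0 - k * k) for j in range(m - 1)]
    return power


def neighbor_intervals(lsp):
    """Spacing between each pair of adjacent LSP values."""
    return [hi - lo for lo, hi in zip(lsp, lsp[1:])]
```

For evenly spaced LSPs the reconstructed filter is A(z) = 1 and the normalized residual power is 1, the flat-spectrum case. Closely spaced LSP pairs indicate a spectral peak and give a smaller residual power.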
12. A multimode postprocessing apparatus comprising:
judgment means for judging whether or not a region is a speech region using a decoded LSP parameter;
FFT processing means for performing fast Fourier transform processing on a signal;
spectral phase randomizing means for randomizing a spectral phase obtained by said fast Fourier transform processing corresponding to a result judged by said judgment means;
spectral amplitude smoothing means for performing smoothing on a spectral amplitude obtained by said fast Fourier transform processing corresponding to said result; and
IFFT processing means for performing inverse fast Fourier transform on the spectral phase randomized by said spectral phase randomizing means and the spectral amplitude smoothed by said spectral amplitude smoothing means.
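The postprocessing chain of claim 12 (judgment, FFT, phase randomization, amplitude smoothing, IFFT) might be sketched as below. This is a minimal illustration under stated assumptions: a plain DFT stands in for the FFT so that the block needs no external library, speech frames are passed through unchanged, amplitude smoothing is a first-order recursion across frames with a hypothetical factor `alpha`, and the speech/non-speech decision is supplied by the caller as `is_speech`. None of these specifics come from the claims.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def postprocess_frame(frame, is_speech, prev_amp=None, alpha=0.9, rng=None):
    """Return (output_frame, amplitude_state) for one decoded frame."""
    if is_speech:
        # Assumption: speech frames pass through and leave the state untouched.
        return list(frame), prev_amp
    if rng is None:
        rng = random.Random()
    n = len(frame)
    spectrum = dft(frame)
    amp = [abs(c) for c in spectrum]
    if prev_amp is not None:
        # Smooth the spectral amplitude across consecutive non-speech frames.
        amp = [alpha * p + (1.0 - alpha) * a for p, a in zip(prev_amp, amp)]
    out = [0j] * n
    for k in range(n // 2 + 1):
        if k == 0 or 2 * k == n:
            # DC and Nyquist bins must stay real for a real-valued output.
            out[k] = complex(math.copysign(amp[k], spectrum[k].real), 0.0)
        else:
            # Random phase, mirrored to keep the spectrum conjugate-symmetric.
            out[k] = cmath.rect(amp[k], rng.uniform(-math.pi, math.pi))
            out[n - k] = out[k].conjugate()
    return idft(out), amp
```

A caller would thread the returned amplitude state into the next call, for example `out, state = postprocess_frame(frame, is_speech, state)`, so that smoothing accumulates over successive noise frames.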
15. A speech signal transmission apparatus having a speech input apparatus that converts a speech signal into an electric signal, an A/D converter that converts a signal output from the speech input apparatus into a digital signal, a multimode speech coding apparatus that codes the digital signal output from the A/D converter, an RF modulator that performs modulation processing on coded information output from the multimode speech coding apparatus, and a transmission antenna that converts a signal output from the RF modulator into radio signal to transmit, said multimode speech coding apparatus comprising:
first coding means for coding an LSP parameter indicative of vocal tract information contained in a speech signal;
second coding means for coding at least one type of parameter indicative of vocal tract information with a plurality of modes;
dynamic characteristic extracting means for extracting a dynamic characteristic of a quantized LSP parameter coded in said first coding means;
mode switching means for switching a coding mode of said second coding means based on said dynamic characteristic; and
synthesis means for synthesizing an input speech signal using a plurality of types of parameter information coded in said first coding means and said second coding means.
means for calculating a difference between frames of the quantized LSP parameter;
means for calculating an average quantized LSP parameter in a frame in which the quantized LSP parameter is stationary; and
means for calculating a distance between the average quantized LSP parameter and a current quantized LSP parameter.
17. A speech signal reception apparatus having a reception antenna that receives a radio signal, an RF demodulator that performs demodulation processing on a signal received at the reception antenna, a multimode decoding apparatus that decodes information obtained by the RF demodulator, a D/A converter that converts a digital speech signal decoded in the multimode decoding apparatus into an analog signal, and a speech output apparatus that converts an electric signal output from the D/A converter into a speech signal, said multimode decoding apparatus comprising:
first decoding means for decoding a quantized LSP parameter indicative of vocal tract information contained in a speech signal;
second decoding means for decoding at least one type of parameter indicative of vocal tract information contained in the speech signal with a plurality of decoding modes;
mode switching means for switching a decoding mode of said second decoding means based on a dynamic characteristic of the LSP parameter decoded in said first decoding means;
synthesis means for decoding the speech signal using a plurality of types of parameter information decoded in said first decoding means and said second decoding means; and
postprocessing means for performing postprocessing on the decoded speech signal based on the decoding mode.
18. A computer readable recording medium with a computer executable program recorded therein, the program comprising the procedures of:
extracting a dynamic characteristic of a quantized LSP parameter using a previous quantized LSP parameter and a current quantized LSP parameter;
judging a voiced characteristic using the dynamic characteristic of the current quantized LSP parameter; and
switching a mode of a procedure for coding an excitation vector, based on the judged result.
19. A computer readable recording medium with a computer executable program recorded therein, the program comprising the procedures of:
extracting a dynamic characteristic of a quantized LSP parameter using a previous quantized LSP parameter and a current quantized LSP parameter;
judging a voiced characteristic using the current quantized LSP parameter;
switching a mode of a procedure for decoding an excitation vector, based on the judged result; and
switching a procedure of performing postprocessing on a decoded signal, based on the judged result.
20. A multimode speech coding method for performing mode switching of a mode for coding an excitation vector, using a static characteristic and a dynamic characteristic of a quantized parameter indicative of a spectral characteristic of a speech.
21. A multimode speech decoding method for performing mode switching of a mode for decoding an excitation vector, using a static characteristic and a dynamic characteristic of a quantized parameter indicative of a spectral characteristic of a speech.
23. A quantized-LSP-parameter dynamic characteristic extracting method comprising the steps of:
calculating an evolution of a quantized LSP parameter between frames;
calculating an average quantized LSP parameter in a frame in which the quantized LSP parameter is stationary; and
calculating an evolution between said average quantized LSP parameter and a current quantized LSP parameter.
judging means for judging whether or not a region is a speech interval using the decoded LSP parameter;
FFT processing means for performing Fast Fourier Transform processing on a signal;
spectral phase randomizing means for randomizing a spectral phase obtained by said Fast Fourier Transform processing corresponding to a judged result by said judging means;
spectral amplitude smoothing means for smoothing a spectral amplitude obtained by said Fast Fourier Transform processing corresponding to the judged result; and
IFFT processing means for performing Inverse Fast Fourier Transform processing on the spectral phase randomized by said spectral phase randomizing means and the spectral amplitude smoothed by said spectral amplitude smoothing means.
25. A quantized-LSP-parameter static characteristic extracting method comprising the steps of:
calculating linear prediction residual power using a quantized LSP parameter; and
calculating a region between neighboring orders of the quantized LSP parameter.
26. A multimode postprocessing method comprising:
the judgment step of judging whether or not a region is a speech region using a decoded LSP parameter;
the FFT processing step of performing fast Fourier transform processing on a signal;
the spectral phase randomizing step of randomizing a spectral phase obtained by said fast Fourier transform processing corresponding to a result determined by said judgment step;
the spectral amplitude smoothing step of performing smoothing on a spectral amplitude obtained by said fast Fourier transform processing corresponding to said result; and
the IFFT processing step of performing inverse fast Fourier transform on the spectral phase randomized by said spectral phase randomizing step and the spectral amplitude smoothed by said spectral amplitude smoothing step.
27. A multimode speech coding apparatus comprising:
first coding means for coding vocal tract information contained in a speech signal; and
second coding means for coding excitation information contained in the speech signal, said second coding means having a plurality of coding modes;
wherein each of said plurality of coding modes is determined using a variation in the information coded in said first coding means, each of said plurality of coding modes comprises a non-speech interval mode and a speech interval mode, each said speech interval mode comprises a voiced interval mode and an unvoiced interval mode and coding is performed separately to a voiced region and an unvoiced region separated from the speech interval.
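The mode hierarchy in claim 27 (non-speech versus speech, then voiced versus unvoiced within speech) can be illustrated with a small decision function. This is a hypothetical sketch: the feature names, the ordering of the tests, the assignment of dynamic features to the speech decision and static features to the voicing decision, and every threshold value are assumptions, not values taken from the patent.

```python
from enum import Enum


class Mode(Enum):
    NON_SPEECH = "non-speech"
    UNVOICED = "unvoiced"
    VOICED = "voiced"


def select_mode(frame_delta, avg_distance, residual_power, min_lsp_interval,
                delta_thr=1e-3, dist_thr=5e-3,
                power_thr=0.1, interval_thr=0.08):
    """Pick an excitation coding mode from quantized-LSP features only.

    All threshold defaults are hypothetical placeholders.
    """
    # Dynamic features: an LSP that barely moves and stays near its long-term
    # average is treated as a non-speech (for example stationary noise) frame.
    is_speech = frame_delta > delta_thr or avg_distance > dist_thr
    if not is_speech:
        return Mode.NON_SPEECH
    # Static features: a low normalized residual power or a narrow LSP spacing
    # suggests a strongly peaked spectrum, treated here as voiced.
    is_voiced = residual_power < power_thr or min_lsp_interval < interval_thr
    return Mode.VOICED if is_voiced else Mode.UNVOICED
```

Since every input is derived from the quantized LSP parameter, a decoder can evaluate the same function on the decoded LSP and reach the same mode, consistent with the recitation in claim 1 that mode information is not explicitly included in the coded signal.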
Specification