Multimode speech encoder and decoder apparatuses

US 6,334,105 B1
Filed: 04/18/2000
Issued: 12/25/2001
Est. Priority Date: 08/21/1998
Status: Expired due to Term

Abstract

The present invention relates to a low bit rate speech coding apparatus which performs coding on a speech signal for transmission, for example, in a mobile communication system. Excitation information is coded in multimode using both static and dynamic characteristics of quantized vocal tract parameters. Decoding includes postprocessing in multimode, thereby improving the quality of both unvoiced speech regions and stationary noise regions of the transmitted speech signal.

28 Claims

1. A multimode speech coding apparatus comprising:
- first coding means for coding an LSP parameter indicative of vocal tract information contained in a speech signal;
  
  second coding means for coding at least one type of parameter indicative of vocal tract information contained in the speech signal with a plurality of modes;
  
  dynamic characteristic extracting means for extracting a dynamic characteristic of a quantized LSP parameter coded in said first coding means, said quantized LSP parameter being indicative of a spectral characteristic of a speech;
  
  mode switching means for switching a coding mode of said second coding means based on said dynamic characteristic; and
  
  synthesis means for synthesizing an input speech signal incorporating and using a plurality of types of parameter information coded in said first coding means and said second coding means, wherein said second coding means comprises coding means for coding an excitation vector with a plurality of coding modes, said mode switching means switches said coding mode of said second coding means using said quantized LSP parameter indicative of a spectral characteristic of a speech, whereby information concerning said coding mode is not explicitly included in the synthesized input speech signal.
  - 2. The multimode speech coding apparatus according to claim 1, wherein said mode switching means switches the coding mode of said second coding means using a static characteristic and a dynamic characteristic of the quantized LSP parameter.
  - 3. The multimode speech coding apparatus according to claim 1, wherein said mode switching means comprises means for judging stationarity of the quantized LSP parameter using a previous quantized LSP parameter and a current quantized LSP parameter, and means for judging a voiced characteristic using the current quantized LSP parameter, and based on judged results, switches the coding mode of said second coding means.
  - 4. The multimode speech coding apparatus according to claim 1, wherein said dynamic characteristic extracting means comprises:
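
Claims 1 to 3 describe switching the excitation coding mode from the quantized LSP parameter alone, so a decoder holding the same quantized values could repeat the decision without receiving mode bits. A minimal Python sketch of that idea follows. The function names, the thresholds, the squared-distance measure, the use of the smallest LSP gap as a voicing cue, and the three mode labels are all illustrative assumptions, not details taken from the claims.

```python
import math

def lsp_evolution(prev_lsp, cur_lsp):
    """Squared distance between two quantized LSP vectors (assumed measure)."""
    return sum((c - p) ** 2 for p, c in zip(prev_lsp, cur_lsp))

def is_voiced(cur_lsp, gap_threshold=0.1):
    """Assumed voicing cue: a narrow gap between neighboring LSPs
    marks a sharp spectral peak, which is typical of voiced speech."""
    gaps = [b - a for a, b in zip(cur_lsp, cur_lsp[1:])]
    return min(gaps) < gap_threshold

def select_mode(prev_lsp, cur_lsp, evolution_threshold=0.01, gap_threshold=0.1):
    """Pick a mode from quantized LSPs only; thresholds are hypothetical."""
    stationary = lsp_evolution(prev_lsp, cur_lsp) < evolution_threshold
    if is_voiced(cur_lsp, gap_threshold):
        return "voiced"
    return "non_speech" if stationary else "unvoiced"

# Hypothetical 10th-order frames, LSPs in radians on (0, pi).
flat = [i * math.pi / 11 for i in range(1, 11)]
peaky = [flat[0], flat[0] + 0.05] + flat[2:]   # two LSPs pulled close together
shifted = [w + 0.1 for w in flat]              # whole spectrum moved between frames

encoder_mode = select_mode(flat, peaky)
decoder_mode = select_mode(flat, peaky)  # same quantized inputs, same decision
```

Because both sides call the same function on the same quantized values, the two decisions agree by construction, which is the property the "whereby" clause of claim 1 relies on.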

5. A multimode speech decoding apparatus comprising:
- first decoding means for decoding a quantized LSP parameter indicative of vocal tract information contained in a speech signal;
  
  second decoding means for decoding at least one type of parameter indicative of vocal tract information contained in the speech signal with a plurality of decoding modes;
  
  mode switching means for switching a decoding mode of said second decoding means based on a dynamic characteristic of the LSP parameter decoded in said first decoding means;
  
  synthesis means for decoding the speech signal using a plurality of types of parameter information decoded in said first decoding means and said second decoding means; and
  
  postprocessing means for performing postprocessing on the decoded speech signal based on the decoding mode, wherein said second decoding means comprises decoding means for decoding an excitation vector with a plurality of decoding modes, and said mode switching means switches the decoding mode of said second decoding means using the quantized LSP parameter indicative of a spectral characteristic of a speech included in the speech signal.
  - 6. The multimode speech decoding apparatus according to claim 5, wherein said mode switching means switches the decoding mode of said second decoding means using a static characteristic and a dynamic characteristic of the quantized LSP parameter indicative of the spectral characteristic of the speech.
  - 7. The multimode speech decoding apparatus according to claim 6, wherein said mode switching means comprises means for judging stationarity of the quantized LSP parameter using a previous quantized LSP parameter and a current quantized LSP parameter, and means for judging a voiced characteristic using the current quantized LSP parameter, and based on judged results, switches the decoding mode of said second decoding means.
  - 8. The multimode speech decoding apparatus according to claim 7, wherein said apparatus switches postprocessing for a decoded signal based on said results.
  - 9. The multimode speech decoding apparatus according to claim 5, wherein said postprocessing means comprises:

10. A quantized-LSP-parameter dynamic characteristic extractor comprising:
- means for calculating an evolution of a quantized LSP parameter between frames;
  
  means for calculating an average quantized LSP parameter in a frame in which the quantized LSP parameter is stationary; and
  
  means for calculating an evolution between said average quantized LSP parameter and a current quantized LSP parameter.
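
The three means of claim 10 can be sketched as a small stateful extractor. This is only an illustration under stated assumptions: the class name, the squared-distance measure of "evolution", the stationarity threshold, and the exponential running average are hypothetical choices, since the claim does not fix any of them.

```python
class DynamicFeatureExtractor:
    """Tracks inter-frame LSP evolution and the distance to a stationary average."""

    def __init__(self, stationary_threshold=0.01, averaging=0.9):
        self.stationary_threshold = stationary_threshold  # assumed value
        self.averaging = averaging                        # assumed smoothing factor
        self.prev_lsp = None
        self.avg_lsp = None

    @staticmethod
    def _distance(a, b):
        # Squared distance stands in for "evolution" (an assumption).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def update(self, lsp):
        """Return (frame_evolution, average_evolution) for one quantized LSP frame."""
        if self.prev_lsp is None:
            # First frame: nothing to compare against yet.
            self.prev_lsp = list(lsp)
            self.avg_lsp = list(lsp)
            return 0.0, 0.0
        # 1. Evolution between the previous and the current frame.
        frame_evolution = self._distance(self.prev_lsp, lsp)
        # 3. Evolution between the stationary average and the current frame,
        #    measured before the average absorbs the current frame.
        average_evolution = self._distance(self.avg_lsp, lsp)
        # 2. Update the average only in frames judged stationary.
        if frame_evolution < self.stationary_threshold:
            a = self.averaging
            self.avg_lsp = [a * m + (1 - a) * x for m, x in zip(self.avg_lsp, lsp)]
        self.prev_lsp = list(lsp)
        return frame_evolution, average_evolution
```

Feeding the extractor a run of near-identical frames followed by a sudden change leaves the average untouched on the changed frame, so the second returned value stays large until the signal settles again.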

11. A quantized-LSP-parameter static characteristic extractor comprising:
- means for calculating linear prediction residual power using a quantized LSP parameter; and
  
  means for calculating a region between neighboring orders of the quantized LSP parameter.
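
As a rough illustration of claim 11, the sketch below computes both quantities from one LSP vector. It reads "region between neighboring orders" as the interval between adjacent LSP frequencies, which is an interpretation rather than something the claim states. The residual power is obtained by converting the LSPs to linear prediction coefficients and running the standard step-down recursion, which yields the normalized value as the product of (1 - k^2) over the reflection coefficients k. The function names are hypothetical, and the code assumes an even order with strictly increasing LSPs in radians on (0, pi).

```python
import math

def _poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def lsp_to_lpc(lsp):
    """Convert LSP frequencies to A(z) = 1 + a1 z^-1 + ... (even order assumed)."""
    p_poly, q_poly = [1.0, 1.0], [1.0, -1.0]   # fixed roots at z = -1 and z = +1
    for i, w in enumerate(lsp):
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        if i % 2 == 0:
            p_poly = _poly_mul(p_poly, factor)  # 1st, 3rd, ... LSPs belong to P(z)
        else:
            q_poly = _poly_mul(q_poly, factor)  # 2nd, 4th, ... LSPs belong to Q(z)
    a = [(p + q) / 2.0 for p, q in zip(p_poly, q_poly)]
    return a[: len(lsp) + 1]                    # the highest coefficient cancels to zero

def residual_power(lsp):
    """Normalized linear prediction residual power via step-down recursion."""
    a = lsp_to_lpc(lsp)
    power = 1.0
    for m in range(len(a) - 1, 0, -1):
        k = a[m]                                # reflection coefficient of order m
        denom = 1.0 - k * k
        power *= denom
        a = [1.0] + [(a[i] - k * a[m - i]) / denom for i in range(1, m)]
    return power

def lsp_intervals(lsp):
    """Gaps between neighboring LSP orders."""
    return [b - a for a, b in zip(lsp, lsp[1:])]
```

Evenly spaced LSPs correspond to a flat spectrum, so `residual_power` returns a value close to 1 for them, while closely clustered LSPs give a smaller value and a small entry in `lsp_intervals`.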

12. A multimode postprocessing apparatus comprising:
- judgment means for judging whether or not a region is a speech region using a decoded LSP parameter;
  
  FFT processing means for performing fast Fourier transform processing on a signal;
  
  spectral phase randomizing means for randomizing a spectral phase obtained by said fast Fourier transform processing corresponding to a result judged by said judgment means;
  
  spectral amplitude smoothing means for performing smoothing on a spectral amplitude obtained by said fast Fourier transform processing corresponding to said result; and
  
  IFFT processing means for performing inverse fast Fourier transform on the spectral phase randomized by said spectral phase randomizing means and the spectral amplitude smoothed by said spectral amplitude smoothing means.
  - 13. The multimode postprocessing apparatus according to claim 12, wherein said device determines a frequency of the spectral phase to be randomized using an average spectral amplitude of a previous unvoiced region in a speech region, and determines a frequency of the spectral phase to be randomized and the spectral amplitude to be smoothed using an average spectral amplitude with all frequencies in a perceptual weighted domain in an unvoiced region.
  - 14. The multimode postprocessing apparatus according to claim 12, wherein said device multiplexes in a speech region a noise generated using average spectral amplitude in a previous non-speech region.
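
The processing chain of claim 12 (transform, randomize phase, smooth amplitude, inverse transform) can be sketched with a plain discrete Fourier transform. This is a simplified illustration, not the claimed apparatus: it applies the modification to every frequency bin of a non-speech frame and leaves speech frames untouched, whereas claims 13 and 14 describe frequency-selective behavior. The function names, the inter-frame smoothing rule, and the smoothing factor are assumptions. A direct O(N^2) DFT is used instead of a fast algorithm to keep the example free of dependencies.

```python
import cmath
import math
import random

def dft(x):
    """Direct discrete Fourier transform (stands in for the FFT step)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Direct inverse discrete Fourier transform (stands in for the IFFT step)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def postprocess(frame, is_speech, prev_amplitude=None, smoothing=0.5, rng=random):
    """Return (output_frame, amplitude); amplitude can be fed back as prev_amplitude."""
    spectrum = dft(frame)
    n = len(spectrum)
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    if not is_speech:
        if prev_amplitude is not None:
            # Assumed smoothing rule: blend with the previous frame's amplitude.
            amplitude = [smoothing * p + (1.0 - smoothing) * a
                         for p, a in zip(prev_amplitude, amplitude)]
        # Randomize phase in mirrored pairs so the output stays real-valued;
        # the DC bin (and the Nyquist bin for even n) keep their phase.
        for k in range(1, (n + 1) // 2):
            phase[k] = rng.uniform(-math.pi, math.pi)
            phase[n - k] = -phase[k]
    rebuilt = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
    return [c.real for c in idft(rebuilt)], amplitude
```

Passing a seeded `random.Random` instance as `rng` makes the randomization repeatable, which is convenient when comparing processed frames.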

15. A speech signal transmission apparatus having a speech input apparatus that converts a speech signal into an electric signal, an A/D converter that converts a signal output from the speech input apparatus into a digital signal, a multimode speech coding apparatus that codes the digital signal output from the A/D converter, an RF modulator that performs modulation processing on coded information output from the multimode speech coding apparatus, and a transmission antenna that converts a signal output from the RF modulator into radio signal to transmit, said multimode speech coding apparatus comprising:
- first coding means for coding an LSP parameter indicative of vocal tract information contained in a speech signal;
  
  second coding means for coding at least one type of parameter indicative of vocal tract information with a plurality of modes;
  
  dynamic characteristic extracting means for extracting a dynamic characteristic of a quantized LSP parameter coded in said first coding means;
  
  mode switching means for switching a coding mode of said second coding means based on said dynamic characteristic; and
  
  synthesis means for synthesizing an input speech signal using a plurality of types of parameter information coded in said first coding means and said second coding means.
  - 16. The speech signal transmission apparatus according to claim 15, wherein said dynamic characteristic extracting means comprises:

17. A speech signal reception apparatus having a reception antenna that receives a radio signal, an RF demodulator that performs demodulation processing on a signal received at the reception antenna, a multimode decoding apparatus that decodes information obtained by the RF demodulator, a D/A converter that converts a digital speech signal decoded in the multimode decoding apparatus into an analog signal, and a speech output apparatus that converts an electric signal output from the D/A converter into a speech signal, said multimode decoding apparatus comprising:
- first decoding means for decoding a quantized LSP parameter indicative of vocal tract information contained in a speech signal;
  
  second decoding means for decoding at least one type of parameter indicative of vocal tract information contained in the speech signal with a plurality of decoding modes;
  
  mode switching means for switching a decoding mode of said second decoding means based on a dynamic characteristic of the LSP parameter decoded in said first decoding means;
  
  synthesis means for decoding the speech signal using a plurality of types of parameter information decoded in said first decoding means and said second decoding means; and
  
  postprocessing means for performing postprocessing on the decoded speech signal based on the decoding mode.

18. A computer readable recording medium with a computer executable program recorded therein, the program comprising the procedures of:
- extracting a dynamic characteristic of a quantized LSP parameter using a previous quantized LSP parameter and a current quantized LSP parameter;
  
  judging a voiced characteristic using the dynamic characteristic of the current quantized LSP parameter; and
  
  switching a mode of a procedure for coding an excitation vector, based on the judged result.

19. A computer readable recording medium with a computer executable program recorded therein, the program comprising the procedures of:
- extracting a dynamic characteristic of a quantized LSP parameter using a previous quantized LSP parameter and a current quantized LSP parameter;
  
  judging a voiced characteristic using the current quantized LSP parameter;
  
  switching a mode of a procedure for decoding an excitation vector, based on the judged result; and
  
  switching a procedure of performing postprocessing on a decoded signal, based on the judged result.

20. A multimode speech coding method for performing mode switching of a mode for coding an excitation vector, using a static characteristic and a dynamic characteristic of a quantized parameter indicative of a spectral characteristic of a speech.

21. A multimode speech decoding method for performing mode switching of a mode for decoding an excitation vector, using a static characteristic and a dynamic characteristic of a quantized parameter indicative of a spectral characteristic of a speech.
  - 22. The multimode speech decoding method according to claim 21, said method comprising the steps of:
    - performing postprocessing on a decoded signal; and
      
      switching the step of performing postprocessing, based on mode information.

23. A quantized-LSP-parameter dynamic characteristic extracting method comprising the steps of:
- calculating an evolution of a quantized LSP parameter between frames;
  
  calculating an average quantized LSP parameter in a frame in which the quantized LSP parameter is stationary; and
  
  calculating an evolution between said average quantized LSP parameter and a current quantized LSP parameter.
  - 24. The speech signal reception apparatus according to claim 23, wherein said postprocessing means comprises:

25. A quantized-LSP-parameter static characteristic extracting method comprising the steps of:
- calculating linear prediction residual power using a quantized LSP parameter; and
  
  calculating a region between neighboring orders of the quantized LSP parameter.

26. A multimode postprocessing method comprising:
- the judgment step of judging whether or not a region is a speech region using a decoded LSP parameter;
  
  the FFT processing step of performing fast Fourier transform processing on a signal;
  
  the spectral phase randomizing step of randomizing a spectral phase obtained by said fast Fourier transform processing corresponding to a result determined by said judgment step;
  
  the spectral amplitude smoothing step of performing smoothing on a spectral amplitude obtained by said fast Fourier transform processing corresponding to said result; and
  
  the IFFT processing step of performing inverse fast Fourier transform on the spectral phase randomized by said spectral phase randomizing step and the spectral amplitude smoothed by said spectral amplitude smoothing step.

27. A multimode speech coding apparatus comprising:
- first coding means for coding vocal tract information contained in a speech signal; and
  
  second coding means for coding excitation information contained in the speech signal, said second coding means having a plurality of coding modes;
  
  wherein each of said plurality of coding modes is determined using a variation in the information coded in said first coding means, each of said plurality of coding modes comprises a non-speech interval mode and a speech interval mode, each said speech interval mode comprises a voiced interval mode and an unvoiced interval mode and coding is performed separately to a voiced region and an unvoiced region separated from the speech interval.
  - 28. The multimode speech coding apparatus according to claim 27, wherein said first coding means codes a spectral characteristic parameter of the speech signal.

Current Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Ehara, Hiroyuki
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Nolan, Daniel A.
Application Number: US09/529,660
Time in Patent Office: 616 Days
Field of Search: 704/201, 704/203, 704/204, 704/258, 704/269, 704/500, 704/205, 704/229, 704/221, 704/230
US Class Current: 704/258
CPC Class Codes: G10L 19/18 Vocoders using multiple modes
