Speech coding/decoding method and apparatus

US 6,611,797 B1
Filed: 01/21/2000
Issued: 08/26/2003
Est. Priority Date: 01/22/1999
Status: Expired due to Fees

Abstract

An input speech signal to an input terminal is supplied to a speech synthesizer section through a speech analyzer section and frequency parameter quantizer section to form a synthesis filter, and the input speech signal is expressed by quantized LPC coefficients representing the characteristics of the synthesis filter and an excitation signal for exciting the synthesis filter. In this case, in a pulse excitation section, a pulse position selector selects pulse position candidates from the integer pulse positions and non-integer pulse positions stored in a pulse position codebook, and an integer position pulse generator and non-integer position pulse generator respectively generate integer position pulses set at sampling points of the excitation signal and non-integer position pulses set at positions located between sampling points. These pulses are synthesized into a pulse train serving as a source of an excitation signal.
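
As an illustration of the pulse excitation described in the abstract, the following minimal Python sketch renders a pulse train that mixes integer position pulses and non-integer position pulses. The abstract does not state how a pulse located between sampling points is rendered on the sample grid. The Hann-windowed sinc interpolation, the function names `fractional_pulse` and `pulse_train`, and the `half_width` parameter are all assumptions made for this sketch.

```python
import math


def fractional_pulse(length, position, amplitude=1.0, half_width=4):
    """Render one pulse on a grid of `length` samples.

    An integer position yields a single nonzero sample. A non-integer
    position is approximated by a Hann-windowed sinc centered on the
    fractional position (an assumption of this sketch, not the patent's
    stated method).
    """
    out = [0.0] * length
    if float(position).is_integer():
        out[int(position)] = amplitude
        return out
    center = math.floor(position)
    first = max(0, center - half_width + 1)
    last = min(length, center + half_width + 1)
    for n in range(first, last):
        x = n - position  # never zero here, because position is non-integer
        sinc = math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
        out[n] = amplitude * sinc * window
    return out


def pulse_train(length, pulses):
    """Sum several (position, amplitude) pulses into one pulse train."""
    train = [0.0] * length
    for position, amplitude in pulses:
        for n, value in enumerate(fractional_pulse(length, position, amplitude)):
            train[n] += value
    return train


# One integer position pulse at sample 3 and one pulse halfway
# between samples 10 and 11.
train = pulse_train(20, [(3, 1.0), (10.5, -1.0)])
```

In this sketch the half-sample pulse spreads its energy symmetrically over the neighboring samples, while the integer position pulse stays confined to one sample.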

30 Citations

20 Claims

1. A speech coding method, comprising:
- analyzing an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result, the excitation signal being formed of a pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal, and the second pulses being set at second positions located between the sampling points of the excitation signal;
  
  generating a synthesized speech signal based on the coded result and the excitation signal;
  
  generating a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
  
  selecting a pulse position candidate from a pulse position codebook in accordance with the second index; and
  
  outputting the first and second indexes.
- View Dependent Claims (2, 3)
- - 2. The method according to claim 1, further comprising:
  - 3. The method according to claim 1, wherein the analyzing step comprises generating the excitation signal in units of frames.
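
The search recited in claim 1 (generate a synthesized signal for each pulse position candidate, then keep the index that minimizes the error against the input) can be sketched as a brute-force analysis-by-synthesis loop. This is a simplified sketch and not the patented procedure: the names `render`, `synthesize` and `search_codebook` are hypothetical, pulses have unit amplitude, gains and perceptual weighting are omitted, and a pulse between sampling points is split linearly between its two neighbors purely for brevity.

```python
def render(length, positions):
    """Place unit pulses at the given positions.

    A non-integer position is split linearly between the two adjacent
    samples (a simplification assumed for this sketch).
    """
    train = [0.0] * length
    for p in positions:
        lo = int(p)
        frac = p - lo
        train[lo] += 1.0 - frac
        if frac > 0.0 and lo + 1 < length:
            train[lo + 1] += frac
    return train


def synthesize(excitation, lpc):
    """All-pole synthesis filter: s[n] = e[n] - sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def search_codebook(target, lpc, codebook):
    """Return (index, error) of the codebook entry with the smallest
    squared error between the target and the synthesized signal."""
    best_index, best_error = None, float("inf")
    for index, positions in enumerate(codebook):
        synth = synthesize(render(len(target), positions), lpc)
        error = sum((t - s) ** 2 for t, s in zip(target, synth))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


# Hypothetical codebook mixing integer and non-integer pulse positions.
codebook = [[2.0], [2.5], [5.0]]
lpc = [-0.9]
target = synthesize(render(8, [2.5]), lpc)
second_index, error = search_codebook(target, lpc, codebook)
```

The returned index plays the role of the second index in this sketch; a decoder holding the same codebook could look up the same pulse positions from it.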

4. A speech coding method, comprising:
- analyzing an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result, the excitation signal being formed of a pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal, and the second pulses being set at second positions located between the sampling points of the excitation signal;
  
  generating a synthesized speech signal based on the excitation signal and the coded result;
  
  selecting, from an adaptive codebook, a pitch vector with which a power of an error between the synthesized speech signal and the input speech signal is minimized;
  
  adding the pulse train to the pitch vector to generate the excitation signal; and
  
  outputting the first index and a second index indicating the selected pitch vector.
- View Dependent Claims (5)
- - 5. The method according to claim 4, further comprising:
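
The adaptive codebook step of claim 4 (pick the pitch vector that minimizes the error power, then add the pulse train to it) can be sketched as follows. The claim does not specify how pitch vectors are built or scaled, so this sketch assumes the common approach of repeating past excitation at a candidate lag and applying a least-squares gain. The names `pitch_vector`, `select_pitch_vector` and `build_excitation` are hypothetical.

```python
def pitch_vector(past, lag, length):
    """Repeat the last `lag` samples of past excitation to fill `length`."""
    start = len(past) - lag
    return [past[start + (n % lag)] for n in range(length)]


def synthesize(excitation, lpc):
    """All-pole synthesis filter: s[n] = e[n] - sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def select_pitch_vector(target, past, lpc, lags):
    """Return (lag, gain, error) for the lag whose gain-scaled, synthesized
    pitch vector has the smallest error power against the target."""
    best = None
    for lag in lags:
        y = synthesize(pitch_vector(past, lag, len(target)), lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero candidate cannot reduce the error
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (lag, gain, error)
    return best


def build_excitation(past, lag, gain, pulse_train):
    """Add the pulse train to the gain-scaled pitch vector."""
    vec = pitch_vector(past, lag, len(pulse_train))
    return [gain * v + p for v, p in zip(vec, pulse_train)]
```

In this sketch the selected lag would be transmitted as the index of the pitch vector, and `build_excitation` corresponds to the adding step of the claim.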

6. A speech coding method which comprises:
- analyzing an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result, the excitation signal being formed of a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between sampling points of the excitation signal;
  
  generating an excitation signal for exciting a synthesis filter by using a pitch vector and a stochastic vector;
  
  generating the stochastic vector by using a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the stochastic vector and the second pulses being set between sampling points of the stochastic vector;
  
  generating a synthesized speech signal based on the coded result and the excitation signal; and
  
  generating a second index with which an error between the input speech signal and the synthesized speech signal is minimized.

7. A speech coding method which comprises:
- analyzing an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result;
  
  generating an excitation signal for exciting a synthesis filter by using a pitch vector and a stochastic vector;
  
  selecting a predetermined number of pulse positions from pulse position candidates to be adapted on the basis of a shape of the pitch vector, the pulse position candidates including first pulse position candidates whose pulse positions are located on sampling points of the stochastic vector and second pulse position candidates whose positions are located between sampling points of the stochastic vector;
  
  arranging pulses at the predetermined number of pulse positions to generate a pulse train to be used for generating the stochastic vector;
  
  generating a synthesized speech signal based on the coded result and the excitation signal;
  
  generating a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
  
  selecting the pulse position candidates from a pulse position codebook in accordance with the second index; and
  
  outputting the first and second indexes.
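
Claim 7 says the pulse position candidates are adapted on the basis of the shape of the pitch vector, but the claim text does not give the adaptation rule. The sketch below shows one possible rule, chosen only for illustration: integer position candidates are kept everywhere, and half-sample candidates are added next to the samples where the pitch vector has the largest magnitude. The function name `adapt_candidates` and the `dense_count` parameter are hypothetical.

```python
def adapt_candidates(pitch_vec, dense_count):
    """Build pulse position candidates from the shape of a pitch vector.

    Every sampling point is a candidate. For the `dense_count` samples with
    the largest magnitude, an extra candidate halfway to the next sample is
    added (an illustrative rule assumed for this sketch).
    """
    n = len(pitch_vec)
    ranked = sorted(range(n), key=lambda i: abs(pitch_vec[i]), reverse=True)
    peaks = set(ranked[:dense_count])
    candidates = []
    for i in range(n):
        candidates.append(float(i))      # candidate on a sampling point
        if i in peaks and i + 1 < n:
            candidates.append(i + 0.5)   # candidate between sampling points
    return candidates


candidates = adapt_candidates([0.1, 0.9, 0.2, 0.0, -0.8, 0.1], dense_count=2)
```

Because the decoder can reconstruct the same pitch vector, a rule of this kind could be evaluated identically on both sides without transmitting the candidate set itself.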

8. A speech decoding method, comprising:
- extracting, from a coded stream, a first index indicating a frequency characteristic of a speech, and a second index indicating a pulse train of an excitation signal;
  
  reconstructing a synthesis filter by decoding the first index;
  
  reconstructing the excitation signal based on the second index, the pulse train including pulses selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal, and the second pulses being set at positions located between the sampling points of the excitation signal; and
  
  generating a decoded speech signal by exciting the synthesis filter using the reconstructed excitation signal.
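
The decoding steps of claim 8 can be sketched as two table lookups followed by filtering. Everything concrete here is an assumption for illustration: the contents of `LPC_CODEBOOK` and `PULSE_CODEBOOK`, the function names, the unit pulse amplitudes, and the linear split used for pulses between sampling points.

```python
# Hypothetical codebooks shared by encoder and decoder.
LPC_CODEBOOK = [[-0.5], [-0.9]]               # first index -> filter coefficients
PULSE_CODEBOOK = [[1.0], [1.5], [3.0, 5.5]]   # second index -> pulse positions


def render(length, positions):
    """Place unit pulses; a non-integer position is split linearly between
    its two neighboring samples (a simplification of this sketch)."""
    train = [0.0] * length
    for p in positions:
        lo = int(p)
        frac = p - lo
        train[lo] += 1.0 - frac
        if frac > 0.0 and lo + 1 < length:
            train[lo + 1] += frac
    return train


def synthesize(excitation, lpc):
    """All-pole synthesis filter: s[n] = e[n] - sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def decode_frame(first_index, second_index, length):
    """Reconstruct the filter and the excitation from the two indexes,
    then excite the filter to obtain the decoded signal."""
    lpc = LPC_CODEBOOK[first_index]
    excitation = render(length, PULSE_CODEBOOK[second_index])
    return synthesize(excitation, lpc)


decoded = decode_frame(first_index=0, second_index=1, length=4)
```

The sketch decodes one frame in isolation; carrying filter memory across frames is left out to keep the example short.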

9. A speech decoding method which comprises:
- extracting, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating a pulse train of an excitation signal including a pitch vector and a stochastic vector;
  
  reconstructing a synthesis filter by decoding the first index;
  
  reconstructing the excitation signal based on the second index, the stochastic vector including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between sampling points of the excitation signal; and
  
  generating a decoded speech signal by exciting the synthesis filter on the basis of the reconstructed excitation signal.

10. A speech decoding method which comprises:
- extracting, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating an excitation signal;
  
  reconstructing a synthesis filter by decoding the first index;
  
  reconstructing the excitation signal based on the second index, the excitation signal being constituted by a stochastic vector and a pitch vector, the stochastic vector including a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates to be adapted on the basis of a shape of the pitch vector, and the pulse position candidates including first pulse position candidates and second pulse position candidates, the first pulse position candidates being set on sampling points of the stochastic vector and the second pulse position candidates being set at positions located between sampling points of the stochastic vector; and
  
  decoding a speech signal by exciting a synthesis filter by means of the excitation signal.

11. A speech coding apparatus, comprising:
- a speech analyzer section configured to analyze an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result;
  
  a pulse excitation section configured to generate a pulse train, as the excitation signal, the pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal, and the second pulses being set at second positions located between the sampling points of the excitation signal;
  
  a speech synthesizer section configured to generate a synthesized speech signal based on the coded result and the excitation signal;
  
  a first index output section configured to generate a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
  
  a pulse position codebook configured to store pulse position candidates;
  
  a selector section configured to select a pulse position candidate from said pulse position codebook in accordance with the second index; and
  
  an output section configured to output the first and second indexes.
- View Dependent Claims (12, 13)
- - 12. An apparatus according to claim 11, wherein said pulse position codebook stores the first and second positions together.
  - 13. An apparatus according to claim 11, wherein said pulse excitation section generates the excitation signal in units of frames.

14. A speech coding apparatus, comprising:
- a speech analyzer section configured to analyze an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result;
  
  a pulse excitation section configured to generate a pulse train, as the excitation signal, the pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between the sampling points of the excitation signal;
  
  a speech synthesizer section configured to generate a synthesized speech signal based on the excitation signal and the coded result;
  
  an adaptive codebook configured to store a plurality of pitch vectors;
  
  a selector section configured to select a pitch vector, from an adaptive codebook, with which a power of an error between the synthesized speech signal and the input speech signal is minimized;
  
  an excitation signal generator section configured to add the pulse train to the pitch vector for generating the excitation signal; and
  
  an index output section configured to output the first index and a second index indicating the selected pitch vector.
- View Dependent Claims (15)
- - 15. The apparatus according to claim 14, further comprising:

16. A speech coding apparatus comprising:
- a speech analyzer section configured to analyze an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result;
  
  an excitation signal generator section configured to generate the excitation signal including a pitch vector and a stochastic vector, the stochastic vector including a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between sampling points of the stochastic vector;
  
  a speech synthesizer section configured to generate a synthesized speech signal based on the coded result and the excitation signal; and
  
  an index generator section configured to generate a second index with which an error between the input speech signal and the synthesized speech signal is minimized.

17. A speech coding apparatus comprising:
- a speech analyzer section configured to analyze an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result;
  
  an excitation signal generator section configured to generate an excitation signal constituted by a pitch vector and a stochastic vector, the stochastic vector being formed by a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates to be adapted on the basis of a shape of the pitch vector, and the pulse position candidates including first pulse position candidates and second pulse position candidates, the first pulse position candidates being set on sampling points of the stochastic vector and the second pulse position candidates being set at positions located between the sampling points of the stochastic vector;
  
  a speech synthesizer section configured to generate a synthesized speech signal based on the coded result and the excitation signal;
  
  an index generator section configured to generate a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
  
  a pulse position codebook configured to store a plurality of pulse position candidates;
  
  a selector section configured to select the pulse position candidate from said pulse position codebook in accordance with the second index.

18. A speech decoding apparatus, comprising:
- a demultiplexer section configured to extract, from a coded stream, a first index indicating a frequency characteristic of speech, and a second index indicating a pulse train of an excitation signal;
  
  a reconstruction section configured to reconstruct a synthesis filter by decoding the first index;
  
  an excitation signal reconstructing section configured to reconstruct the excitation signal, including a pulse train that includes pulses selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between the sampling points of the excitation signal based on the second index; and
  
  a decoding section configured to generate a decoded speech signal by exciting a synthesis filter using the reconstructed excitation signal.

19. A speech decoding apparatus comprising:
- a demultiplexer section configured to extract, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating an excitation signal including a pitch vector and a stochastic vector;
  
  a reconstruction section configured to reconstruct a synthesis filter by decoding the first index;
  
  an excitation signal reconstructing section configured to reconstruct the excitation signal based on the second index, the excitation signal including a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between sampling points of the excitation signal; and
  
  a decoding section configured to generate a decoded speech signal by exciting the synthesis filter by means of the reconstructed excitation signal.

20. A speech decoding apparatus comprising:
- a demultiplexer section configured to extract, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating an excitation signal;
  
  a reconstruction section configured to reconstruct a synthesis filter by decoding the first index;
  
  an excitation signal reconstructing section configured to reconstruct the excitation signal based on the second index, the excitation signal including a pitch vector and a stochastic vector formed of a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates subjected to adapting on the basis of a shape of the pitch vector, and the pulse position candidates including first pulse position candidates set on sampling points of the stochastic vector and second pulse position candidates set at positions located between the sampling points of the stochastic vector; and
  
  a decoding section configured to decode a speech signal by exciting a synthesis filter using the excitation signal.

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Amada, Tadashi; Tsuchiya, Katsumi
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Lerner, Martin
Application Number: US09/488,748
Time in Patent Office: 1,313 Days
Field of Search: 704/219, 704/220, 704/221, 704/222, 704/223, 704/211, 704/212
US Class Current: 704/211
CPC Class Codes: G10L 19/12 the excitation function bei...
