Speech coding/decoding method and apparatus
Abstract
An input speech signal applied to an input terminal is supplied to a speech synthesizer section through a speech analyzer section and a frequency parameter quantizer section to form a synthesis filter, and the input speech signal is expressed by quantized LPC coefficients representing the characteristics of the synthesis filter and an excitation signal for exciting the synthesis filter. In a pulse excitation section, a pulse position selector selects pulse position candidates from the integer pulse positions and non-integer pulse positions stored in a pulse position codebook, and an integer position pulse generator and a non-integer position pulse generator respectively generate integer position pulses set at sampling points of the excitation signal and non-integer position pulses set at positions located between sampling points. These pulses are synthesized into a pulse train serving as a source of an excitation signal.
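The pulse train described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a non-integer position pulse is rendered onto the sampling grid, so the Hann-windowed sinc kernel below is an assumption, and the names `fractional_pulse`, `pulse_train`, and `half_width` are hypothetical.

```python
import math

def fractional_pulse(position, amplitude, frame_len, half_width=4):
    """Render one pulse at a possibly non-integer position onto the sampling grid.

    Assumption: a Hann-windowed sinc kernel stands in for whatever
    interpolation the non-integer position pulse generator actually uses.
    """
    out = [0.0] * frame_len
    if float(position).is_integer():
        out[int(position)] = amplitude  # integer position: one nonzero sample
        return out
    center = math.floor(position)
    for n in range(center - half_width + 1, center + half_width + 1):
        if 0 <= n < frame_len:
            x = n - position  # never zero here, since position is non-integer
            sinc = math.sin(math.pi * x) / (math.pi * x)
            window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
            out[n] += amplitude * sinc * window
    return out

def pulse_train(pulses, frame_len):
    """Sum integer-position and non-integer-position pulses into one train."""
    train = [0.0] * frame_len
    for position, amplitude in pulses:
        rendered = fractional_pulse(position, amplitude, frame_len)
        for n, value in enumerate(rendered):
            train[n] += value
    return train

# One pulse on a sampling point (3) and one between sampling points (10.5).
train = pulse_train([(3, 1.0), (10.5, -1.0)], frame_len=20)
```

In this sketch the pulse at position 3 occupies a single sample, while the pulse at 10.5 is spread symmetrically over the neighbouring samples.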
20 Claims
-
1. A speech coding method, comprising:
-
analyzing an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result, the excitation signal being formed of a pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal, and the second pulses being set at second positions located between the sampling points of the excitation signal;
generating a synthesized speech signal based on the coded result and the excitation signal;
generating a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
selecting a pulse position candidate from a pulse position codebook in accordance with the second index; and
outputting the first and second indexes.
-
2. The method according to claim 1, further comprising storing the first positions and the second positions together in said pulse position codebook.
-
-
3. The method according to claim 1, wherein the analyzing step comprises generating the excitation signal in units of frames.
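The search recited in claim 1 (synthesize, measure the error, pick the index that minimizes it) might be sketched as follows. This is only an illustration under stated assumptions: pulses between sampling points are split linearly across their two neighbours, all pulses have unit amplitude, the synthesis filter is a plain all-pole filter, and every function name is hypothetical.

```python
import math

def render_pulses(positions, frame_len, amplitude=1.0):
    """Place pulses on the sampling grid.

    Assumption: a pulse between two sampling points is split linearly across
    its neighbours; a real coder would likely use a better interpolator.
    """
    train = [0.0] * frame_len
    for p in positions:
        lo = math.floor(p)
        frac = p - lo
        train[lo] += amplitude * (1.0 - frac)
        if frac > 0.0 and lo + 1 < frame_len:
            train[lo + 1] += amplitude * frac
    return train

def synthesize(excitation, lpc):
    """All-pole synthesis filter: s[n] = e[n] - sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out

def search_pulse_codebook(target, codebook, lpc):
    """Return (index, squared error) of the candidate closest to the target."""
    best = (None, float("inf"))
    for index, positions in enumerate(codebook):
        synth = synthesize(render_pulses(positions, len(target)), lpc)
        err = sum((t - s) ** 2 for t, s in zip(target, synth))
        if err < best[1]:
            best = (index, err)
    return best

# Hypothetical codebook mixing integer and non-integer positions.
lpc = [-0.9]
codebook = [(2.0, 9.0), (2.5, 9.0), (4.0, 11.5)]
target = synthesize(render_pulses(codebook[1], 16), lpc)
second_index, error = search_pulse_codebook(target, codebook, lpc)
```

Here the target is built from codebook entry 1, so the search returns that entry as the second index with zero error.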
-
4. A speech coding method, comprising:
-
analyzing an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result, the excitation signal being formed of a pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal, and the second pulses being set at second positions located between the sampling points of the excitation signal;
generating a synthesized speech signal based on the excitation signal and the coded result;
selecting, from an adaptive codebook, a pitch vector with which a power of an error between the synthesized speech signal and the input speech signal is minimized;
adding the pulse train to the pitch vector to generate the excitation signal; and
outputting the first index and a second index indicating the selected pitch vector.
-
5. The method according to claim 4, further comprising making the pulse train periodic in units of pitches.
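Claims 4 and 5 can be illustrated with a short sketch: pick a pitch vector from an adaptive codebook by minimizing the error power, make the pulse train pitch-periodic, and add the two. All names are hypothetical. The one-tap `synth` filter, the unit gains, and the use of the selected lag as the pitch period are assumptions made to keep the example small.

```python
def pitch_vector(past_excitation, lag, frame_len):
    """Adaptive-codebook entry: the past excitation repeated with period `lag`."""
    vec = []
    for n in range(frame_len):
        # The first `lag` samples come from the stored history; later samples
        # reuse the vector being built.
        vec.append(past_excitation[n - lag] if n < lag else vec[n - lag])
    return vec

def select_pitch_lag(target, past_excitation, lags, synth):
    """Pick the lag whose synthesized pitch vector minimizes the error power."""
    def error_power(lag):
        y = synth(pitch_vector(past_excitation, lag, len(target)))
        return sum((t - v) ** 2 for t, v in zip(target, y))
    return min(lags, key=error_power)

def make_pitch_periodic(pulse_train, lag, gain=1.0):
    """Pitch filter y[n] = x[n] + gain * y[n - lag]."""
    out = list(pulse_train)
    for n in range(lag, len(out)):
        out[n] += gain * out[n - lag]
    return out

def synth(x, a=0.5):
    """Stand-in one-tap all-pole synthesis filter (an assumption)."""
    out, prev = [], 0.0
    for v in x:
        prev = v + a * prev
        out.append(prev)
    return out

# Past excitation with a period of 5 samples.
past = [1.0, 0.0, 0.0, 0.0, 0.0] * 4
target = synth(pitch_vector(past, 5, 10))
lag = select_pitch_lag(target, past, range(3, 9), synth)

pitch = pitch_vector(past, lag, 10)
pulses = make_pitch_periodic([1.0] + [0.0] * 9, lag)
excitation = [p + q for p, q in zip(pitch, pulses)]
```

The index of the selected lag plays the role of the second index in claim 4; gain quantization is left out of the sketch.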
-
-
6. A speech coding method which comprises:
-
analyzing an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result, the excitation signal being formed of a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between sampling points of the excitation signal;
generating an excitation signal for exciting a synthesis filter by using a pitch vector and a stochastic vector;
generating the stochastic vector by using a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the stochastic vector and the second pulses being set between sampling points of the stochastic vector;
generating a synthesized speech signal based on the coded result and the excitation signal; and
generating a second index with which an error between the input speech signal and the synthesized speech signal is minimized.
-
-
7. A speech coding method which comprises:
-
analyzing an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result;
generating an excitation signal for exciting a synthesis filter by using a pitch vector and a stochastic vector;
selecting a predetermined number of pulse positions from pulse position candidates to be adapted on the basis of a shape of the pitch vector, the pulse position candidates including first pulse position candidates whose pulse positions are located on sampling points of the stochastic vector and second pulse position candidates whose positions are located between sampling points of the stochastic vector;
arranging pulses at the predetermined number of pulse positions to generate a pulse train to be used for generating the stochastic vector;
generating a synthesized speech signal based on the coded result and the excitation signal;
generating a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
selecting the pulse position candidates from a pulse position codebook in accordance with the second index; and
outputting the first and second indexes.
-
-
8. A speech decoding method, comprising:
-
extracting, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating a pulse train of an excitation signal;
reconstructing a synthesis filter by decoding the first index;
reconstructing the excitation signal based on the second index, the pulse train including pulses selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal, and the second pulses being set at positions located between the sampling points of the excitation signal; and
generating a decoded speech signal by exciting the synthesis filter using the reconstructed excitation signal.
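The decoding steps of claim 8 might look like the following sketch, which assumes the two indexes have already been extracted from the coded stream. The codebook contents, the linear split of non-integer position pulses across neighbouring samples, and the name `decode_frame` are all hypothetical.

```python
import math

def decode_frame(first_index, second_index, lpc_codebook, pulse_codebook, frame_len):
    """Rebuild the synthesis filter and excitation from two indexes, then synthesize."""
    # Reconstruct the synthesis filter by decoding the first index.
    lpc = lpc_codebook[first_index]

    # Reconstruct the excitation from the pulse positions named by the second index.
    # Assumption: a pulse between sampling points is split linearly across neighbours.
    excitation = [0.0] * frame_len
    for p in pulse_codebook[second_index]:
        lo = math.floor(p)
        frac = p - lo
        excitation[lo] += 1.0 - frac
        if frac > 0.0 and lo + 1 < frame_len:
            excitation[lo + 1] += frac

    # Excite the all-pole filter: s[n] = e[n] - sum_k lpc[k-1] * s[n-k].
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * speech[n - k]
        speech.append(acc)
    return speech

# Hypothetical codebooks shared by coder and decoder.
lpc_codebook = [[-0.5], [-0.9]]
pulse_codebook = [(0.0,), (1.5,)]
decoded = decode_frame(0, 1, lpc_codebook, pulse_codebook, frame_len=4)
# decoded == [0.0, 0.5, 0.75, 0.375]
```

The pulse at position 1.5 contributes half its amplitude to samples 1 and 2, and the one-tap filter then smears that energy forward in time.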
-
-
9. A speech decoding method which comprises:
-
extracting, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating a pulse train of an excitation signal including a pitch vector and a stochastic vector;
reconstructing a synthesis filter by decoding the first index;
reconstructing the excitation signal based on the second index, the stochastic vector including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between sampling points of the excitation signal; and
generating a decoded speech signal by exciting the synthesis filter on the basis of the reconstructed excitation signal.
-
-
10. A speech decoding method which comprises:
-
extracting, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating an excitation signal;
reconstructing a synthesis filter by decoding the first index;
reconstructing the excitation signal based on the second index, the excitation signal being constituted by a stochastic vector and a pitch vector, the stochastic vector including a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates to be adapted on the basis of a shape of the pitch vector, and the pulse position candidates including first pulse position candidates and second pulse position candidates, the first pulse position candidates being set on sampling points of the stochastic vector and the second pulse position candidates being set at positions located between sampling points of the stochastic vector; and
decoding a speech signal by exciting a synthesis filter by means of the excitation signal.
-
-
11. A speech coding apparatus, comprising:
-
a speech analyzer section configured to analyze an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result;
a pulse excitation section configured to generate a pulse train, as the excitation signal, the pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal, and the second pulses being set at second positions located between the sampling points of the excitation signal;
a speech synthesizer section configured to generate a synthesized speech signal based on the coded result and the excitation signal;
a first index output section configured to generate a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
a pulse position codebook configured to store pulse position candidates;
a selector section configured to select a pulse position candidate from said pulse position codebook in accordance with the second index; and
an output section configured to output the first and second indexes. - View Dependent Claims (12, 13)
-
-
14. A speech coding apparatus, comprising:
-
a speech analyzer section configured to analyze an input speech signal (1) to divide the input speech signal into a parameter representing a frequency characteristic of speech and an excitation signal, the excitation signal being an input signal to a synthesis filter, the synthesis filter generated based on the parameter, and (2) to output a first index specifying the parameter as a coded result;
a pulse excitation section configured to generate a pulse train, as the excitation signal, the pulse train including pulses selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between the sampling points of the excitation signal;
a speech synthesizer section configured to generate a synthesized speech signal based on the excitation signal and the coded result;
an adaptive codebook configured to store a plurality of pitch vectors;
a selector section configured to select a pitch vector, from an adaptive codebook, with which a power of an error between the synthesized speech signal and the input speech signal is minimized;
an excitation signal generator section configured to add the pulse train to the pitch vector for generating the excitation signal; and
an index output section configured to output the first index and a second index indicating the selected pitch vector.
-
15. The apparatus according to claim 14, further comprising a pitch filter configured to make the pulse train periodic in units of pitches.
-
-
16. A speech coding apparatus comprising:
-
a speech analyzer section configured to analyze an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result;
an excitation signal generator section configured to generate the excitation signal including a pitch vector and a stochastic vector, the stochastic vector including a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between sampling points of the stochastic vector;
a speech synthesizer section configured to generate a synthesized speech signal based on the coded result and the excitation signal; and
an index generator section configured to generate a second index with which an error between the input speech signal and the synthesized speech signal is minimized.
-
-
17. A speech coding apparatus comprising:
-
a speech analyzer section configured to analyze an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter as a coded result;
an excitation signal generator section configured to generate an excitation signal constituted by a pitch vector and a stochastic vector, the stochastic vector being formed by a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates to be adapted on the basis of a shape of the pitch vector, and the pulse position candidates including first pulse position candidates and second pulse position candidates, the first pulse position candidates being set on sampling points of the stochastic vector and the second pulse position candidates being set at positions located between the sampling points of the stochastic vector;
a speech synthesizer section configured to generate a synthesized speech signal based on the coded result and the excitation signal;
an index generator section configured to generate a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized;
a pulse position codebook configured to store a plurality of pulse position candidates; and
a selector section configured to select the pulse position candidate from said pulse position codebook in accordance with the second index.
-
-
18. A speech decoding apparatus, comprising:
-
a demultiplexer section configured to extract, from a coded stream, a first index indicating a frequency characteristic of speech, and a second index indicating a pulse train of an excitation signal;
a reconstruction section configured to reconstruct a synthesis filter by decoding the first index;
an excitation signal reconstructing section configured to reconstruct the excitation signal, including a pulse train that includes pulses selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between the sampling points of the excitation signal based on the second index; and
a decoding section configured to generate a decoded speech signal by exciting a synthesis filter using the reconstructed excitation signal.
-
-
19. A speech decoding apparatus comprising:
-
a demultiplexer section configured to extract, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating an excitation signal including a pitch vector and a stochastic vector;
a reconstruction section configured to reconstruct a synthesis filter by decoding the first index;
an excitation signal reconstructing section configured to reconstruct the excitation signal based on the second index, the excitation signal including a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between sampling points of the excitation signal; and
a decoding section configured to generate a decoded speech signal by exciting the synthesis filter by means of the reconstructed excitation signal.
-
-
20. A speech decoding apparatus comprising:
-
a demultiplexer section configured to extract, from a coded stream, a first index indicating a frequency characteristic of a speech and a second index indicating an excitation signal;
a reconstruction section configured to reconstruct a synthesis filter by decoding the first index;
an excitation signal reconstructing section configured to reconstruct the excitation signal based on the second index, the excitation signal including a pitch vector and a stochastic vector formed of a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates subjected to adapting on the basis of a shape of the pitch vector, and the pulse position candidates including first pulse position candidates set on sampling points of the stochastic vector and second pulse position candidates set at positions located between the sampling points of the stochastic vector; and
a decoding section configured to decode a speech signal by exciting a synthesis filter using the excitation signal.
-
Specification