Phonemic unit dictionary based on shifted portions of source codebook vectors, for text-to-speech synthesis
Abstract
A speech synthesis apparatus synthesizes a speech signal by filtering a speech source signal through a synthesis filter. A speech source signal codebook stores a plurality of speech source signals as a code vector. A unit dictionary memory stores a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector in the speech source codebook and a shift number for the code vector to decode the speech source signal. A unit selection section selects a synthesis unit corresponding to phonemic symbols to be synthesized from the unit dictionary memory. A synthesis unit decoder selects the code vector corresponding to the index in the synthesis unit from the speech source signal codebook, and shifts the code vector according to the shift number in the synthesis unit.
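The decoding path described in the abstract can be illustrated with a minimal sketch. Everything named here (`SynthesisUnit`, `SOURCE_CODEBOOK`, `UNIT_DICTIONARY`, `shift_vector`, `decode_unit`) and the toy values are hypothetical, and the shift is assumed to be a plain delay with zero fill; the text itself only says the code vector is shifted according to the shift number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisUnit:
    index: int  # position of the code vector in the source codebook
    shift: int  # number of samples by which to shift that code vector


# Hypothetical toy codebook: each entry is one speech source (excitation) vector.
SOURCE_CODEBOOK = [
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 1.0, -1.0, 0.0],
]

# Hypothetical unit dictionary keyed by phonemic symbol.
UNIT_DICTIONARY = {
    "a": SynthesisUnit(index=0, shift=1),
    "k": SynthesisUnit(index=1, shift=2),
}


def shift_vector(vector, shift):
    """Delay `vector` by `shift` samples, zero-filling the vacated positions.

    Assumption: a non-cyclic delay; the source does not fix the shift type here.
    """
    n = len(vector)
    shift = max(0, min(shift, n))
    return [0.0] * shift + list(vector[: n - shift])


def decode_unit(symbol):
    """Select the unit for `symbol`, look up its code vector, and shift it."""
    unit = UNIT_DICTIONARY[symbol]
    code_vector = SOURCE_CODEBOOK[unit.index]
    return shift_vector(code_vector, unit.shift)
```

Because each dictionary entry holds only an index and a shift number, several synthesis units can share one stored code vector, which appears to be the point of storing shifted portions rather than full waveforms.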
14 Citations
23 Claims
1. Speech synthesis method for synthesizing a speech signal by filtering a speech source signal through a synthesis filter, comprising the steps of:
storing a plurality of speech source signals as a code vector in a speech source signal codebook;
storing a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector and a shift number for the code vector to decode the speech source signal in a unit dictionary memory;
selecting a synthesis unit corresponding to phonemic symbols to be synthesized from said unit dictionary memory;
selecting the code vector corresponding to the speech source signal index in the synthesis unit from said speech source signal codebook; and
shifting the code vector according to the shift number in the synthesis unit.
2. The speech synthesis method according to claim 1,
further comprising the step of: previously coding said speech signal as the speech source signal index of the code vector, the shift number and a gain value so that said speech signal is almost equal to a synthesized speech signal generated by multiplication of the gain value with the shifted code vector.
3. The speech synthesis method according to claim 1,
further comprising the step of:
storing a plurality of gain values as a code vector to decode the speech source signal in a gain codebook;
wherein the synthesis unit includes a gain index of the coded gain in said gain codebook in addition to the index of the code vector in said speech source signal codebook and the shift number.
4. The speech synthesis method according to claim 3,
further comprising the steps of:
selecting the gain value corresponding to the gain index in the synthesis unit from said gain codebook; and
multiplying the gain value with the shifted code vector.
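The gain steps of claims 3 and 4 might look like the following sketch. The names `GAIN_CODEBOOK` and `apply_gain` and the gain values are illustrative assumptions, not taken from the specification.

```python
# Hypothetical gain codebook: a small table of quantized gain values.
GAIN_CODEBOOK = [0.5, 1.0, 2.0]


def apply_gain(shifted_vector, gain_index, gain_codebook=GAIN_CODEBOOK):
    """Look up the gain for `gain_index` and scale the shifted code vector."""
    gain = gain_codebook[gain_index]
    return [gain * sample for sample in shifted_vector]
```

For example, `apply_gain([0.0, 1.0, 0.5], 2)` scales every sample by the gain stored at index 2 of the assumed table.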
5. The speech synthesis method according to claim 1,
further comprising the step of:
storing a plurality of coefficients as a code vector, each of which represents characteristics of the synthesis filter to input the speech source signal in a coefficient codebook;
wherein the synthesis unit includes a coefficient index of the code vector in said coefficient codebook in addition to the index of the code vector in said speech source signal codebook and the shift number.
6. The speech synthesis method according to claim 5,
further comprising the steps of:
selecting the coefficient corresponding to the coefficient index in the synthesis unit from said coefficient codebook; and
supplying the coefficient to the synthesis filter.
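Claims 5 and 6 can be sketched as a coefficient lookup followed by filtering. The claims do not state the filter structure, so this sketch assumes an all-pole (LPC-style) filter with the convention y[n] = x[n] - sum_k a_k * y[n-k]; `COEFFICIENT_CODEBOOK`, `synthesis_filter` and `synthesize` are hypothetical names.

```python
# Hypothetical coefficient codebook: each entry is one set of filter coefficients.
COEFFICIENT_CODEBOOK = [
    [-0.5],       # one pole: y[n] = x[n] + 0.5 * y[n-1]
    [-0.9, 0.2],  # two poles
]


def synthesis_filter(source, coeffs):
    """Assumed all-pole filter: y[n] = x[n] - sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, x in enumerate(source):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def synthesize(source, coefficient_index):
    """Select the coefficients by index and supply them to the filter."""
    coeffs = COEFFICIENT_CODEBOOK[coefficient_index]
    return synthesis_filter(source, coeffs)
```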
7. The speech synthesis method according to claim 1,
further comprising the step of: cyclically shifting the code vector according to the shift number.
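A cyclic shift, as named in claim 7, can be sketched as a rotation in which samples pushed off one end re-enter at the other. The function name `cyclic_shift` and the rotation direction (toward later sample positions) are assumptions made for illustration.

```python
def cyclic_shift(vector, shift):
    """Rotate `vector` by `shift` samples toward later positions, wrapping around."""
    n = len(vector)
    if n == 0:
        return []
    k = shift % n
    if k == 0:
        return list(vector)
    return list(vector[-k:]) + list(vector[:-k])
```

Unlike a zero-filled delay, the rotation keeps every sample of the code vector, so the vector length and energy are unchanged for any shift number.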
8. The speech synthesis method according to claim 1,
further comprising the steps of:
selecting the code vector corresponding to the speech source signal index; and
shifting a requantized code vector according to the shift number.
9. The speech synthesis method according to claim 1,
wherein the shift number is determined to minimize distortion between an original speech signal and a synthesis speech signal generated by the synthesis filter filtering a shifted speech source signal, a coefficient obtained by analyzing the original speech signal being supplied to the synthesis filter.
10. The speech synthesis method according to claim 1,
wherein the shift number is determined to minimize distortion between a target speech signal generated by a target speech signal synthesis filter filtering the speech source signal and a synthesis speech signal generated by the synthesis filter filtering a shifted speech source signal, a coefficient corresponding to the speech source signal being supplied to the target speech signal synthesis filter and the synthesis filter.
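The distortion-minimizing choice of shift number in claims 9 and 10 can be sketched as an exhaustive search. This is only one possible reading: the distortion measure is assumed to be squared error, the shift is assumed cyclic, the filter is assumed all-pole, and all function names are hypothetical. The `target` argument stands for either the original speech signal (claim 9) or the output of the target speech signal synthesis filter (claim 10).

```python
def cyclic_shift(vector, shift):
    """Rotate `vector` by `shift` samples toward later positions (assumed)."""
    n = len(vector)
    k = shift % n if n else 0
    return list(vector[-k:]) + list(vector[:-k]) if k else list(vector)


def all_pole_filter(source, coeffs):
    """Assumed synthesis filter: y[n] = x[n] - sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, x in enumerate(source):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def squared_error(a, b):
    """Assumed distortion measure: sum of squared sample differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_shift(target, code_vector, coeffs):
    """Try every shift and return the one whose filtered output is closest to `target`."""
    if not code_vector:
        raise ValueError("code_vector must not be empty")
    best_shift_number, best_error = 0, None
    for shift in range(len(code_vector)):
        candidate = all_pole_filter(cyclic_shift(code_vector, shift), coeffs)
        error = squared_error(target, candidate)
        if best_error is None or error < best_error:
            best_shift_number, best_error = shift, error
    return best_shift_number
```

Measuring the error after filtering, rather than on the source signals directly, follows the wording of the claims, where the comparison is between speech signals and not between source signals.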
11. The speech synthesis method according to claim 1,
wherein the shift number is determined so as to match a peak of the speech source signal with a peak of the code vector selected.
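Peak matching, as in claim 11, avoids a search: the shift number is simply the offset that lines up the two peaks. The sketch below assumes that a "peak" is the sample of largest absolute amplitude and that the shift is cyclic toward later positions; `peak_index` and `peak_matching_shift` are hypothetical names.

```python
def peak_index(signal):
    """Index of the sample with the largest absolute amplitude (first one on ties)."""
    return max(range(len(signal)), key=lambda i: abs(signal[i]))


def peak_matching_shift(source, code_vector):
    """Shift number that moves the code vector's peak onto the source signal's peak."""
    n = len(code_vector)
    return (peak_index(source) - peak_index(code_vector)) % n
```

With a source peak at position 3 and a code vector peak at position 1 in four-sample vectors, the function returns 2, the number of positions the code vector must be rotated.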
12. Speech synthesis apparatus for synthesizing a speech signal by filtering a speech source signal through a synthesis filter, comprising:
speech source signal codebook means for storing a plurality of speech source signals as a code vector;
unit dictionary memory means for storing a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector in said speech source signal codebook means and a shift number for the code vector to decode the speech source signal;
unit selection means for selecting a synthesis unit corresponding to phonemic symbols to be synthesized from said unit dictionary memory means; and
synthesis unit decode means for selecting the code vector corresponding to the speech source signal index in the synthesis unit from said speech source signal codebook means, and for shifting the code vector according to the shift number in the synthesis unit.
13. The speech synthesis apparatus according to claim 12,
wherein said speech signal is previously coded as the speech source signal index of the code vector, the shift number and a gain value so that said speech signal is almost equal to a synthesized speech signal generated by multiplication of the gain value with the shifted code vector.
14. The speech synthesis apparatus according to claim 12,
further comprising a gain codebook means for storing a plurality of gain values as a code vector to decode the speech source signal, wherein the synthesis unit includes a gain index of the code vector in said gain codebook in addition to the index of the code vector in said speech source signal codebook and the shift number.
15. The speech synthesis apparatus according to claim 14,
wherein said synthesis unit decode means selects the gain value corresponding to the gain index in the synthesis unit from said gain codebook means, and multiplies the gain value with the shifted code vector.
16. The speech synthesis apparatus according to claim 12,
further comprising a coefficient codebook means for storing a plurality of coefficients as a code vector, each of which represents characteristics of the synthesis filter to input the speech source signal, wherein the synthesis unit includes a coefficient index of the code vector in said coefficient codebook in addition to the index of the code vector in said speech source signal codebook and the shift number.
17. The speech synthesis apparatus according to claim 16,
wherein said synthesis unit decode means selects the coefficient corresponding to the coefficient index in the synthesis unit from said coefficient codebook means, and supplies the coefficient to the synthesis filter.
18. The speech synthesis apparatus according to claim 12,
wherein said synthesis unit decode means cyclically shifts the code vector according to the shift number.
19. The speech synthesis apparatus according to claim 12,
wherein said synthesis unit decode means selects the code vector corresponding to the speech source signal index, and shifts a requantized code vector according to the shift number.
20. The speech synthesis apparatus according to claim 12,
wherein the shift number is determined to minimize distortion between an original speech signal and a synthesis speech signal generated by the synthesis filter filtering a shifted speech source signal, a coefficient obtained by analyzing the original speech signal being supplied to the synthesis filter.
21. The speech synthesis apparatus according to claim 12,
wherein the shift number is determined to minimize distortion between a target speech signal generated by a target speech signal synthesis filter filtering the speech source signal and a synthesis speech signal generated by the synthesis filter filtering a shifted speech source signal, a coefficient corresponding to the speech source signal being supplied to the target speech signal synthesis filter and the synthesis filter.
22. The speech synthesis apparatus according to claim 12,
wherein the shift number is determined so as to match a peak of the speech source signal with a peak of the code vector selected from said speech source code memory means.
23. A computer readable memory containing computer-readable instructions to synthesize a speech signal by filtering a speech source signal through a synthesis filter, comprising the steps of:
instruction means for causing a computer to store a plurality of speech source signals as a code vector in a speech source signal codebook;
instruction means for causing a computer to store a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector and a shift number for the code vector to decode the speech source signal in a unit dictionary memory;
instruction means for causing a computer to select a synthesis unit corresponding to phonemic symbols to be synthesized from said unit dictionary memory;
instruction means for causing a computer to select the code vector corresponding to the speech source signal index in the synthesis unit from said speech source signal codebook; and
instruction means for causing a computer to shift the code vector according to the shift number in the synthesis unit.
Specification