Speech synthesis method and apparatus, and dictionary generation method and apparatus
Abstract
In a speech synthesis process, micro-segments are cut from acquired waveform data by applying a window function. The micro-segments are re-arranged to implement a desired prosody, and the re-arranged micro-segments are superposed to obtain synthetic speech waveform data. A spectrum correction filter is formed based on the acquired waveform data, and at least one of the waveform data, the micro-segments, and the superposed data is corrected using this filter. This reduces the “blur” of the speech spectrum caused by the window function applied to obtain the micro-segments, so that speech is synthesized with high sound quality.
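The pipeline in the abstract can be illustrated with a minimal sketch. The text here does not say how the spectrum correction filter is constructed, so the sketch assumes an LPC-based formant-emphasis filter of the form A(z/beta)/A(z/gamma), a form common in speech-coding postfilters. All function names, the fixed-length Hann window, the `beta` and `gamma` values, and the `pitch_marks`/`new_marks` inputs are illustrative assumptions and are not taken from the patent.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def cut_micro_segments(wave, pitch_marks, half_len):
    """Cut one windowed micro-segment centred on each pitch mark."""
    win = hann(2 * half_len + 1)
    segments = []
    for m in pitch_marks:
        seg = []
        for k in range(-half_len, half_len + 1):
            idx = m + k
            sample = wave[idx] if 0 <= idx < len(wave) else 0.0
            seg.append(sample * win[k + half_len])
        segments.append(seg)
    return segments

def overlap_add(segments, new_marks, out_len, half_len):
    """Superpose the micro-segments, centring segment i on new_marks[i]."""
    out = [0.0] * out_len
    for seg, m in zip(segments, new_marks):
        for k, v in enumerate(seg):
            idx = m + k - half_len
            if 0 <= idx < out_len:
                out[idx] += v
    return out

def lpc(wave, order):
    """Autocorrelation-method LPC via Levinson-Durbin.
    Returns a[1..order] with x[n] ~ sum_j a[j] * x[n-j]."""
    r = [sum(wave[i] * wave[i + lag] for i in range(len(wave) - lag))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def formant_emphasis(x, a, beta=0.5, gamma=0.8):
    """Assumed correction filter A(z/beta)/A(z/gamma), A(z) = 1 - sum a_j z^-j."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for j, aj in enumerate(a, start=1):
            if n - j >= 0:
                acc -= aj * beta ** j * x[n - j]   # numerator (FIR) part
                acc += aj * gamma ** j * y[n - j]  # denominator (IIR) part
        y.append(acc)
    return y

def synthesize(wave, pitch_marks, new_marks, half_len, order=8):
    """Correct the source waveform, cut micro-segments, re-space, superpose."""
    a = lpc(wave, order)                   # filter formed from the waveform itself
    corrected = formant_emphasis(wave, a)  # correcting the waveform is one of three options
    segs = cut_micro_segments(corrected, pitch_marks, half_len)
    out_len = new_marks[-1] + half_len + 1
    return overlap_add(segs, new_marks, out_len, half_len)
```

In this sketch the re-arrangement only re-spaces the segments, which changes pitch; repeating or dropping segments to change duration is not shown. The correction is applied to the source waveform, but the same filter could instead be applied to each micro-segment or to the superposed output, which are the other two alternatives named in the abstract.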
29 Claims
1. A speech synthesis method comprising:
an acquisition step of acquiring micro-segments from speech waveform data and a window function;
a re-arrangement step of re-arranging the micro-segments acquired in the acquisition step to change prosody upon synthesis;
a synthesis step of outputting synthetic speech waveform data on the basis of superposed waveform data obtained by superposing the micro-segments re-arranged in the re-arrangement step; and
a correction step of correcting at least one of the speech waveform data, the micro-segments, and the superposed waveform data using a spectrum correction filter formed based on the speech waveform data to be processed in the acquisition step.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 29
19. A method of generating a dictionary used in a speech synthesis process, comprising:
a generation step of generating a spectrum correction filter on the basis of speech waveform data to be stored in a speech synthesis dictionary; and
a storage step of storing the spectrum correction filter generated in the generation step in correspondence with the speech waveform data.
Dependent claims: 20
21. A method of generating a dictionary used in a speech synthesis process, comprising:
a first generation step of generating a spectrum correction filter on the basis of each of speech waveform data to be stored in a speech synthesis dictionary;
a second generation step of generating spectrum-corrected speech waveform data by applying the spectrum correction filter to the corresponding speech waveform data; and
a storage step of storing the spectrum-corrected speech waveform data generated in the second generation step in the speech synthesis dictionary.
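Claims 19 and 21 differ in what the dictionary stores: claim 19 keeps each waveform together with its filter, while claim 21 applies the filter once and stores only the corrected waveform. The sketch below contrasts the two layouts. The `Entry` type, the function names, and the two-tap stand-in filter derived from the lag-1 correlation are hypothetical and serve only to make the example runnable; the claims do not define the filter design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Entry:
    wave: List[float]                   # stored speech waveform data
    filt: Optional[List[float]] = None  # FIR taps, if the filter is stored

def make_filter(wave):
    """Hypothetical stand-in: two-tap filter derived from the waveform itself."""
    r0 = sum(v * v for v in wave)
    r1 = sum(wave[i] * wave[i + 1] for i in range(len(wave) - 1))
    c = r1 / r0 if r0 > 0 else 0.0
    return [1.0, -0.5 * c]

def apply_filter(wave, taps):
    """Plain FIR filtering with zero initial state."""
    return [sum(t * wave[n - j] for j, t in enumerate(taps) if n - j >= 0)
            for n in range(len(wave))]

def build_with_filters(units: Dict[str, List[float]]) -> Dict[str, Entry]:
    """Claim 19 style: store the uncorrected waveform and its filter together."""
    return {name: Entry(wave=list(w), filt=make_filter(w)) for name, w in units.items()}

def build_precorrected(units: Dict[str, List[float]]) -> Dict[str, Entry]:
    """Claim 21 style: apply the filter at build time, store only the result."""
    return {name: Entry(wave=apply_filter(w, make_filter(w))) for name, w in units.items()}
```

Under these assumptions, the first layout defers the filtering to synthesis time, so the filter can still be applied to micro-segments or to the superposed output. The second layout moves the filtering cost to dictionary generation and needs no filter storage.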
22. A method of generating a dictionary used in a speech synthesis process, comprising:
a first generation step of generating a replacement filter which replaces a spectrum correction filter obtained based on speech waveform data;
a second generation step of generating modified waveform data by processing the speech waveform data to correct an influence of use of the replacement filter; and
a storage step of storing the replacement filter generated in the first generation step in correspondence with the modified waveform data generated in the second generation step.
Dependent claims: 23, 24
25. A speech synthesis apparatus comprising:
acquisition means for acquiring micro-segments from speech waveform data and a window function;
re-arrangement means for re-arranging the micro-segments acquired by said acquisition means to change prosody upon synthesis;
synthesis means for outputting synthetic speech waveform data on the basis of superposed waveform data obtained by superposing the micro-segments re-arranged by said re-arrangement means; and
correction means for correcting at least one of the speech waveform data, the micro-segments, and the superposed waveform data using a spectrum correction filter formed based on the speech waveform data to be processed by said acquisition means.
26. A generation apparatus of a dictionary used in a speech synthesis process, comprising:
generation means for generating a spectrum correction filter on the basis of speech waveform data to be stored in a speech synthesis dictionary; and
storage means for storing the spectrum correction filter generated by said generation means in correspondence with the speech waveform data.
27. A generation apparatus of a dictionary used in a speech synthesis process, comprising:
first generation means for generating a spectrum correction filter on the basis of each of speech waveform data to be stored in a speech synthesis dictionary;
second generation means for generating spectrum-corrected speech waveform data by applying the spectrum correction filter to the corresponding speech waveform data; and
storage means for storing the spectrum-corrected speech waveform data generated by said second generation means in the speech synthesis dictionary.
28. A generation apparatus of a dictionary used in a speech synthesis process, comprising:
first generation means for generating a replacement filter which replaces a spectrum correction filter obtained based on speech waveform data;
second generation means for generating modified waveform data by processing the speech waveform data to correct an influence of use of the replacement filter; and
storage means for storing the replacement filter generated by said first generation means in correspondence with the modified waveform data generated by said second generation means.
Specification