Speech synthesizer, speech synthesizing method, and program
Abstract
A speech synthesizer that provides high-quality sound along with stable sound quality, including: a target parameter generation unit; a speech element DB; an element selection unit; a mixed parameter judgment unit which determines an optimum parameter combination of target parameters and speech elements; a parameter integration unit which integrates the parameters; and a waveform generation unit which generates synthetic speech. High-quality and stable synthetic speech is generated by combining, per parameter dimension, the parameters with stable sound quality generated by the target parameter generation unit with speech elements with high sound quality and a sense of true speech selected by the element selection unit.
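The per-dimension combination described above can be illustrated with a minimal sketch. It is not the patented implementation: the function name `mix_parameters`, the relative-difference similarity test, and the `threshold` value are all assumptions made for illustration, since the text here does not state how similarity is measured or judged. For each dimension, the sketch keeps the speech element value when the two values are judged similar and falls back to the target parameter value otherwise.

```python
from typing import List, Sequence


def mix_parameters(
    target: Sequence[float],
    element: Sequence[float],
    threshold: float = 0.1,  # hypothetical similarity threshold
) -> List[float]:
    """Combine target parameters and a speech element per dimension.

    Assumption: two values count as "similar" when their relative
    difference is at most `threshold`. The source does not define
    the similarity measure; this is one plausible stand-in.
    """
    if len(target) != len(element):
        raise ValueError("target and element must have the same dimensions")

    mixed: List[float] = []
    for t, e in zip(target, element):
        # Normalize by the larger magnitude; guard against division by zero.
        scale = max(abs(t), abs(e), 1e-9)
        similar = abs(t - e) / scale <= threshold
        # Similar: keep the recorded speech element value.
        # Not similar: fall back to the generated target value.
        mixed.append(e if similar else t)
    return mixed


# Hypothetical four-dimensional parameter vectors (values are made up).
target = [120.0, 500.0, 1500.0, 2500.0]
element = [118.0, 650.0, 1490.0, 2600.0]
print(mix_parameters(target, element))
```

In this example only the second dimension differs by more than the assumed threshold, so the result takes that dimension from the target parameters and the other three from the speech element.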
10 Claims
1. A speech synthesizer comprising:
a target parameter generation unit operable to generate target parameters on an element-by-element basis from information containing at least phonetic symbols, the target parameters being a parameter group through which speech can be synthesized;
a speech element database which stores, on an element-by-element basis, pre-recorded speech as speech elements that are made up of a parameter group in the same format as the target parameters;
an element selection unit operable to select, from said speech element database, a speech element that corresponds to the target parameters;
a parameter group synthesis unit operable to synthesize the parameter group of the target parameters and the parameter group of the speech element by finding the similarity per dimension of the target parameters and the speech element, selecting, based on the similarity per dimension, the speech element in the case where the target parameters and the speech element are judged as being similar, and selecting, based on the similarity per dimension, the target parameters in the case where the target parameters and the speech element are judged as not being similar, and integrating the parameter groups on an element-by-element basis; and
a waveform generation unit operable to generate a synthetic speech waveform based on the synthesized parameter groups.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A speech synthesizing method comprising:
a step of generating target parameters on an element-by-element basis from information containing at least phonetic symbols, the target parameters being a parameter group through which speech can be synthesized;
a step of selecting a speech element that corresponds to the target parameters, from a speech element database which stores, on an element-by-element basis, pre-recorded speech as speech elements that are made up of a parameter group in the same format as the target parameters;
a step of synthesizing the parameter group of the target parameters and the parameter group of the speech element by finding the similarity per dimension of the target parameters and the speech element, selecting, based on the similarity per dimension, the speech element in the case where the target parameters and the speech element are judged as being similar, and selecting, based on the similarity per dimension, the target parameters in the case where the target parameters and the speech element are judged as not being similar, and integrating the parameter groups on an element-by-element basis; and
a step of generating a synthetic speech waveform based on the synthesized parameter groups.
10. A program stored on computer storage memory which causes a computer to execute steps for speech synthesizing, the steps comprising:
a step of generating target parameters on an element-by-element basis from information containing at least phonetic symbols, the target parameters being a parameter group through which speech can be synthesized;
a step of selecting a speech element that corresponds to the target parameters, from a speech element database which stores, on an element-by-element basis, pre-recorded speech as speech elements that are made up of a parameter group in the same format as the target parameters;
a step of synthesizing the parameter group of the target parameters and the parameter group of the speech element by finding the similarity per dimension of the target parameters and the speech element, selecting, based on the similarity per dimension, the speech element in the case where the target parameters and the speech element are judged as being similar, and selecting, based on the similarity per dimension, the target parameters in the case where the target parameters and the speech element are judged as not being similar, and integrating the parameter groups on an element-by-element basis; and
a step of generating a synthetic speech waveform based on the synthesized parameter groups.
Specification