Speech synthesis device, speech synthesis method, and program
Abstract
Provided is a simply configured speech synthesis device, and the like, for producing natural synthetic speech at high speed. When data representing a message template is supplied, a voice unit editor (5) searches a voice unit database (7) for voice unit data on a voice unit whose sound matches a voice unit in the message template. Further, the voice unit editor (5) predicts the cadence of the message template and, according to the cadence prediction result, selects from the retrieved voice unit data, one at a time, a best match for each voice unit in the message template. For a voice unit for which no match can be selected, an acoustic processor (41) is instructed to supply waveform data representing the waveform of each unit voice. The selected voice unit data and the waveform data supplied by the acoustic processor (41) are combined to generate data representing synthetic speech.
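The flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in and not taken from the patent: `VoiceUnit`, `predict_pitch` (a toy linear pitch decline used in place of real cadence prediction), and `fallback_waveform` (a placeholder for the acoustic processor). The sketch only shows the control flow: look up candidates by sound, pick the candidate closest to the predicted cadence, and fall back to generated waveform data when no candidate exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceUnit:
    reading: str      # sound/reading of the recorded unit
    pitch_hz: float   # average pitch, used here as a crude cadence feature
    samples: tuple    # waveform samples of the recording


def predict_pitch(index, total):
    """Toy cadence prediction (assumption): pitch falls linearly over the template."""
    return 220.0 - 40.0 * index / max(total - 1, 1)


def fallback_waveform(reading):
    """Placeholder for the acoustic processor: one dummy sample per character."""
    return tuple(float(ord(c) % 7) for c in reading)


def synthesize(template, database):
    """Return (samples, sources) for a template given as a list of readings."""
    output, sources = [], []
    for i, reading in enumerate(template):
        # Search the database for units whose reading matches.
        candidates = [u for u in database if u.reading == reading]
        if candidates:
            # Select the candidate closest to the predicted cadence.
            target = predict_pitch(i, len(template))
            best = min(candidates, key=lambda u: abs(u.pitch_hz - target))
            output.extend(best.samples)
            sources.append("database")
        else:
            # No match: ask the fallback synthesizer for waveform data.
            output.extend(fallback_waveform(reading))
            sources.append("fallback")
    return output, sources
```

Choosing the candidate by a single pitch distance is a simplification made for brevity; a real selector would presumably compare richer prosodic features.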
41 Claims
1-22. (canceled)
23. A speech synthesis device, the device comprising:
a first storage means for storing a plurality of pieces of voice unit data representative of one or more speech words;
a selection means for selecting voice unit data whose reading is common with a speech word composing inputted sentence information from the plurality of pieces of voice unit data stored in the first storage means;
a missing part synthesis means, for a speech word among the sentence information for which the selection means could not select the voice unit data, for synthesizing speech data representative of a desired speech waveform; and
a synthesis means for combining the voice unit data selected from the selection means and the speech data synthesized by the missing part synthesis means to create data representative of a synthesis speech corresponding to the sentence information, wherein the missing part synthesis means has a second storage means for storing a plurality of pieces of data representative of one or more pitches of voice waveform fragments; and
wherein data representative of voice waveform fragments composing the speech word whose voice unit data could not be selected is acquired from the second storage means and the acquired data is mutually combined to synthesize the speech data representative of the desired speech waveform.
Dependent claims: 24-33, 39-41.
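As an illustration of the missing part synthesis recited in claim 23, the following Python sketch concatenates stored waveform fragments for a word that has no recorded voice unit. The fragment table `FRAGMENTS`, its sample values, the phoneme keys, and the `repeats` parameter are all assumptions made for the example; the claim itself only requires that fragment data be acquired from a second storage and combined.

```python
# Hypothetical "second storage": one short waveform fragment per phoneme.
# The sample values are made up for illustration.
FRAGMENTS = {
    "k": [0.0, 0.5, -0.5],
    "a": [0.2, 0.8, 0.2, -0.6],
}


def synthesize_missing(phonemes, fragments, repeats=2):
    """Concatenate stored fragments to build a waveform for an unmatched word.

    Each fragment is repeated `repeats` times as a crude way to give the
    phoneme some duration (an assumption, not a method from the claims).
    """
    waveform = []
    for phoneme in phonemes:
        if phoneme not in fragments:
            raise KeyError(f"no stored fragment for phoneme {phoneme!r}")
        waveform.extend(fragments[phoneme] * repeats)
    return waveform
```

Plain concatenation is used to keep the sketch short; smoothing at fragment boundaries is deliberately left out.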
34. A speech synthesis device, the device comprising:
a first storage means for storing a plurality of pieces of voice unit data representative of one or more speech words;
a selection means for selecting voice unit data whose reading is common with a speech word composing inputted sentence information from the plurality of pieces of voice unit data stored in the first storage means;
a missing part synthesis means, for a speech word among the sentence information for which the selection means could not select the voice unit data, for synthesizing speech data representative of a desired speech waveform; and
a synthesis means for combining the voice unit data selected from the selection means and the speech data synthesized by the missing part synthesis means to create data representative of a synthesis speech corresponding to the sentence information, wherein the first storage means stores phonetic data representative of a reading of the voice unit data with the phonetic data being associated with the voice unit data, and wherein the selection means operates to handle voice unit data which is associated with phonetic data representative of a reading matching with the reading of the speech word composing the sentence information as voice unit data whose reading is common with the speech word.
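The reading-based selection recited in claim 34 can be pictured as a lookup keyed by phonetic data. In this Python sketch, the function names `build_index` and `select_by_reading`, the use of plain strings as readings, and the choice of the first stored candidate are all assumptions for illustration; the claim only states that voice unit data is associated with phonetic data and selected when the readings match.

```python
from collections import defaultdict


def build_index(units):
    """Associate each piece of voice unit data with its phonetic reading.

    `units` is an iterable of (reading, data) pairs.
    """
    index = defaultdict(list)
    for reading, data in units:
        index[reading].append(data)
    return dict(index)


def select_by_reading(words, index):
    """Split the words of a sentence into matched units and missing words.

    A word counts as matched when its reading equals a stored reading;
    the first stored candidate is returned (a simplifying assumption).
    """
    selected, missing = {}, []
    for word in words:
        if word in index:
            selected[word] = index[word][0]
        else:
            missing.append(word)
    return selected, missing
```

The `missing` list corresponds to the words that would be handed to the missing part synthesis means.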
35. A speech synthesis method, the method comprising the steps of:
storing a plurality of pieces of voice unit data representative of one or more speech words in a first memory;
selecting voice unit data whose reading is common with a speech word composing inputted sentence information from the plurality of pieces of voice unit data stored in the first memory;
synthesizing a missing part, for a speech word among the sentence information for which the voice unit data could not be selected in the selecting step, by synthesizing speech data representative of a desired speech waveform; and
combining the voice unit data selected from the selection means and the speech data synthesized in the missing part synthesizing step to create data representative of a synthesis speech corresponding to the sentence information, wherein the missing part synthesizing step stores a plurality of pieces of data representative of one or more pitches of voice waveform fragments using a second memory; and
wherein data representative of voice waveform fragments composing the speech word whose voice unit data could not be selected is acquired from the second memory and the acquired data is combined to synthesize the speech data representative of the desired speech waveform.
36. A speech synthesis method, the method comprising the steps of:
storing a plurality of pieces of voice unit data representative of one or more speech words in a first memory;
selecting voice unit data whose reading is common with a speech word composing inputted sentence information from the plurality of pieces of voice unit data stored in the first memory;
synthesizing a missing part, for a speech word among the sentence information for which the selection means could not select the voice unit data, by synthesizing speech data representative of a desired speech waveform; and
combining the voice unit data selected from the selection means and the speech data synthesized in the missing part synthesis step to create data representative of a synthesis speech corresponding to the sentence information, wherein the first memory stores phonetic data representative of a reading of the voice unit data with the phonetic data being associated with the voice unit data, and wherein the selecting step handles voice unit data which is associated with phonetic data representative of a reading matching with the reading of the speech word composing the sentence information as voice unit data whose reading is common with the speech word.
37. A computer program causing a computer to operate as:
a first storage means for storing a plurality of pieces of voice unit data representative of one or more speech words;
a selection means for selecting voice unit data whose reading is common with a speech word composing inputted sentence information from the plurality of pieces of voice unit data stored in the first storage means;
a missing part synthesis means, for a speech word among the sentence information for which the selection means could not select the voice unit data, for synthesizing speech data representative of a desired speech waveform; and
a synthesis means for combining the voice unit data selected from the selection means and the speech data synthesized by the missing part synthesis means to create data representative of a synthesis speech corresponding to the sentence information, wherein the missing part synthesis means has a second storage means for storing a plurality of pieces of data representative of one or more pitches of voice waveform fragments; and
wherein data representative of voice waveform fragments composing the speech word whose voice unit data could not be selected is acquired from the second storage means and the acquired data is mutually combined to synthesize the speech data representative of the desired speech waveform.
38. A computer program causing a computer to operate as:
a first storage means for storing a plurality of pieces of voice unit data representative of one or more speech words;
a selection means for selecting voice unit data whose reading is common with a speech word composing inputted sentence information from the plurality of pieces of voice unit data stored in the first storage means;
a missing part synthesis means, for a speech word among the sentence information for which the selection means could not select the voice unit data, for synthesizing speech data representative of a desired speech waveform; and
a synthesis means for combining the voice unit data selected from the selection means and the speech data synthesized by the missing part synthesis means to create data representative of a synthesis speech corresponding to the sentence information, wherein the first storage means stores phonetic data representative of a reading of the voice unit data with the phonetic data being associated with the voice unit data, and wherein the selection means operates to handle voice unit data which is associated with phonetic data representative of a reading matching with the reading of the speech word composing the sentence information as voice unit data whose reading is common with the speech word.
Specification