Method and apparatus for speech synthesis, program, recording medium, method and apparatus for generating constraint information and robot apparatus
Abstract
Emotion is added to the synthesized speech while the prosodic features of the language are maintained. In a speech synthesis device 200, a language processor 201 generates a string of pronunciation marks from the text, and a prosodic data generating unit 202 creates prosodic data, expressing parameters of the phonemes such as time duration, pitch and sound volume, based on the string of pronunciation marks. A constraint information generating unit 203, fed with the prosodic data and with the string of pronunciation marks, generates constraint information that limits changes in those parameters and adds the constraint information so generated to the prosodic data. An emotion filter 204, fed with the prosodic data to which the constraint information has been added, changes the parameters of the prosodic data, within the constraints, responsive to the emotion state information imparted to it. A waveform generating unit 205 synthesizes the speech waveform based on the prosodic data whose parameters have been changed.
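The abstract's pipeline can be illustrated with a minimal Python sketch. All function names, parameter ranges, and scaling factors below are illustrative assumptions, not the patent's actual implementation; the point is only the data flow: prosodic data is generated, constraint bounds are derived from it, and the emotion-driven parameter change is clamped to those bounds before synthesis.

```python
# Hypothetical sketch of the abstract's pipeline (units 202-204).
# Names and numbers are assumptions for illustration only.

def make_prosodic_data(pronunciation_marks):
    """Unit 202 (sketch): per-phoneme parameters (duration ms, pitch Hz, volume)."""
    return [{"phoneme": p, "duration": 100.0, "pitch": 200.0, "volume": 1.0}
            for p in pronunciation_marks]

def make_constraints(prosodic_data):
    """Unit 203 (sketch): bounds that keep the utterance's prosodic features
    intact when the emotion filter later changes the parameters."""
    return [{"pitch_min": d["pitch"] * 0.8, "pitch_max": d["pitch"] * 1.5}
            for d in prosodic_data]

def apply_emotion(prosodic_data, constraints, emotion):
    """Unit 204 (sketch): change parameters responsive to the emotion state,
    clamped within the constraint information."""
    scale = {"calm": 1.0, "happy": 1.3, "sad": 0.7}.get(emotion, 1.0)
    changed = []
    for d, c in zip(prosodic_data, constraints):
        pitch = min(max(d["pitch"] * scale, c["pitch_min"]), c["pitch_max"])
        changed.append({**d, "pitch": pitch})
    return changed
```

Note how "sad" would push the pitch below the lower bound, so the constraint clamps it: the emotional colouring is applied only as far as the prosodic feature of the language permits.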
63 Claims
1. A speech synthesis method for receiving information on the emotion to synthesize the speech, comprising:
a prosodic data forming step of forming prosodic data from a string of pronunciation marks which is based on an uttered text, uttered as speech;
a constraint information generating step of generating the constraint information used for maintaining prosodic features of the uttered text;
a parameter changing step of changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
a speech synthesis step of synthesizing the speech based on said prosodic data the parameters of which have been changed in said parameter changing step. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
17. A speech synthesis method for receiving information on the emotion to synthesize the speech, comprising:
a data inputting step for inputting prosodic data which is based on the text uttered as speech and the constraint information for maintaining the prosodic feature of said uttered text;
a parameter changing step of changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
a speech synthesis step of synthesizing the speech based on the prosodic data the parameters of which have been changed in said parameter changing step. - View Dependent Claims (18, 19)
20. A speech synthesis apparatus for receiving information on the emotion to synthesize the speech, comprising:
prosodic data generating means for generating prosodic data from a string of pronunciation marks which is based on a text uttered as speech;
constraint information generating means for generating the constraint information adapted for maintaining the prosodic feature of said uttered text;
parameter changing means for changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
speech synthesis means for synthesizing the speech based on said prosodic data the parameters of which have been changed by said parameter changing means. - View Dependent Claims (21)
22. A speech synthesis apparatus for receiving information on the emotion to synthesize the speech, comprising:
data inputting means for inputting prosodic data which is based on the uttered text uttered as speech, and the constraint information for maintaining the prosodic feature of said uttered text;
parameter changing means for changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
speech synthesis means for synthesizing the speech based on said prosodic data the parameters of which have been changed by said parameter changing means. - View Dependent Claims (23)
24. A program product for having a computer execute the processing for receiving information on the emotion to synthesize the speech, comprising:
a prosodic data forming step of forming prosodic data from a string of pronunciation marks which is based on an uttered text, uttered as speech;
a constraint information generating step of generating the constraint information used for maintaining prosodic features of the uttered text;
a parameter changing step of changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
a speech synthesis step of synthesizing the speech based on said prosodic data the parameters of which have been changed in said parameter changing step. - View Dependent Claims (25)
26. A program product loadable into a computer for having the computer perform the processing of receiving information on the emotion to synthesize the speech, comprising:
a data inputting step for inputting prosodic data which is based on the text uttered as speech and the constraint information for maintaining the prosodic feature of said uttered text;
a parameter changing step of changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
a speech synthesis step of synthesizing the speech based on the prosodic data the parameters of which have been changed in said parameter changing step. - View Dependent Claims (27)
28. A computer-readable recording medium on which there is recorded a program for having a computer execute the processing of receiving information on the emotion to synthesize the speech, comprising:
a prosodic data forming step of forming prosodic data from a string of pronunciation marks which is based on an uttered text, uttered as speech;
a constraint information generating step of generating the constraint information used for maintaining prosodic features of the uttered text;
a parameter changing step of changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
a speech synthesis step of synthesizing the speech based on said prosodic data the parameters of which have been changed in said parameter changing step. - View Dependent Claims (29)
30. A recording medium on which there is recorded a program adapted for having a computer perform the processing of receiving information on the emotion to synthesize the speech, comprising:
a data inputting step for inputting prosodic data which is based on the text uttered as speech and the constraint information for maintaining the prosodic feature of said uttered text;
a parameter changing step of changing parameters of said prosodic data, in consideration of said constraint information, responsive to the information on the emotion; and
a speech synthesis step of synthesizing the speech based on the prosodic data, the parameters of which have been changed in said parameter changing step. - View Dependent Claims (31)
32. A method for generating the constraint information comprising:
a constraint information generating step of being fed with a string of pronunciation marks specifying an uttered text, uttered as speech, for generating the constraint information for maintaining the prosodic feature of said uttered text when changing parameters of prosodic data prepared from said string of pronunciation marks in accordance with the parameter change control information. - View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44)
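The constraint-generation step of claim 32 can be sketched as follows, under the assumption (mine, for illustration) that the prosodic feature to be maintained is the relative pitch ordering around an accent mark in the string of pronunciation marks; the apostrophe accent notation and all function names are hypothetical.

```python
# Sketch of claim 32's constraint generation: derive, from the string of
# pronunciation marks, ordering constraints that any later parameter change
# must respect. The accent notation ("'" suffix) is an assumed convention.

def generate_constraints(pronunciation_marks, prosodic_data):
    """For each accented mark, record that its phoneme's pitch must remain
    above the following phoneme's pitch after parameters are changed."""
    constraints = []
    for i, mark in enumerate(pronunciation_marks):
        if mark.endswith("'") and i + 1 < len(prosodic_data):
            constraints.append(("pitch_order", i, i + 1))  # pitch[i] > pitch[i+1]
    return constraints

def satisfies(prosodic_data, constraints):
    """Check changed prosodic data against the constraint information."""
    return all(prosodic_data[a]["pitch"] > prosodic_data[b]["pitch"]
               for kind, a, b in constraints if kind == "pitch_order")
```

A parameter-changing step would then accept a candidate change only when `satisfies` still holds, which is one plausible reading of "maintaining the prosodic feature of said uttered text".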
45. An apparatus for generating the constraint information comprising:
constraint information generating means for being fed with a string of pronunciation marks specifying an uttered text, uttered as speech, for generating the constraint information for maintaining the prosodic feature of said uttered text when changing parameters of prosodic data prepared from said string of pronunciation marks in accordance with the parameter change control information. - View Dependent Claims (46, 47)
48. An autonomous robot apparatus performing a movement based on the input information supplied thereto, comprising:
an emotion model ascribable to said movement;
emotion discrimination means for discriminating the emotion state of said emotion model;
prosodic data creating means for creating prosodic data from a string of pronunciation marks which is based on the text uttered as speech;
constraint information generating means for generating the constraint information adapted for maintaining the prosodic feature of said uttered text;
parameter changing means for changing parameters of said prosodic data, in consideration of said constraint information, responsive to the emotion state discriminated by said discriminating means; and
speech synthesizing means for synthesizing the speech based on said prosodic data the parameters of which have been changed by the parameter changing means. - View Dependent Claims (49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60)
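The robot apparatus of claim 48 adds an emotion model and discrimination means in front of the speech path. A minimal sketch, assuming a toy emotion model whose dominant internal state selects per-parameter scaling factors (the emotion names and factor values are my illustrative assumptions, not the patent's):

```python
# Hypothetical sketch of claim 48's flow: emotion model -> discrimination ->
# parameter change -> synthesis. Factor values are illustrative only.

EMOTION_FACTORS = {            # per-emotion scaling of prosodic parameters
    "joy":     {"pitch": 1.2, "duration": 0.9, "volume": 1.1},
    "sadness": {"pitch": 0.9, "duration": 1.2, "volume": 0.8},
    "neutral": {"pitch": 1.0, "duration": 1.0, "volume": 1.0},
}

class EmotionModel:
    """Toy emotion model: internal state levels ascribable to movement."""
    def __init__(self, levels):
        self.levels = levels   # e.g. {"joy": 0.7, "sadness": 0.1}

    def discriminate(self):
        """Emotion discrimination means: return the dominant emotion state."""
        if not self.levels:
            return "neutral"
        return max(self.levels, key=self.levels.get)

def change_parameters(prosodic_data, emotion):
    """Scale each numeric prosodic parameter by the emotion's factor."""
    factors = EMOTION_FACTORS.get(emotion, EMOTION_FACTORS["neutral"])
    return [{k: (v * factors[k] if k in factors else v) for k, v in d.items()}
            for d in prosodic_data]
```

In the full claim, the output of `change_parameters` would additionally be checked against the constraint information before the speech synthesizing means produces the waveform.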
61. An autonomous robot apparatus performing a movement based on the input information supplied thereto, comprising:
an emotion model ascribable to said movement;
emotion discrimination means for discriminating the emotion state of said emotion model;
data inputting means for inputting prosodic data which is based on the text uttered as speech and the constraint information for maintaining the prosodic feature of said uttered text;
parameter changing means for changing parameters of said prosodic data, in consideration of said constraint information, responsive to the emotion state discriminated by said discriminating means; and
speech synthesizing means for synthesizing the speech based on said prosodic data, the parameters of which have been changed by the parameter changing means. - View Dependent Claims (62, 63)
Specification