Speech synthesis apparatus and method
Abstract
A text segment selection unit extracts parameters of an exemplary text segment of a user's choice and a fixed form portion in the exemplary text segment from an exemplary text segment database. A text segment input unit inputs a text segment of a user's choice to be embedded in an unfixed form portion in the exemplary text segment. A text segment generation unit concatenates the input text segment to the text segment of the fixed form portion. A parameter generation unit generates a parameter from the concatenated text segment. A parameter extraction unit extracts the parameter of the unfixed form portion from the generated parameter. A parameter embedding unit concatenates the parameter of the unfixed form portion to the parameter of the fixed form portion to generate a parameter for speech synthesis. A synthesis unit generates synthesized speech from this parameter. With this arrangement, more natural synthesis can be realized without any sense of prosodic incongruity between the synthesis-by-rule portion and the analysis portion.
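The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Exemplar` class, the `generate_by_rule` function, and the (character, pitch) frame format are hypothetical stand-ins for the database entry, the synthesis-by-rule step, and real synthesis parameters.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical parameter frame: (character, pitch in Hz).
# Real systems would carry spectral, duration, and power data as well.
Frame = Tuple[str, float]


@dataclass
class Exemplar:
    """One exemplary text segment with a single unfixed slot (assumed layout)."""
    prefix: str                  # fixed text before the slot
    suffix: str                  # fixed text after the slot
    prefix_params: List[Frame]   # parameters obtained by analyzing recorded speech
    suffix_params: List[Frame]


def generate_by_rule(text: str) -> List[Frame]:
    """Toy synthesis-by-rule: pitch falls linearly from 200 Hz to 100 Hz."""
    n = len(text)
    if n == 0:
        return []
    if n == 1:
        return [(text, 200.0)]
    return [(ch, 200.0 - 100.0 * i / (n - 1)) for i, ch in enumerate(text)]


def build_parameters(ex: Exemplar, user_text: str) -> List[Frame]:
    # Text segment generation: embed the user's text in the fixed text.
    full_text = ex.prefix + user_text + ex.suffix
    # Parameter generation: run the rules over the whole sentence so the
    # slot's prosody reflects its surrounding context.
    generated = generate_by_rule(full_text)
    # Parameter extraction: keep only the frames belonging to the slot.
    start = len(ex.prefix)
    slot = generated[start:start + len(user_text)]
    # Parameter embedding: stored fixed-portion frames around the slot frames.
    return ex.prefix_params + slot + ex.suffix_params
```

Generating parameters from the full concatenated text, then discarding everything except the slot, is what lets the rule-generated portion follow the contour of the sentence it sits in; the fixed portions keep their analyzed (recorded) parameters.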
19 Citations
20 Claims
1. A speech synthesis apparatus comprising:
means for storing, for each exemplary text segment containing a fixed form portion having a fixed text segment and an unfixed form portion on which an arbitrary text segment can be specified by a user, exemplary text segment data including context information relating to the fixed form portion to be connected with the unfixed form portion and parameter data obtained by analyzing a speech corresponding to the fixed form portion;
means, responsive to an instruction by a user, for selecting data from among the exemplary text segment data and inputting a text segment corresponding to the unfixed form portion of the selected exemplary text segment data;
means for generating parameter data of at least the unfixed form portion on the basis of the inputted text segment of the unfixed form portion and corresponding context information; and
means for concatenating the generated parameter data of the unfixed form portion to the stored parameter data of the fixed form portion, and generating synthesized speech from the concatenated parameter data.
Dependent claims: 2, 3, 4, 5
6. A speech synthesis apparatus comprising:
means for storing, for each exemplary text segment containing a fixed form portion having a fixed text segment and an unfixed form portion on which an arbitrary text segment can be specified by a user, exemplary text segment data including context information relating to the fixed form portion to be connected with the unfixed form portion and speech waveform data of the fixed form portion;
means, responsive to an instruction by a user, for selecting data from among the exemplary text segment data and inputting a text segment corresponding to the unfixed form portion of the selected exemplary text segment data;
means for generating parameter data of at least the unfixed form portion on the basis of the inputted text segment of the unfixed form portion and corresponding context information, and generating synthesized speech from the generated parameter data; and
means for concatenating speech waveform data of the generated synthesized speech of the unfixed form portion to the stored speech waveform data of the fixed form portion, and generating synthesized speech from the concatenated speech waveform data.
Dependent claims: 7, 8, 9
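Claim 6 differs from claim 1 in that concatenation happens at the waveform level: the unfixed portion is first rendered to speech samples, and those samples are joined to the stored recording. The sketch below illustrates only that ordering. The sine-tone `synthesize` function, the (pitch, duration) parameter format, and the 8000 Hz sample rate are assumptions made for the example and do not come from the patent.

```python
import math
from typing import List, Tuple

SAMPLE_RATE = 8000  # Hz; assumed for this sketch


def synthesize(params: List[Tuple[float, float]]) -> List[float]:
    """Toy synthesizer: render each (pitch_hz, duration_s) pair as a sine tone."""
    samples: List[float] = []
    for pitch_hz, duration_s in params:
        n = round(duration_s * SAMPLE_RATE)
        samples.extend(
            math.sin(2 * math.pi * pitch_hz * i / SAMPLE_RATE) for i in range(n)
        )
    return samples


def concatenate_waveforms(
    prefix_wave: List[float],
    slot_params: List[Tuple[float, float]],
    suffix_wave: List[float],
) -> List[float]:
    # Synthesize the unfixed portion from its generated parameters first...
    slot_wave = synthesize(slot_params)
    # ...then join it, sample by sample, to the stored fixed-portion waveforms.
    return prefix_wave + slot_wave + suffix_wave
```

A practical system would likely smooth the joins (for example with a short cross-fade); that step is omitted here because the claim text does not describe it.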
10. A speech synthesis method comprising the steps of:
providing a database for storing, for each exemplary text segment containing a fixed form portion having a fixed text segment and an unfixed form portion on which an arbitrary text segment can be specified by a user, exemplary text segment data including context information relating to the fixed form portion to be connected with the unfixed form portion and parameter data obtained by analyzing a speech corresponding to the fixed form portion;
in response to an instruction by a user, selecting data from among the exemplary text segment data and inputting a text segment corresponding to the unfixed form portion of the selected exemplary text segment data;
generating parameter data of at least the unfixed form portion on the basis of the inputted text segment of the unfixed form portion and corresponding context information; and
concatenating the generated parameter data of the unfixed form portion to the stored parameter data of the fixed form portion, and generating synthesized speech from the concatenated parameter data.
Dependent claims: 11, 12, 13, 14
15. A speech synthesis method comprising the steps of:
providing a database for storing, for each exemplary text segment containing a fixed form portion having a fixed text segment and an unfixed form portion on which an arbitrary text segment can be specified by a user, exemplary text segment data including context information relating to the fixed form portion to be connected with the unfixed form portion and speech waveform data of the fixed form portion;
in response to an instruction by a user, selecting data from among the exemplary text segment data and inputting a text segment corresponding to the unfixed form portion of the selected exemplary text segment data;
generating parameter data of at least the unfixed form portion on the basis of the inputted text segment of the unfixed form portion and corresponding context information, and generating synthesized speech from the generated parameter data; and
concatenating speech waveform data of the generated synthesized speech of the unfixed form portion to the stored speech waveform data of the fixed form portion, and generating synthesized speech from the concatenated speech waveform data.
Dependent claims: 16, 17, 18
19. A storage medium storing computer-executable program code for performing speech synthesis, the program code comprising:
means for causing a computer to store on a database, for each exemplary text segment containing a fixed form portion having a fixed text segment and an unfixed form portion on which an arbitrary text segment can be specified by a user, exemplary text segment data including context information relating to the fixed form portion to be connected with the unfixed form portion and parameter data obtained by analyzing a speech corresponding to the fixed form portion;
means for causing a computer to select data from among the exemplary text segment data and inputting a text segment corresponding to the unfixed form portion of the selected exemplary text segment data in response to an instruction by a user;
means for causing a computer to generate parameter data of at least the unfixed form portion on the basis of the inputted text segment of the unfixed form portion and corresponding context information; and
means for causing a computer to concatenate the generated parameter data of the unfixed form portion to the stored parameter data of the fixed form portion, and generate synthesized speech from the concatenated parameter data.
20. A storage medium storing computer-executable program code for performing speech synthesis, the program code comprising:
means for causing a computer to store on a database, for each exemplary text segment containing a fixed form portion having a fixed text segment and an unfixed form portion on which an arbitrary text segment can be specified by a user, exemplary text segment data including context information relating to the fixed form portion to be connected with the unfixed form portion and speech waveform data of the fixed form portion;
means for causing a computer to select data from among the exemplary text segment data and inputting a text segment corresponding to the unfixed form portion of the selected exemplary text segment data in response to an instruction by a user;
means for causing a computer to generate parameter data of at least the unfixed form portion on the basis of the inputted text segment of the unfixed form portion and corresponding context information, and generate synthesized speech from the generated parameter data; and
means for causing a computer to concatenate speech waveform data of the generated synthesized speech of the unfixed form portion to the stored speech waveform data of the fixed form portion, and generate synthesized speech from the concatenated speech waveform data.
Specification