Speech synthesizer
Abstract
The present invention is a speech synthesizer that generates speech data for text including a fixed part and a variable part by combining recorded speech with rule-based synthetic speech. The synthesizer achieves high quality by concatenating recorded speech and synthetic speech so that discontinuities in timbre and prosody are not perceived. The speech synthesizer includes: a recorded speech database that previously stores recorded speech data including a recorded fixed part; a rule-based synthesizer that generates rule-based synthetic speech data including a variable part and at least part of the fixed part, from received text; a concatenation boundary calculator that calculates a concatenation boundary position in a region in which the recorded speech data and the rule-based synthetic speech data overlap, based on acoustic characteristics of the recorded speech data and the rule-based synthetic speech data that correspond to the text; and a concatenative synthesizer that generates synthetic speech data corresponding to the text by concatenating the recorded speech data and the rule-based synthetic speech data that are segmented at the concatenation boundary position.
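The boundary selection described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the acoustic characteristics are represented or compared, so the following are assumptions made only for illustration: each stream is reduced to one feature vector per frame over the overlap region, the two streams are frame-aligned, the cost is a Euclidean distance, and the boundary is the frame with the lowest cost. The names `select_boundary`, `concatenate_at`, and `hop` are hypothetical.

```python
import math

def frame_distance(a, b):
    # Euclidean distance between two per-frame feature vectors (assumed cost).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_boundary(rec_feats, syn_feats):
    """Return the overlap frame index where recorded and synthetic speech
    are acoustically closest (assumed selection criterion)."""
    if not rec_feats or len(rec_feats) != len(syn_feats):
        raise ValueError("overlap features must be non-empty and frame-aligned")
    costs = [frame_distance(r, s) for r, s in zip(rec_feats, syn_feats)]
    return min(range(len(costs)), key=costs.__getitem__)

def concatenate_at(rec_samples, syn_samples, boundary_frame, hop, recorded_first=True):
    """Cut both sample lists at the boundary and join the two kept pieces.
    Both lists are assumed to cover the same overlap region, sample-aligned;
    hop is the assumed number of samples per frame."""
    cut = boundary_frame * hop
    if recorded_first:
        return rec_samples[:cut] + syn_samples[cut:]
    return syn_samples[:cut] + rec_samples[cut:]

if __name__ == "__main__":
    rec = [[1.0, 0.0], [2.0, 1.0], [3.0, 3.0]]
    syn = [[4.0, 0.0], [2.0, 1.5], [0.0, 0.0]]
    b = select_boundary(rec, syn)
    print(b, concatenate_at([1] * 6, [2] * 6, b, hop=2))
```

A real system would likely smooth the join and weight timbre and prosody features differently; this sketch only shows the idea of picking the cut point inside the overlap and not at a fixed text position.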
21 Claims
1. A speech synthesizer that synthesizes text including a fixed part and a variable part, comprising:
a recorded speech database that previously stores first speech data being speech data including the fixed part, generated based on recorded speech;
a rule-based synthesizer that generates second speech data including the variable part and at least part of the fixed part from the received text;
a concatenation boundary calculator that selects the position of a concatenation boundary between the recorded speech data and speech data generated by rule-based synthesis, based on acoustic characteristics of a region in which the first speech data and the second speech data that correspond to the text overlap; and
a concatenative synthesizer that synthesizes speech data of the text by concatenating third speech data produced by separating the first speech data in the concatenation boundary, and fourth speech data segmented by separating the second speech data in the concatenation boundary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A speech synthesizer that synthesizes text including a fixed part and a variable part, comprising:
a recorded speech database that previously stores recorded speech data including the recorded fixed part;
a rule-based synthesizer that generates rule-based synthetic speech data including the variable part and at least part of the fixed part from the received text;
a concatenation boundary calculator that calculates a concatenation boundary position in a region in which the recorded speech data and the rule-based synthetic speech data overlap, based on acoustic characteristics of the recorded speech data and the rule-based synthetic speech data that correspond to the text; and
a concatenative synthesizer that concatenates the recorded speech data and the rule-based synthetic speech data that are segmented in the concatenation boundary position, to generate synthetic speech data corresponding to the text.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A speech synthesizer that synthesizes text including a fixed part and a variable part, comprising:
a recorded speech database that previously stores recorded speech data including the recorded fixed part;
a rule-based synthetic parameter calculator that calculates rule-based synthetic parameters including the variable part and at least part of the fixed part from the received text to generate acoustic characteristics of rule-based synthetic speech;
a concatenation boundary calculator that calculates a concatenation boundary position in a region in which the recorded speech data and the rule-based synthetic speech data overlap, using acoustic characteristics of the recorded speech data and acoustic characteristics of the rule-based synthetic speech data;
a rule-based synthetic speech data part that generates rule-based synthetic speech data by using acoustic characteristics of the recorded speech, acoustic characteristics of the rule-based synthetic speech, and the concatenation boundary position;
a concatenative synthesizer that concatenates the recorded speech data and the rule-based synthetic speech data that are segmented in the concatenation boundary position, to generate synthetic speech data corresponding to the text; and
a means that outputs the synthetic speech data.
20. A speech synthesizer that creates synthetic speech by concatenating a speech block including a variable part and a speech block including a fixed part, previously recorded, comprising:
a recorded speech database that stores speech data including the speech blocks previously recorded;
an input parser that generates intermediate code of a speech block of the variable part, and intermediate code of a speech block of the fixed part, from received input text;
a recorded speech selector that selects appropriate recorded speech data from among plural recorded speech data having the same fixed part according to the input of the variable part;
a rule-based synthesizer that uses intermediate code of a speech block of the variable part obtained by the input parser, and intermediate code of a speech block of the fixed part that are obtained in the input parser to determine the range of generating rule-based synthetic speech data;
a concatenation boundary calculator that calculates a concatenation boundary position in a region in which the recorded speech data and the rule-based synthetic speech data overlap, using acoustic characteristics of the recorded speech data and acoustic characteristics of the rule-based synthetic speech data;
a concatenative synthesizer that uses the concatenation boundary position obtained from the concatenation boundary calculator to cut off the recorded speech data and the rule-based synthetic speech data, and generates synthetic speech data corresponding to a speech block including the variable part by concatenating the recorded speech data and the rule-based synthetic speech data that are cut off; and
a speech block concatenator that concatenates speech blocks, based on the order of speech blocks obtained from the input text, and creates output speech.
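The recorded speech selector of claim 20 picks one recording from several that share the same fixed part, depending on the variable part of the input. The claim does not state the selection criterion, so the sketch below assumes one purely for illustration: prefer the recording whose originally spoken variable part shares the longest trailing run of symbols with the input variable part, then the closest length. The `Recording` type, its fields, and the assumption that the variable part precedes the fixed part are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Recording:
    name: str
    fixed_code: tuple      # intermediate code of the fixed part
    variable_code: tuple   # intermediate code spoken in the variable slot when recorded

def common_suffix_len(a, b):
    # Number of matching symbols counted from the end of both sequences.
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def select_recording(recordings, fixed_code, input_variable_code):
    """Among recordings with the requested fixed part, return the one whose
    recorded variable context best matches the input (assumed criterion)."""
    fixed_code = tuple(fixed_code)
    target = tuple(input_variable_code)
    candidates = [r for r in recordings if r.fixed_code == fixed_code]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (
            common_suffix_len(r.variable_code, target),
            -abs(len(r.variable_code) - len(target)),
        ),
    )

if __name__ == "__main__":
    fixed = ("e", "k", "i")
    db = [
        Recording("rec_a", fixed, ("t", "o", "k", "y", "o")),
        Recording("rec_b", fixed, ("o", "s", "a", "k", "a")),
        Recording("rec_c", ("y", "u", "k", "i"), ("n", "a", "r", "a")),
    ]
    print(select_recording(db, fixed, ("n", "a", "r", "a")).name)
```

Matching the context adjacent to the fixed part is one plausible way to keep coarticulation at the join similar between the recording and the synthetic speech; other criteria, such as accent type or length, could be substituted.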
21. A speech synthesizing method comprising:
a first step of previously storing recorded speech data and first intermediate code corresponding to the recorded speech data to prepare for input text;
a second step of converting the input text into second intermediate code;
a third step of referring to the first intermediate code to distinguish the second intermediate code into a fixed part corresponding to the first intermediate code and a variable part not corresponding to it;
a fourth step of acquiring a part of the first intermediate code that corresponds to the fixed part, from the recorded speech data;
a fifth step of using the second intermediate code to generate rule-based synthetic speech data of the whole of a part corresponding to the variable part and at least part of a part corresponding to the fixed part; and
a sixth step of concatenating the acquired part of the recorded speech data and part of the generated rule-based synthetic speech data.
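The third and fifth steps of claim 21 can be sketched as follows. The claim does not define the form of the intermediate code or how the comparison is made, so this sketch assumes that both codes are sequences of symbols, that the fixed part is whatever matches the stored code at the start and end of the input, and that the synthesis range extends a fixed number of symbols (`margin`, a hypothetical parameter) into the fixed part on each side. Word tokens stand in for intermediate code only to keep the example readable.

```python
def split_fixed_variable(stored_code, input_code):
    """Return (start, end) so that input_code[start:end] is the variable part;
    symbols before start and from end onward match the stored code (fixed part)."""
    limit = min(len(stored_code), len(input_code))
    prefix = 0
    while prefix < limit and stored_code[prefix] == input_code[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and stored_code[-1 - suffix] == input_code[-1 - suffix]:
        suffix += 1
    return prefix, len(input_code) - suffix

def synthesis_range(var_start, var_end, total_len, margin=1):
    """Range of symbols to synthesize by rule: the whole variable part plus
    `margin` symbols of the neighbouring fixed part, giving an overlap region."""
    return max(0, var_start - margin), min(total_len, var_end + margin)

if __name__ == "__main__":
    stored = ["next", "stop", "is", "tokyo", "station"]
    text = ["next", "stop", "is", "osaka", "station"]
    start, end = split_fixed_variable(stored, text)
    lo, hi = synthesis_range(start, end, len(text), margin=1)
    print(text[start:end], text[lo:hi])
```

Synthesizing slightly past the variable part is what creates the overlap between recorded and rule-based speech, so that the sixth step can join them somewhere inside that overlap.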
Specification