Voice synthesis apparatus and method
Abstract
A plurality of voice segments, each including one or more phonemes, are acquired in a time-serial manner in correspondence with the desired words to be sung or spoken. As necessary, a boundary is designated between the start and end points of a vowel phoneme included in any one of the acquired voice segments. A voice is synthesized for either the region of the vowel phoneme that precedes the designated boundary or the region of the vowel phoneme that succeeds it. Synthesizing a voice for the region preceding the designated boundary makes it possible to imitate a vowel sound that a person utters and then stops sounding while the mouth is kept open. Likewise, synthesizing a voice for the region succeeding the designated boundary makes it possible to imitate a vowel sound that begins sounding with the mouth already open.
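The region selection described in the abstract can be illustrated with a minimal sketch. It assumes a voice segment is held as a plain list of samples and that the vowel's start and end points are known as sample indices; the names `VowelSpan`, `region_before`, and `region_after` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VowelSpan:
    """Hypothetical container: where the vowel phoneme sits in the segment."""
    start: int  # index of the first vowel sample
    end: int    # index one past the last vowel sample


def _check(vowel: VowelSpan, boundary: int) -> None:
    # The boundary must be intermediate between the vowel's start and end.
    if not vowel.start < boundary < vowel.end:
        raise ValueError("boundary must lie strictly inside the vowel")


def region_before(samples: List[float], vowel: VowelSpan, boundary: int) -> List[float]:
    """Keep the part of the segment preceding the boundary.

    Intended for a segment whose final region is a vowel: the vowel is
    cut short, imitating a sound that stops with the mouth kept open.
    """
    _check(vowel, boundary)
    return samples[:boundary]


def region_after(samples: List[float], vowel: VowelSpan, boundary: int) -> List[float]:
    """Keep the part of the segment succeeding the boundary.

    Intended for a segment whose initial region is a vowel: the vowel
    starts late, imitating a sound that begins with the mouth already open.
    """
    _check(vowel, boundary)
    return samples[boundary:]
```

How the boundary itself is chosen is left open here; the sketch only shows that a single time point inside the vowel decides which part of the segment is passed on to synthesis.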
22 Citations
9 Claims
1. A voice synthesis apparatus comprising:
a voice segment acquisition section that acquires a voice segment including one or more phonemes;
a boundary designation section that designates a boundary intermediate between start and end positions of a vowel phoneme included in the voice segment acquired by the voice segment acquisition section,
wherein when the acquired voice segment where a region including an end point is a vowel phoneme, the boundary designation section designates, as the boundary, a time point earlier than a stationary point, which is a boundary point between a region where a waveform amplitude of the voice segment is substantially constant and a region where the waveform amplitude of the voice segment varies, and
wherein when the acquired voice segment where a region including a start point is a vowel phoneme, the boundary designation section designates, as the boundary, a time point later than the stationary point; and
a voice synthesis section that synthesizes a voice based on a region of the vowel phoneme that precedes the designated boundary of the vowel phoneme, or a region of the vowel phoneme that succeeds the designated boundary of the vowel phoneme,
wherein the start point and the end point of the vowel phoneme and the designated boundary of the vowel phoneme are time points on a time axis of the acquired voice segment,
wherein when the acquired voice segment where the region including the end point is a vowel phoneme, the voice synthesis section synthesizes the voice based on the region of the voice segment preceding the boundary designated by the boundary designation section, and
wherein when the acquired voice segment where the region including the start point is a vowel phoneme, the voice synthesis section synthesizes the voice based on the region of the voice segment succeeding the boundary designated by the boundary designation section.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-readable storage section storing a computer program executable by a computer for synthesizing a voice, the computer program including computer executable instructions for:
acquiring a voice segment including one or more phonemes;
designating a boundary intermediate between start and end positions of a vowel phoneme included in the voice segment acquired in the voice segment acquiring instruction,
wherein when the acquired voice segment where a region including an end point is a vowel phoneme, the boundary designating instruction designates, as the boundary, a time point earlier than a stationary point, which is a boundary point between a region where a waveform amplitude of the voice segment is substantially constant and a region where the waveform amplitude of the voice segment varies, and
wherein when the acquired voice segment where a region including a start point is a vowel phoneme, the boundary designating instruction designates, as the boundary, a time point later than the stationary point; and
synthesizing a voice based on a region of the vowel phoneme that precedes the designated boundary of the vowel phoneme, or a region of the vowel phoneme that succeeds the designated boundary of the vowel phoneme,
wherein the start point and the end point of the vowel phoneme and the designated boundary of the vowel phoneme are time points on a time axis of the acquired voice segment,
wherein when the acquired voice segment where the region including the end point is a vowel phoneme, the voice synthesizing instruction instructs to synthesize the voice based on the region of the voice segment preceding the boundary designated by the boundary designating instruction, and
wherein when the acquired voice segment where the region including the start point is a vowel phoneme, the voice synthesizing instruction instructs to synthesize the voice based on the region of the voice segment succeeding the boundary designated by the boundary designating instruction.
9. A voice synthesis method for synthesizing a voice using a voice synthesizing apparatus comprising a voice segment acquisition section, a boundary designation section, and a voice synthesis section, the method comprising the steps of:
acquiring a voice segment including one or more phonemes with the voice segment acquisition section;
designating a boundary intermediate between start and end positions of a vowel phoneme included in the voice segment acquired in the voice segment acquiring step with the boundary designation section,
wherein when the acquired voice segment where a region including an end point is a vowel phoneme, the boundary designating step designates, as the boundary, a time point earlier than a stationary point, which is a boundary point between a region where a waveform amplitude of the voice segment is substantially constant and a region where the waveform amplitude of the voice segment varies, and
wherein when the acquired voice segment where a region including a start point is a vowel phoneme, the boundary designating step designates, as the boundary, a time point later than the stationary point; and
synthesizing a voice based on a region of the vowel phoneme that precedes the designated boundary of the vowel phoneme, or a region of the vowel phoneme that succeeds the designated boundary of the vowel phoneme with the voice synthesis section,
wherein the start point and the end point of the vowel phoneme and the designated boundary of the vowel phoneme are time points on a time axis of the acquired voice segment,
wherein when the acquired voice segment where the region including the end point is a vowel phoneme, the voice synthesizing step synthesizes the voice based on the region of the voice segment preceding the boundary designated in the boundary designating step, and
wherein when the acquired voice segment where the region including the start point is a vowel phoneme, the voice synthesizing step synthesizes the voice based on the region of the voice segment succeeding the boundary designated in the boundary designating step.
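The boundary designating step recited in the independent claims can be sketched as follows. The claims do not say how the stationary point is detected or how far from it the boundary is placed, so everything below is an assumption made for illustration: the frame-wise peak envelope, the relative tolerance `tol`, the fixed `offset` in samples, and the function names themselves.

```python
from typing import List, Sequence


def frame_peaks(samples: Sequence[float], frame: int) -> List[float]:
    """Peak absolute amplitude of each full, non-overlapping frame."""
    return [max(abs(s) for s in samples[i:i + frame])
            for i in range(0, len(samples) - frame + 1, frame)]


def stationary_point(samples: Sequence[float], frame: int, tol: float,
                     vowel_at_end: bool) -> int:
    """Estimate the sample index separating the roughly constant-amplitude
    region from the region where the amplitude varies (assumed heuristic)."""
    peaks = frame_peaks(samples, frame)
    if not peaks:
        raise ValueError("segment is shorter than one frame")
    if vowel_at_end:
        # Vowel closes the segment: the constant region is at the end,
        # so walk backward while the envelope stays close to the last frame.
        ref, k = peaks[-1], len(peaks) - 1
        while k > 0 and abs(peaks[k - 1] - ref) <= tol * ref:
            k -= 1
        return k * frame
    # Vowel opens the segment: the constant region is at the start,
    # so walk forward while the envelope stays close to the first frame.
    ref, k = peaks[0], 0
    while k + 1 < len(peaks) and abs(peaks[k + 1] - ref) <= tol * ref:
        k += 1
    return (k + 1) * frame


def designate_boundary(stationary: int, offset: int, vowel_start: int,
                       vowel_end: int, vowel_at_end: bool) -> int:
    """Earlier than the stationary point for a vowel-final segment,
    later than it for a vowel-initial segment."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    boundary = stationary - offset if vowel_at_end else stationary + offset
    if not vowel_start < boundary < vowel_end:
        raise ValueError("boundary falls outside the vowel")
    return boundary
```

In both cases the boundary lands inside the region where the amplitude is still changing, which is consistent with the claim language: a vowel-final segment is cut before the amplitude settles, and a vowel-initial segment is picked up after the amplitude has started to change.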
Specification