Voice synthesis apparatus and method

US 7,552,052 B2
Filed: 07/13/2005
Issued: 06/23/2009
Est. Priority Date: 07/15/2004
Status: Active Grant

Abstract

A plurality of voice segments, each including one or more phonemes, are acquired in a time-serial manner in correspondence with desired singing or speaking words. As necessary, a boundary is designated between the start and end points of a vowel phoneme included in any one of the acquired voice segments. Voice is synthesized for the region of the vowel phoneme that precedes the designated boundary, or for the region of the vowel phoneme that succeeds it. Synthesizing a voice for the region preceding the designated boundary makes it possible to imitate a vowel sound that a person utters and then stops sounding with the mouth kept open. Synthesizing a voice for the region succeeding the designated boundary makes it possible to imitate a vowel sound that starts sounding with the mouth open.
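
The region selection described in the abstract can be illustrated with a minimal sketch. It assumes the vowel phoneme is available as a sequence of frames and that the boundary is given as a frame index; the function name `select_region` and the `keep` argument are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def select_region(vowel_frames: Sequence[float], boundary: int, keep: str) -> List[float]:
    """Return the part of a vowel phoneme on one side of a designated boundary.

    vowel_frames: hypothetical per-frame data for the vowel phoneme.
    boundary:     frame index strictly between the start and end points.
    keep:         "preceding" for a segment that ends in a vowel,
                  "succeeding" for a segment that starts with a vowel.
    """
    if not 0 < boundary < len(vowel_frames):
        raise ValueError("boundary must lie between the start and end points")
    if keep == "preceding":
        return list(vowel_frames[:boundary])
    if keep == "succeeding":
        return list(vowel_frames[boundary:])
    raise ValueError("keep must be 'preceding' or 'succeeding'")
```

For example, `select_region(frames, 2, "preceding")` keeps only the frames before index 2, which corresponds to cutting the vowel off before it has fully developed.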

9 Claims

1. A voice synthesis apparatus comprising:
- a voice segment acquisition section that acquires a voice segment including one or more phonemes;
- a boundary designation section that designates a boundary intermediate between start and end positions of a vowel phoneme included in the voice segment acquired by the voice segment acquisition section,
  - wherein when the acquired voice segment where a region including an end point is a vowel phoneme, the boundary designation section designates, as the boundary, a time point earlier than a stationary point, which is a boundary point between a region where a waveform amplitude of the voice segment is substantially constant and a region where the waveform amplitude of the voice segment varies, and
  - wherein when the acquired voice segment where a region including a start point is a vowel phoneme, the boundary designation section designates, as the boundary, a time point later than the stationary point; and
- a voice synthesis section that synthesizes a voice based on a region of the vowel phoneme that precedes the designated boundary of the vowel phoneme, or a region of the vowel phoneme that succeeds the designated boundary of the vowel phoneme,
  - wherein the start point and the end point of the vowel phoneme and the designated boundary of the vowel phoneme are time points on a time axis of the acquired voice segment,
  - wherein when the acquired voice segment where the region including the end point is a vowel phoneme, the voice synthesis section synthesizes the voice based on the region of the voice segment preceding the boundary designated by the boundary designation section, and
  - wherein when the acquired voice segment where the region including the start point is a vowel phoneme, the voice synthesis section synthesizes the voice based on the region of the voice segment succeeding the boundary designated by the boundary designation section.

2. A voice synthesis apparatus as claimed in claim 1, wherein:
- the acquired voice segment includes a first voice segment where the region including the end point is a vowel phoneme, and a second voice segment following the first voice segment where the region of the start point is a vowel phoneme,
- for each of the first and second voice segments, the boundary designation section designates the boundary in the vowel phoneme, and
- the voice synthesis section synthesizes voices for the region of the first voice segment preceding the boundary designated by the boundary designation section, and for the region of the second voice segment succeeding the designated boundary.

3. A voice synthesis apparatus as claimed in claim 1, wherein:
- the voice segment is divided into a plurality of frames, and
- the voice synthesis section interpolates between the frame of a first voice segment immediately preceding the boundary designated by the boundary designation section and the frame of a second voice segment immediately succeeding the boundary designated by the boundary designation section, to thereby generate a voice for a gap between the frames.

4. A voice synthesis apparatus as claimed in claim 1, further comprising a time data acquisition section that acquires time data designating a duration time length of the voice, and wherein the boundary designation section designates the boundary in the vowel phoneme, included in the voice segment, at a time point corresponding to the duration time length designated by the time data.

5. A voice synthesis apparatus as claimed in claim 4, wherein:
- when the acquired voice segment where the region including the end point is a vowel phoneme, the boundary designation section designates the boundary at a time point, in the vowel phoneme included in the voice segment, closer to the end point as a longer time length is designated by the time data, and
- the voice synthesis section synthesizes the voice based on a region of the vowel phoneme that precedes the designated boundary in said vowel phoneme.

6. A voice synthesis apparatus as claimed in claim 4, wherein:
- when the acquired voice segment where the region including the start point is a vowel phoneme, the boundary designation section designates the boundary at a time point, in the vowel phoneme included in the voice segment, closer to the start point as a longer time length is designated by the time data, and
- the voice synthesis section synthesizes the voice based on a region of the vowel phoneme that succeeds the designated boundary in the vowel phoneme.

7. A voice synthesis apparatus as claimed in claim 1, further comprising an input section that receives a parameter input thereto, and wherein the boundary designation section designates the boundary at a time point, of the vowel phoneme included in the voice segment acquired by the phoneme acquisition section, corresponding to the parameter input to the input section.
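
The frame interpolation of claim 3 and the duration-dependent boundary of claims 4 to 6 can be sketched as follows. This is only an illustration under stated assumptions: the claims require that a longer duration moves the boundary closer to the end point (claim 5) or the start point (claim 6), but they do not prescribe a mapping, so the linear mapping, the `max_duration` normalizer, and the linear cross-fade between frames are all choices made here. The names `boundary_from_duration` and `interpolate_gap` are hypothetical.

```python
from typing import List


def boundary_from_duration(lo: float, hi: float, duration: float,
                           max_duration: float, vowel_at_end: bool) -> float:
    """Map a designated duration to a boundary time inside the range [lo, hi].

    lo, hi: assumed limits of the allowed boundary range on the segment's time axis.
    A longer duration moves the boundary toward hi when the vowel is at the end
    of the segment, and toward lo when the vowel is at the start.
    """
    if not lo < hi or max_duration <= 0:
        raise ValueError("need lo < hi and max_duration > 0")
    frac = min(max(duration / max_duration, 0.0), 1.0)  # clamp to [0, 1]
    if vowel_at_end:
        return lo + frac * (hi - lo)
    return hi - frac * (hi - lo)


def interpolate_gap(last_frame: List[float], first_frame: List[float],
                    n_gap: int) -> List[List[float]]:
    """Generate n_gap frames between two frames by linear interpolation.

    last_frame:  frame of the first segment immediately preceding its boundary.
    first_frame: frame of the second segment immediately succeeding its boundary.
    """
    if len(last_frame) != len(first_frame):
        raise ValueError("frames must have the same number of parameters")
    gap = []
    for k in range(1, n_gap + 1):
        w = k / (n_gap + 1)  # weight of the second segment's frame
        gap.append([(1 - w) * a + w * b for a, b in zip(last_frame, first_frame)])
    return gap
```

With these definitions, a sustained note would first be trimmed at the two boundaries and the time between them filled with the frames returned by `interpolate_gap`.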

8. A computer-readable storage section storing a computer program executable by a computer for synthesizing a voice, the computer program including computer executable instructions for:
- acquiring a voice segment including one or more phonemes;
- designating a boundary intermediate between start and end positions of a vowel phoneme included in the voice segment acquired in the voice segment acquiring instruction,
  - wherein when the acquired voice segment where a region including an end point is a vowel phoneme, the boundary designating instruction designates, as the boundary, a time point earlier than a stationary point, which is a boundary point between a region where a waveform amplitude of the voice segment is substantially constant and a region where the waveform amplitude of the voice segment varies, and
  - wherein when the acquired voice segment where a region including a start point is a vowel phoneme, the boundary designating instruction designates, as the boundary, a time point later than the stationary point; and
- synthesizing a voice based on a region of the vowel phoneme that precedes the designated boundary of the vowel phoneme, or a region of the vowel phoneme that succeeds the designated boundary of the vowel phoneme,
  - wherein the start point and the end point of the vowel phoneme and the designated boundary of the vowel phoneme are time points on a time axis of the acquired voice segment,
  - wherein when the acquired voice segment where the region including the end point is a vowel phoneme, the voice synthesizing instruction instructs to synthesize the voice based on the region of the voice segment preceding the boundary designated by the boundary designating instruction, and
  - wherein when the acquired voice segment where the region including the start point is a vowel phoneme, the voice synthesizing instruction instructs to synthesize the voice based on the region of the voice segment succeeding the boundary designated by the boundary designating instruction.

9. A voice synthesis method for synthesizing a voice using a voice synthesizing apparatus comprising a voice segment acquisition section, a boundary designation section, and a voice synthesis section, the method comprising the steps of:
- acquiring a voice segment including one or more phonemes with the voice segment acquisition section;
- designating a boundary intermediate between start and end positions of a vowel phoneme included in the voice segment acquired in the voice segment acquiring step with the boundary designation section,
  - wherein when the acquired voice segment where a region including an end point is a vowel phoneme, the boundary designating step designates, as the boundary, a time point earlier than a stationary point, which is a boundary point between a region where a waveform amplitude of the voice segment is substantially constant and a region where the waveform amplitude of the voice segment varies, and
  - wherein when the acquired voice segment where a region including a start point is a vowel phoneme, the boundary designating step designates, as the boundary, a time point later than the stationary point; and
- synthesizing a voice based on a region of the vowel phoneme that precedes the designated boundary of the vowel phoneme, or a region of the vowel phoneme that succeeds the designated boundary of the vowel phoneme with the voice synthesis section,
  - wherein the start point and the end point of the vowel phoneme and the designated boundary of the vowel phoneme are time points on a time axis of the acquired voice segment,
  - wherein when the acquired voice segment where the region including the end point is a vowel phoneme, the voice synthesizing step synthesizes the voice based on the region of the voice segment preceding the boundary designated in the boundary designating step, and
  - wherein when the acquired voice segment where the region including the start point is a vowel phoneme, the voice synthesizing step synthesizes the voice based on the region of the voice segment succeeding the boundary designated in the boundary designating step.
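
The boundary designation recited in the independent claims can be sketched roughly as below. The claims define the stationary point as the boundary between a region of substantially constant waveform amplitude and a region where the amplitude varies, but they do not say how it is located. The sketch therefore assumes a per-frame amplitude envelope for the vowel and an absolute tolerance `tol` for "substantially constant"; both, along with the fixed `offset` in frames and the function names, are hypothetical.

```python
from typing import Sequence


def find_stationary_point(env: Sequence[float], vowel_at_end: bool,
                          tol: float = 0.05) -> int:
    """Return the frame index where the constant-amplitude region meets the varying one.

    env: assumed per-frame amplitude envelope of the vowel phoneme.
    For a segment ending in a vowel, the constant region is taken to be the tail;
    for a segment starting with a vowel, it is taken to be the head.
    """
    if vowel_at_end:
        ref = env[-1]
        i = len(env) - 1
        # Walk backward while the amplitude stays within tol of the final value.
        while i > 0 and abs(env[i - 1] - ref) <= tol:
            i -= 1
        return i
    ref = env[0]
    i = 0
    # Walk forward while the amplitude stays within tol of the initial value.
    while i < len(env) - 1 and abs(env[i + 1] - ref) <= tol:
        i += 1
    return i


def designate_boundary(env: Sequence[float], vowel_at_end: bool,
                       offset: int = 1, tol: float = 0.05) -> int:
    """Place the boundary earlier than the stationary point (vowel at end)
    or later than it (vowel at start), strictly inside the vowel."""
    sp = find_stationary_point(env, vowel_at_end, tol)
    boundary = sp - offset if vowel_at_end else sp + offset
    if not 0 < boundary < len(env) - 1:
        raise ValueError("no intermediate boundary available for this envelope")
    return boundary
```

The voice would then be synthesized from `env[:boundary]` for a segment ending in a vowel, and from `env[boundary:]` for a segment starting with one.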

Current Assignee: Yamaha Corporation
Original Assignee: Yamaha Corporation
Inventors: Kemmochi, Hideki
Primary Examiner(s): ABEBE, DANIEL DEMELASH
Application Number: US11/180,108
Publication Number: US 20060015344A1
Time in Patent Office: 1,441 Days
Field of Search: 704/258, 704/260, 704/265, 704/267, 704/268, 704/269, 704/E13.001, 704/E13.002, 704/E13.004, 704/E13.005, 704/E13.008, 704/E13.011, 704/E13.014
US Class Current: 704/258
CPC Class Codes:
G10L 13/033   Voice editing, e.g. manipul...
G10L 13/04   Details of speech synthesis...
G10L 13/06   Elementary speech units use...
