Device and method for synthesizing speech

US 6,975,987 B1
Filed: 10/04/2000
Issued: 12/13/2005
Est. Priority Date: 10/06/1999
Status: Expired due to Fees

Abstract

The present invention provides pitch conversion processing technology capable of minimizing the loss of naturalness in synthesized speech. A speech waveform in a pitch unit is considered to be divided into two segments: 1) the segment β, which starts from the minus peak and in which the waveform depending on the shape of the vocal tract appears, and 2) the segment γ, in which the waveform depending on the vocal tract shape attenuates and converges on the next minus peak. In addition, α is the point where a minus peak appears along with the glottal closure. Based on these characteristics of speech waveforms, the present invention converts pitch by processing the waveform in the segment γ just before the next minus peak, which is least affected by the minus peak associated with the glottal closure. Waveform processing can thus be performed while keeping the complete contour of the waveform around the peak, thereby reducing the adverse effects of pitch conversion.
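
The segmentation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `find_minus_peaks` and `split_period`, the half-depth threshold used to detect minus peaks, and the `gamma_fraction` parameter that fixes where γ begins are all assumptions made for illustration, since the text above does not state how the β/γ boundary is located.

```python
def find_minus_peaks(samples, min_period):
    """Return indices of minus peaks (alpha points) at least min_period apart.

    Assumption: a minus peak is a local minimum at least half as deep as
    the deepest trough in the signal.
    """
    floor = 0.5 * min(samples)
    peaks = []
    for i in range(1, len(samples) - 1):
        is_local_min = samples[i] <= samples[i - 1] and samples[i] < samples[i + 1]
        if not (is_local_min and samples[i] <= floor):
            continue
        if peaks and i - peaks[-1] < min_period:
            # Too close to the previous peak: keep whichever is deeper.
            if samples[i] < samples[peaks[-1]]:
                peaks[-1] = i
        else:
            peaks.append(i)
    return peaks


def split_period(samples, alpha, next_alpha, gamma_fraction=0.3):
    """Split the pitch period [alpha, next_alpha) into (beta, gamma).

    Assumption: gamma is the final gamma_fraction of the period, that is,
    the part just before the next minus peak.
    """
    length = next_alpha - alpha
    gamma_start = next_alpha - max(1, int(length * gamma_fraction))
    return samples[alpha:gamma_start], samples[gamma_start:next_alpha]
```

Given two consecutive minus peaks, `split_period` returns the β samples (starting at α) and the γ samples that a pitch converter of this kind would be allowed to modify.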

12 Claims

1. A speech synthesis device comprising:
- speech database storing means for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information;
  
  speech waveform composing means for dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized, for obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database, and for generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units; and
  
  analog converting means for converting the speech waveform data received from the speech waveform composing means into analog signals;
  
  wherein the speech waveform composing means comprises pitch converting means for converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data, at said segment the speech waveform being depending on vocal tract shape and being attenuating and converging on the minus peak.
- Dependent claims (2, 3, 4, 5):
  - 2. The speech synthesis device of claim 1, wherein, within the segment in which the waveform is converging on the minus peak, a largest processing value is provided at around a zero crossing point and a smaller value is provided at a point farther from the zero crossing point.
  - 3. The speech synthesis device of claim 1, wherein pitch is one of shortened and lengthened by one of compressing and extending, respectively, the waveform along a time axis in the segment in which the waveform is converging on the minus peak.
  - 4. The speech synthesis device of claim 1, wherein waveform processing at around a zero crossing point is performed within the segment in which the waveform is converging on the minus peak.
  - 5. The speech synthesis device of claim 1, wherein waveform processing at around a zero crossing point is performed by one of inserting a substantial zero value segment to lengthen pitch and eliminating a substantial zero value segment to shorten pitch.
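
Claim 5 describes lengthening or shortening pitch by inserting or eliminating a near-zero segment around a zero crossing. The following is a rough Python sketch of that idea under stated assumptions, not the patented implementation: `change_pitch_length`, `gamma_start` (the index where the converging segment begins) and the tolerance `tol` that decides what counts as "substantially zero" are hypothetical.

```python
def change_pitch_length(period, delta, gamma_start, tol=0.05):
    """Lengthen (delta > 0) or shorten (delta < 0) one pitch period.

    period      -- samples of one pitch period, starting at a minus peak
    delta       -- number of samples to add (positive) or remove (negative)
    gamma_start -- assumed index where the converging segment begins
    tol         -- assumed fraction of the peak amplitude treated as zero
    """
    # Find the last zero crossing that lies entirely inside the gamma segment.
    zc = None
    for i in range(len(period) - 1, gamma_start, -1):
        if period[i - 1] * period[i] <= 0:
            zc = i
            break
    if zc is None:
        raise ValueError("no zero crossing in the gamma segment")

    if delta >= 0:
        # Lengthen: insert zero-valued samples at the crossing.
        return period[:zc] + [0.0] * delta + period[zc:]

    # Shorten: remove samples centred on the crossing, but only if they
    # are all close to zero.
    n = -delta
    start = max(gamma_start, zc - n // 2)
    end = start + n
    if end > len(period):
        raise ValueError("gamma segment too short to remove that many samples")
    limit = tol * max(abs(s) for s in period)
    if any(abs(s) > limit for s in period[start:end]):
        raise ValueError("samples to remove are not substantially zero")
    return period[:start] + period[end:]
```

Because the edit happens only where the waveform is already near zero, the samples around the minus peak at the start of the period are returned unchanged in both directions.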

6. A computer-readable storing medium storing a program for executing pitch conversion using a computer having speech database storing means for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information, the program comprising the step of:
- dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized,
  
  obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database,
  
  converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data, at said segment the speech waveform being depending on vocal tract shape and being attenuating and converging on the minus peak, and
  
  generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units.
- Dependent claims (7, 8, 9):
  - 7. The storing medium of claim 6, wherein, within the segment in which waveform is converging on the minus peak, a largest processing value is provided at around a zero crossing point and a smaller value is provided at a point farther from the zero crossing point.
  - 8. The storing medium of claim 6, wherein pitch is one of shortened and lengthened by one of compressing and extending, respectively, the waveform along a time axis in the segment in which the waveform is converging on the minus peak.
  - 9. The storing medium of claim 6, wherein waveform processing at around a zero crossing point is performed within the segment in which the waveform is converging on the minus peak.
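
Claims 3 and 8 describe compressing or extending the waveform along the time axis within the converging segment only. One possible way to do this is linear-interpolation resampling, shown in the Python sketch below. The choice of linear interpolation, the helper names `resample_linear` and `stretch_gamma`, and the `gamma_start` index are assumptions for illustration; the claims do not specify an interpolation method.

```python
def resample_linear(segment, new_len):
    """Resample segment to new_len samples by linear interpolation.

    The first and last samples are preserved so the segment still joins
    its neighbours at the same values.
    """
    if new_len < 2 or len(segment) < 2:
        raise ValueError("need at least two samples on both sides")
    scale = (len(segment) - 1) / (new_len - 1)
    out = []
    for k in range(new_len):
        pos = k * scale
        i = min(int(pos), len(segment) - 2)
        frac = pos - i
        out.append(segment[i] * (1 - frac) + segment[i + 1] * frac)
    return out


def stretch_gamma(period, gamma_start, new_period_len):
    """Change the period length by time-scaling only the gamma segment.

    Samples before gamma_start (the minus peak and the beta segment) are
    copied unchanged.
    """
    beta, gamma = period[:gamma_start], period[gamma_start:]
    return beta + resample_linear(gamma, new_period_len - len(beta))
```

A `new_period_len` larger than the original extends the segment and lengthens the pitch period, while a smaller one compresses it and shortens the period.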

10. A speech synthesis device comprising:
- speech database storing means for storing a speech database having several sample speech waveform data with various pitch lengths for each speech unit and phonetic information associated with the sample waveform data;
  
  speech waveform composing means for dividing phonetic information into speech units upon receiving phonetic information of speech sound to be synthesized, for obtaining a desirable sample speech waveform data from among the sample speech waveform data corresponding to the divided phonetic information in a speech unit in the speech database, and for generating speech waveform data to be composed by means of concatenating the obtained sample speech waveform data in speech units; and
  
  analog converting means for converting the speech waveform data received from the speech waveform composing means into analog signals;
  
  wherein the speech database is constructed of several sample speech waveform data with various pitch lengths prepared by modifying a contour of a waveform in a segment in which the waveform is converging on the minus peak during a periodical unit of speech waveform data.
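
Claim 10 moves the pitch modification offline: the database holds several waveforms per speech unit, each prepared with a different pitch length, and synthesis selects a suitable one and concatenates. The Python sketch below shows one possible lookup. It assumes a hypothetical layout in which the database is a dictionary mapping each speech unit to a dictionary keyed by pitch length in samples, and it assumes that "desirable" means closest to a target pitch length; neither detail comes from the claim.

```python
def pick_waveform(database, unit, target_pitch_len):
    """Return the stored waveform whose pitch length is closest to the target.

    database -- assumed layout: {unit: {pitch_length: [samples, ...]}}
    """
    candidates = database[unit]
    best = min(candidates, key=lambda p: abs(p - target_pitch_len))
    return candidates[best]


def synthesize(database, units, target_pitch_len):
    """Concatenate the selected waveform for each speech unit in order."""
    out = []
    for unit in units:
        out.extend(pick_waveform(database, unit, target_pitch_len))
    return out
```

For example, with variants of pitch length 4 and 6 stored for one unit, a target of 6 selects the second variant, and no waveform processing is needed at synthesis time.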

11. A computer-readable storing medium storing a program for executing speech synthesis by means of a computer using a speech database, the program comprising the steps of:
- receiving phonetic information of speech sound to be synthesized and dividing phonetic information into speech units;
  
  obtaining a desirable sample speech waveform data from among sample speech waveform data corresponding to the divided phonetic information in a speech unit in the speech database; and
  
  generating speech waveform data to be composed by means of concatenating the obtained sample speech waveform data in speech units;
  
  wherein the speech database is constructed of several sample speech waveform data with various pitch lengths prepared by modifying a contour of a waveform in a segment in which the waveform is converging on a minus peak during a periodical unit of speech waveform data.

12. A method of pitch conversion for speech waveform, the method comprising the steps of:
- preparing a speech database for storing sample waveform data in a speech unit and a speech database created by way of associating the sample sound waveform data with their corresponding phonetic information,
  
  dividing phonetic information into speech units upon receiving the phonetic information of speech sound to be synthesized,
  
  obtaining sample speech waveform data corresponding to the each phonetic information in a speech unit from the speech database,
  
  converting pitch by means of processing a segment of a waveform in which the waveform is converging on a segment just before a minus peak during a periodical unit of speech waveform data, at said segment the speech waveform being depending on vocal tract shape and being attenuating and converging on the minus peak, and
  
  generating speech waveform data to be composed by means of concatenating the sample speech waveform data in speech units.

Current Assignee: Arcadia Group Limited (Taveta Limited)
Original Assignee: Arcadia Group Limited (Taveta Limited)
Inventors: Tenpaku, Seiichi; Hirai, Toshio
Primary Examiner(s): Lerner, Martin
Application Number: US09/678,544
Time in Patent Office: 1,896 Days
Field of Search: 704/205, 704/207, 704/258, 704/260, 704/265, 704/267, 704/268, 704/269
US Class Current: 704/258
CPC Class Codes: G10L 13/033 Voice editing, e.g. manipul...