Speech segment coding and pitch control methods for speech synthesis systems

US 5,617,507 A
Filed: 07/14/1994
Issued: 04/01/1997
Est. Priority Date: 11/06/1991
Status: Expired due to Fees

Abstract

The present invention relates to a method and system for synthesizing speech using a periodic waveform decomposition and relocation coding scheme. Under this scheme, the voiced intervals of the original speech are decomposed into wavelets, each of which corresponds to the one-period speech waveform produced by a single glottal pulse. The wavelets are individually coded and stored. For synthesis, the stored wavelets nearest to the positions where wavelets are to be placed are selected and decoded. The decoded wavelets are then superposed on one another, so that the original sound quality is maintained while the duration and pitch frequency of a speech segment can be controlled arbitrarily.
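The selection, relocation, and superposition described in the abstract can be illustrated with a minimal sketch. It assumes the wavelets have already been decoded into plain sample lists; the function name `synthesize`, its parameters, and the toy wavelets are hypothetical and are not taken from the patent.

```python
def synthesize(wavelets, pulse_locations, target_times, length):
    """Overlap-add stored wavelets at new pitch-pulse time points.

    wavelets        -- list of (samples, pulse_offset): decoded wavelet samples
                       and the index of the pitch pulse inside those samples
    pulse_locations -- original pitch-pulse time (in samples) of each wavelet
    target_times    -- desired pitch-pulse times (in samples) for the output
    length          -- number of output samples
    """
    out = [0.0] * length
    for t in target_times:
        # Select the stored wavelet whose original pulse location is nearest.
        k = min(range(len(wavelets)), key=lambda i: abs(pulse_locations[i] - t))
        samples, offset = wavelets[k]
        # Relocate: align the wavelet's pitch pulse with the target time.
        start = t - offset
        for n, value in enumerate(samples):
            idx = start + n
            if 0 <= idx < length:
                out[idx] += value  # superpose overlapping wavelets
    return out


# Toy data (illustrative only): two short decaying wavelets.
toy_wavelets = [([1.0, 0.5, 0.25], 0), ([2.0, 1.0, 0.5], 0)]
toy_locations = [0, 10]
result = synthesize(toy_wavelets, toy_locations, target_times=[0, 2, 9], length=12)
```

Spacing the target times more closely than the original pulse locations raises the pitch, and reusing or skipping wavelets changes the duration; in both cases each wavelet keeps its own shape.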

324 Citations

8 Claims

1. A speech coding method for use in speech synthesis, comprising:
- obtaining a set of spectral envelope parameters that represents an estimated spectral envelope of a voiced speech signal by using a spectrum estimation technique;
  
  deconvolving said voiced speech signal, with an impulse response that is a time-domain representation of said estimated spectral envelope of said voiced speech signal, into a pitch pulse train signal having a sequence of periodically located pitch pulses;
  
  forming an excitation signal by appending zero-valued samples to each pitch pulse signal of one period such that one pitch pulse is contained in each period;
  
  convolving said excitation signal with said impulse response into wavelets;
  
  obtaining wavelet codes by coding the wavelets of all periods; and
  
  storing in memory wavelet codes and information of corresponding pitch pulse locations of all wavelets, for use in speech synthesis.
  - 2. A speech synthesis method in a speech synthesis system which uses the speech coding method of claim 1, comprising:
    - determining appropriate time points which represent a desired pitch pattern;
      
      selecting from all wavelet codes a wavelet code whose pitch pulse location is nearest to each of said time points;
      
      obtaining a wavelet signal by decoding each selected wavelet code;
      
      localizing said wavelet signal so that the pitch pulse location of said wavelet signal coincides with said time point; and
      
      superposing all of said localized wavelet signals, thereby obtaining a synthetic speech.
  - 3. The speech coding method of claim 1 wherein a wavelet code is formed by mating information obtained by coding said pitch pulse signal of one period, with information obtained by coding a set of said spectral envelope parameters of the same period as the one period of said pitch pulse signal.
  - 4. A speech synthesis method in a speech synthesis system which uses the speech coding method of claim 3, comprising:
    - determining appropriate time points which represent a desired pitch pattern;
      
      selecting from all wavelet codes a wavelet code whose pitch pulse location is nearest to each of said time points;
      
      decoding a coded pitch pulse signal and a set of coded spectral envelope parameters of each selected wavelet code;
      
      forming an excitation signal by appending zero-valued samples after each decoded pitch pulse signal;
      
      obtaining a wavelet signal by convolving said excitation signal with an impulse response which is a time-domain representation of a set of said decoded spectral envelope parameters;
      
      localizing said wavelet signal so that pitch pulse location of said wavelet signal coincides with said time point; and
      
      superposing all of said localized wavelet signals, thereby obtaining a synthetic speech.
  - 5. A speech synthesis method in a speech synthesis system which uses the speech coding method of claim 3, comprising:
    - determining appropriate time points which represent a desired pitch pattern;
      
      selecting from all wavelet codes a wavelet code whose pitch pulse location is nearest to each of said time points;
      
      decoding a coded pitch pulse signal and a set of coded spectral envelope parameters in each selected wavelet code;
      
      localizing said decoded pitch pulse signal so that the pitch pulse location of said decoded pitch pulse signal coincides with said time point;
      
      forming an excitation signal by superposing all of said localized pitch pulse signals; and
      
      convolving said excitation signal with an impulse response which is a time-domain representation of a set of said decoded spectral envelope parameters, thereby obtaining a synthetic speech.
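Claims 4 and 5 differ in the order of operations: claim 4 convolves each pitch pulse signal into a wavelet and then superposes the wavelets, while claim 5 superposes the pitch pulse signals into an excitation and then convolves. The sketch below compares the two orders under the simplifying assumption that one fixed impulse response is used for every period, in which case linearity of convolution makes the results identical. With per-period spectral envelope parameters, as the claims allow, the two orders are not guaranteed to agree. All names and numeric values are hypothetical.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def place(signal, start, length):
    """Return a zero signal of `length` with `signal` written at `start`."""
    out = [0.0] * length
    for n, value in enumerate(signal):
        if 0 <= start + n < length:
            out[start + n] += value
    return out


def add(a, b):
    """Sample-wise sum of two equal-length sequences."""
    return [u + v for u, v in zip(a, b)]


def synth_claim4_order(pulses, times, h, length):
    """Convolve each pulse into a wavelet, localize it, then superpose."""
    out = [0.0] * (length + len(h) - 1)
    for pulse, t in zip(pulses, times):
        out = add(out, place(convolve(pulse, h), t, len(out)))
    return out


def synth_claim5_order(pulses, times, h, length):
    """Localize the pulses, superpose them into one excitation, convolve once."""
    excitation = [0.0] * length
    for pulse, t in zip(pulses, times):
        excitation = add(excitation, place(pulse, t, length))
    return convolve(excitation, h)


h = [1.0, 0.5, 0.25]                 # hypothetical fixed impulse response
pulses = [[1.0, -0.5], [2.0, 1.0]]   # hypothetical decoded pitch pulse signals
times = [0, 3]                       # target time points, in samples
a = synth_claim4_order(pulses, times, h, length=8)
b = synth_claim5_order(pulses, times, h, length=8)
```

In this toy case the second wavelet overlaps the tail of the first, so the comparison also exercises the superposition step.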

6. A speech coding method for use in speech synthesis, comprising:
- obtaining a set of spectral envelope parameters of a voice speech signal by spectrum estimation;
  
  deconvolving the voice speech signal, with an impulse response that is representative of the spectral envelope parameters set of the voice speech signal, into a pitch pulse train signal having a plurality of pitch pulses;
  
  forming an excitation signal by segmenting the pitch pulse train signal such that one pitch pulse is contained in each period;
  
  convolving the excitation signal with the impulse response into a plurality of wavelets; and
  
  storing the plurality of wavelets for use in speech synthesis.
  - 7. The speech coding method of claim 6 wherein the step of forming an excitation signal further includes the step of appending zero-valued samples to each segmented pitch pulse train signal of one period.
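The linearity of convolution shows why segmenting the pitch pulse train and convolving each segment separately loses no information. This is a sketch under the assumption that a single impulse response $h(n)$ applies to all periods considered. The symbols $s$, $e$, $e_k$, $w_k$, $h$, and $n_k$ are introduced here for illustration and do not appear in the claims.

```latex
\begin{align*}
s(n) &= e(n) * h(n)
  && \text{(voiced signal = pitch pulse train convolved with } h\text{)} \\
e_k(n) &=
  \begin{cases}
    e(n), & n_k \le n < n_{k+1} \\
    0,    & \text{otherwise}
  \end{cases}
  && \text{(one period, zero elsewhere)} \\
e(n) &= \sum_k e_k(n) \\
w_k(n) &= e_k(n) * h(n)
  && \text{(wavelet of period } k\text{)} \\
\sum_k w_k(n) &= \Bigl(\sum_k e_k(n)\Bigr) * h(n) = s(n)
\end{align*}
```

Superposing the wavelets at their original pitch pulse locations therefore returns the original voiced signal, and moving them to new locations changes pitch and duration while each wavelet keeps the spectral envelope it was coded with.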

8. A speech coding method for use in speech synthesis, comprising:
- obtaining a set of spectral envelope parameters of a voice speech signal by spectrum estimation;
  
  deconvolving the voice speech signal, with an impulse response that is representative of the set of spectral envelope parameters, into a pitch pulse train signal having a substantially flat spectral envelope and a sequence of periodically located pitch pulses;
  
  forming an excitation signal by adding zero-valued samples to each pitch pulse train signal of one period such that one pitch pulse is contained in each period;
  
  convolving the excitation signal with the impulse response into wavelets with each wavelet being associated with one pitch pulse; and
  
  storing the wavelets and the locations of the associated pitch pulses in memory for use in speech synthesis.
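The coding steps of claim 8 can be sketched end to end as follows. The sketch assumes linear prediction (autocorrelation method with the Levinson-Durbin recursion) as the spectrum estimation technique, and assumes the pitch pulse locations are already known. Both are choices made for illustration, and every function name is hypothetical. Running the excitation through the all-pole recursion stands in for convolving with the (infinite) impulse response.

```python
def autocorr(x, order):
    """Autocorrelation r[0..order] of a finite sequence."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]


def levinson(r, order):
    """Levinson-Durbin: coefficients of A(z) = 1 + a1 z^-1 + ... + ap z^-p."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a, err = new_a, err * (1.0 - k * k)
    return a


def inverse_filter(x, a):
    """Deconvolve: pass x through A(z) to obtain the pitch pulse train."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]


def allpole_filter(e, a):
    """Pass e through 1/A(z), i.e. convolve with the envelope's impulse response."""
    y = []
    for n in range(len(e)):
        y.append(e[n] - sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0))
    return y


def encode(x, pulse_locations, order=2):
    """Return envelope parameters and a list of (pulse_location, wavelet)."""
    a = levinson(autocorr(x, order), order)
    e = inverse_filter(x, a)
    bounds = list(pulse_locations) + [len(x)]
    stored = []
    for start, end in zip(bounds, bounds[1:]):
        # One period of the pulse train, with zero-valued samples appended.
        excitation = e[start:end] + [0.0] * (len(x) - end)
        stored.append((start, allpole_filter(excitation, a)))
    return a, stored


def superpose(stored, length):
    """Place every wavelet back at its pulse location and sum."""
    out = [0.0] * length
    for start, wavelet in stored:
        for n, value in enumerate(wavelet):
            out[start + n] += value
    return out


def demo_signal(n_samples=200, period=40):
    """Toy voiced signal: an impulse train through a fixed two-pole resonator."""
    x = []
    for n in range(n_samples):
        pulse = 1.0 if n % period == 0 else 0.0
        x.append(pulse + 1.3 * (x[n - 1] if n >= 1 else 0.0)
                 - 0.8 * (x[n - 2] if n >= 2 else 0.0))
    return x


x = demo_signal()
params, stored = encode(x, pulse_locations=[0, 40, 80, 120, 160])
rebuilt = superpose(stored, len(x))
```

With the wavelets left at their original locations, `rebuilt` matches `x` up to rounding error, which is a convenient sanity check before the wavelets are moved to new time points. The sketch requires the first pulse location to be sample 0 and omits the wavelet coding (quantization) step entirely.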

Current Assignee: Korea Telecommunication Authority
Original Assignee: Korea Telecommunication Authority
Inventors: Park, Yong K.; Lee, Chong R.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): CHOWDHURY, INDRINAL
Application Number: US08/275,940
Time in Patent Office: 992 Days
Field of Search: 395/2, 395/2.09, 395/2.91, 395/2.94, 381/29-53
US Class Current: 704/200
CPC Class Codes:
G10L 13/04   Details of speech synthesis...
G10L 19/09   Long term prediction, i.e. ...
G10L 21/04   Time compression or expansion
G10L 25/27   characterised by the analys...
