Method and apparatus for speech analysis and synthesis by sampling a power spectrum of input speech
Abstract
A method for speech analysis and synthesis for obtaining synthesized speech of a high quality includes the steps of determining a short-period power spectrum by performing an FFT operation on a speech wave, sampling the spectrum at the positions corresponding to the multiples of a basic frequency, applying a cosine polynomial model to the thus obtained sample points to determine the spectrum envelope thereat, then calculating the mel cepstrum coefficients from the spectrum envelope, and effecting speech synthesis, utilizing the mel cepstrum coefficients as the filter coefficients in a synthesizing (logarithmic mel spectrum approximation) filter.
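The first three steps of the abstract (FFT power spectrum, sampling at multiples of the basic frequency, cosine model fit) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the Hann window, the FFT size, and the choice to fit the cosine series to the logarithm of the sampled power by ordinary least squares are all assumptions made for the example, and the basic frequency `f0` is taken as already known.

```python
import numpy as np

def harmonic_samples(frame, fs, f0, n_fft=1024):
    """Sample the short-period power spectrum at multiples of f0.

    Hypothetical helper: window and FFT size are illustrative choices.
    """
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    n_harm = int((fs / 2) // f0)                 # harmonics up to Nyquist
    freqs = f0 * np.arange(1, n_harm + 1)        # f0, 2*f0, 3*f0, ...
    bins = np.round(freqs * n_fft / fs).astype(int)
    bins = np.minimum(bins, len(power) - 1)      # stay inside the spectrum
    omega = 2 * np.pi * freqs / fs               # normalized frequency in (0, pi]
    return omega, power[bins]

def fit_cosine_envelope(omega, power, order):
    """Least-squares fit of log power to sum_k a_k * cos(k * omega).

    Assumption: the cosine series models the log power spectrum.
    """
    basis = np.cos(np.outer(omega, np.arange(order + 1)))
    coeffs, *_ = np.linalg.lstsq(basis, np.log(power + 1e-12), rcond=None)
    return coeffs

def envelope(coeffs, omega):
    """Evaluate the fitted power envelope at arbitrary frequencies."""
    basis = np.cos(np.outer(omega, np.arange(len(coeffs))))
    return np.exp(basis @ coeffs)
```

Because the sample points sit on the harmonics, the fitted curve passes near the harmonic peaks and is not pulled down by the valleys between them. The number of sample points shrinks as `f0` rises, so the model order would have to stay below the harmonic count for the fit to remain well determined.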
10 Claims
1. A method for speech analysis and synthesis comprising the steps of:
sampling a short-period power spectrum of speech input into an apparatus with a sampling frequency to obtain sample points, said sampling frequency being controlled so as to trace a basic frequency of input voiced speech;
applying a cosine polynomial model to the thus obtained sample points to determine a spectrum envelope;
calculating mel cepstrum coefficients from the spectrum envelope; and
effecting speech synthesis utilizing the mel cepstrum coefficients as filter coefficients of a mel logarithmic spectrum approximation filter used for speech synthesis.
Dependent claims: 2, 3, 4.
5. A method for speech analysis comprising the steps of:
inputting a speech wave form into an apparatus;
extracting a power spectrum from the speech wave form inputted in said inputting step;
extracting pitch information of the input voiced speech from the power spectrum extracted in said power spectrum extracting step;
sampling the power spectrum extracted in said power spectrum extracting step with a sampling interval to produce sample data, said sampling interval being controlled so as to vary in accordance with a pitch interval of the input voiced speech extracted in said pitch information extracting step;
generating a spectrum envelope from the sample data obtained in said sampling step; and
transmitting the kind of the voiced speech, the pitch information and said spectrum envelope as parameters of the input speech.
6. An apparatus for speech analysis and synthesis comprising:
means for sampling a short-period power spectrum of speech input into said apparatus with a sampling frequency to obtain sample points, said sampling frequency being controlled so as to trace a basic frequency of input voiced speech;
means for applying a cosine polynomial model to the thus obtained sample points to determine a spectrum envelope;
means for calculating mel cepstrum coefficients from the spectrum envelope; and
means for effecting speech synthesis utilizing the mel cepstrum coefficients as filter coefficients of a mel logarithmic spectrum approximation filter used for speech synthesis.
Dependent claims: 7, 8, 9.
10. An apparatus for speech analysis comprising:
means for inputting a speech wave form into an apparatus;
means for extracting a power spectrum from the speech wave form inputted by said inputting means;
means for extracting pitch information of the input voiced speech from the power spectrum extracted by said power spectrum extracting means;
means for sampling the power spectrum extracted by said power spectrum means with a sampling interval to produce sample data, said sampling interval being controlled so as to vary in accordance with a pitch interval of the input voiced speech extracted by said pitch information extracting means;
means for generating a spectrum envelope from the sample data obtained by said sampling means; and
means for transmitting the kind of the voiced speech, the pitch information and said spectrum envelope as parameters of the input speech.
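Claims 5 and 10 derive the pitch information from the power spectrum itself and then let the sampling interval follow the pitch. The claims do not say how the pitch is extracted, so the sketch below swaps in one common technique as an assumption: the autocorrelation is obtained as the inverse FFT of the power spectrum (the Wiener-Khinchin relation), and the strongest peak within a plausible lag range gives the pitch period. The function names and the 60 to 400 Hz search range are hypothetical.

```python
import numpy as np

def pitch_from_power_spectrum(power, fs, fmin=60.0, fmax=400.0):
    """Estimate the basic frequency (Hz) from a one-sided power spectrum.

    Assumes `power` came from np.fft.rfft with an even FFT size, so that
    np.fft.irfft(power) yields the (circular) autocorrelation sequence.
    """
    acf = np.fft.irfft(power)
    lag_min = int(fs / fmax)                          # shortest period considered
    lag_max = min(int(fs / fmin), len(acf) // 2)      # longest period considered
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag

def sampling_interval_bins(f0, fs, n_fft):
    """Spacing between spectrum sample points, in FFT bins, for pitch f0."""
    return f0 * n_fft / fs
```

With a sampling rate of 8000 Hz and a 1024-point FFT, a 200 Hz pitch corresponds to an interval of 25.6 bins, so in practice the sample positions would be rounded or interpolated between bins. Unvoiced frames have no meaningful pitch peak and would need separate handling, which this sketch does not attempt.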
Specification