Phase vocoder speech synthesis system
Abstract
Disclosed is a system for synthesizing speech from stored signals representative of words precoded in accordance with phase vocoder techniques. The stored signals comprise short-time Fourier transform parameters which describe the magnitude and phase derivative of the short-time signal spectrum. Speech synthesis is achieved by extracting the stored signals of chosen words under control of a duration factor signal, by concatenating the extracted signals, by operating on the phase derivative parameters to effect a desired speech pitch change, by interpolating the magnitude parameters of the short-time Fourier transform in response to the pitch and duration changes, and by decoding the resultant signals in accordance with phase vocoder techniques.
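The processing chain in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The frame layout (one list of magnitudes and one list of phase derivatives per frame), the function names, and the simplification that each phase derivative holds the channel's full instantaneous frequency in radians per second are all assumptions made for illustration. Duration control and magnitude interpolation are omitted here.

```python
import math

def concatenate(words):
    """Join the frame lists of the chosen words into one message."""
    return [frame for word in words for frame in word]

def scale_pitch(frames, pitch_factor):
    """Scale every channel's phase derivative; magnitudes are left unchanged."""
    return [(mags, [pitch_factor * w for w in dphis]) for mags, dphis in frames]

def decode(frames, frame_len, sample_rate):
    """Integrate each phase derivative and sum the channel cosines."""
    n_channels = len(frames[0][0])
    phases = [0.0] * n_channels          # running phase of each channel
    samples = []
    for mags, dphis in frames:
        for _ in range(frame_len):       # hold the frame parameters for frame_len samples
            value = 0.0
            for k in range(n_channels):
                phases[k] += dphis[k] / sample_rate
                value += mags[k] * math.cos(phases[k])
            samples.append(value)
    return samples

# Two hypothetical stored words, each a short list of (magnitudes, phase derivatives).
word_a = [([1.0, 0.5], [2 * math.pi * 100, 2 * math.pi * 200])] * 3
word_b = [([0.8, 0.2], [2 * math.pi * 120, 2 * math.pi * 240])] * 2

message = scale_pitch(concatenate([word_a, word_b]), pitch_factor=1.25)
samples = decode(message, frame_len=80, sample_rate=8000)
```

Because the phase is accumulated across frame boundaries, the decoded waveform stays continuous where two words are joined, which is one reason a phase-derivative representation suits concatenation.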
15 Claims
1. Apparatus for synthesizing a natural sounding speech message from phase vocoder stored signals representative of a vocabulary of words comprising:
means for selectively extracting preselected locations of said stored signals for constructing a predetermined sequence of signals representative of said speech message;
means for altering the pitch parameters of said extracted signals; and
means for combining said pitch modified signals.
2. Apparatus for synthesizing a natural sounding speech message comprising:
means for storing phase vocoder signals representative of a vocabulary of words;
first means, responsive to an applied duration control signal, for selectively extracting from said means for storing preselected signals to form a duration modified sequence of signals representative of said speech message;
means for altering the pitch parameters of said extracted signals; and
means for combining said signals modified in pitch and duration to form a sum signal for activating a speech synthesizer.
3. Apparatus for generating natural sounding synthesized speech comprising:
a memory for storing phase vocoder encoded signals representative of a vocabulary of words;
means for extracting signals from selected storage locations of said memory to affect the duration of said synthesized speech;
means for altering the pitch parameters of said extracted signals to affect the pitch of said synthesized speech; and
means for phase vocoder decoding of said altered signals to form said synthesized speech signal.
4. A system for synthesizing speech messages from phase vocoder encoded word signals stored in a memory comprising:
means for extracting selected signals from said memory a repeated number of times to affect the duration of said speech messages;
means for altering the pitch parameters of said extracted signals; and
means for decoding said pitch and duration altered signals to form said speech messages.
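The duration control described in claim 4, reading stored signals a repeated number of times, might be sketched as follows. The function name `stretch` and the index rule are assumptions; the claims do not state how the repeat count is derived from the duration factor. Factors below 1, which skip frames rather than repeat them, are an extension beyond the claim wording.

```python
def stretch(frames, duration_factor):
    """Re-read stored frames so the sequence becomes duration_factor times as long."""
    if duration_factor <= 0:
        raise ValueError("duration_factor must be positive")
    n_out = round(len(frames) * duration_factor)
    last = len(frames) - 1
    # Output frame i is taken from the stored frame at position i / duration_factor.
    return [frames[min(int(i / duration_factor), last)] for i in range(n_out)]

doubled = stretch(["a", "b", "c", "d"], 2.0)   # ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd']
halved = stretch(["a", "b", "c", "d"], 0.5)    # ['a', 'c']
```

Repeating whole frames changes how long each spectral shape is held without touching the phase derivatives, so in this sketch duration and pitch can be adjusted independently.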
5. A system for composing speech messages from phase vocoder encoded and stored words comprising:
means for extracting selected signals from said encoded stored words a repeated number of times to affect the duration of said composed speech;
means for altering the pitch parameters of said extracted signals;
means for interpolating the spectrum parameters of said extracted signals; and
means for decoding said interpolated and pitch altered signals to form a composed speech message signal.
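One way to picture the spectrum interpolation step in claim 5 is given below. The rule used is an assumption, since the claims do not spell out the interpolation rules: after a pitch change, each channel's component is taken to sit at its original frequency multiplied by the pitch factor, and its new magnitude is read from the stored magnitude envelope at that shifted frequency by linear interpolation. The function name and the explicit `freqs` list of channel frequencies are hypothetical.

```python
def interpolate_magnitudes(mags, freqs, pitch_factor):
    """Sample the stored magnitude envelope at the pitch-shifted channel frequencies."""
    def envelope(f):
        # Clamp outside the analyzed band.
        if f <= freqs[0]:
            return mags[0]
        if f >= freqs[-1]:
            return mags[-1]
        # Linear interpolation between the two neighbouring channels.
        for k in range(len(freqs) - 1):
            if freqs[k] <= f <= freqs[k + 1]:
                t = (f - freqs[k]) / (freqs[k + 1] - freqs[k])
                return (1 - t) * mags[k] + t * mags[k + 1]

    return [envelope(pitch_factor * f) for f in freqs]

shifted = interpolate_magnitudes([0.0, 1.0, 0.0], [100.0, 200.0, 300.0], 1.5)
# shifted == [0.5, 0.0, 0.0]
```

The intent of such a rule is to keep the spectral envelope, and with it the formant structure, in place while the harmonics move with the new pitch.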
6. Apparatus for synthesizing natural sounding speech comprising:
a phase vocoder analyzer responsive to an applied vocabulary of words;
means for storing the output signals of said analyzer;
means for extracting the signals of selected storage locations in said means for storing;
means for modifying the pitch parameters of said extracted signals; and
means for converting said pitch modified signals in accordance with phase vocoder techniques to develop a natural sounding speech signal.
7. Apparatus for processing phase vocoder type representations of selected prerecorded spoken words to form a description of a desired message suitable for actuating a speech synthesizer to develop synthesized speech, which comprises:
first means, for encoding said prerecorded words in accordance with phase vocoder techniques to form short-time Fourier transform signal vectors and phase derivative signal vectors;
second means, for storing said phase derivative and said short-time Fourier transform signal vectors;
third means, for extracting selected locations of said stored signals a preselected number of times to control the duration of said synthesized speech;
fourth means, for modifying said phase derivative signal vectors to control the pitch of said synthesized speech;
fifth means, for interpolating the short-time Fourier transform signal vectors in accordance with predetermined rules responsive to an applied duration control signal and to the modified phase derivative signal vectors to effect a smooth spectrum envelope; and
sixth means, for combining said modified phase derivative signal vector and said spectrum interpolated short-time Fourier transform signal vector in accordance with phase vocoder techniques to form a synthesized speech signal suitable for actuating said speech synthesizer.
(Claims 8, 9, and 10 depend on claim 7.)
11. Apparatus for processing phase vocoder type representations of selected prerecorded spoken words to form a description of a desired message suitable for actuating a speech synthesizer to develop synthesized speech, which comprises:
first means, for encoding said prerecorded words in accordance with phase vocoder techniques to form short-time Fourier transform signal vectors and phase derivative signal vectors;
second means, for storing said phase derivative and said short-time Fourier transform signal vectors;
third means, for extracting selected locations of said stored signals a preselected number of times to control the duration of said synthesized speech;
fourth means, for modifying said phase derivative signal vectors to control the pitch of said synthesized speech; and
fifth means, for combining said modified phase derivative signal vector and said duration controlled short-time Fourier transformed signal vector in accordance with phase vocoder techniques to form a synthesized speech signal suitable for actuating said speech synthesizer.
12. A method for synthesizing a natural sounding speech message from phase vocoder stored signals representative of a vocabulary of words comprising the steps of:
selectively extracting preselected locations of said stored signals for the construction of a predetermined sequence of signals representative of said speech message;
altering the pitch parameters of said extracted signals; and
combining said pitch modified signals.
13. A method for synthesizing a natural sounding speech message comprising the steps of:
storing phase vocoder signals representative of a vocabulary of words;
selectively extracting from said stored signals preselected signals forming a duration modified predetermined sequence of signals representative of said speech message;
altering the pitch parameters of said extracted signals; and
combining said pitch and function modified signals to form a sum signal for activating a speech synthesizer.
14. A method for composing speech message from phase vocoder encoded and stored words comprising the steps of:
extracting selected signals from said encoded stored words a repeated number of times to affect the duration of synthesized speech;
altering the pitch parameters of said extracted signals;
interpolating the spectrum parameters of said extracted signal; and
phase vocoder decoding of said interpolated and pitch and duration altered signals to form a speech message signal.
15. A method for processing phase vocoder type representations of selected prerecorded spoken words to form a description of a desired message suitable for actuating a speech synthesizer to develop synthesized speech, which comprises the steps of:
encoding said prerecorded words in accordance with phase vocoder techniques to form short-time Fourier transform signal vectors and phase derivative signal vectors;
storing said phase derivative and said short-time Fourier transform signal vectors;
extracting selected locations of said stored signals a preselected number of times to control the duration of said synthesized speech;
modifying said phase derivative signal vectors to control the pitch of said synthesized speech;
interpolating the short-time Fourier transform signal vectors in accordance with predetermined rules responsive to an applied duration control signal and to the modified phase derivative signal vectors to effect a smooth spectrum envelope; and
combining said modified phase derivative signal vectors and said spectrum interpolated short-time Fourier transform signal vectors in accordance with phase vocoder techniques to form a synthesized speech signal suitable for actuating said speech synthesizer.
Specification