Method and apparatus for text-to-voice audio output with accent control and improved phrase control
Abstract
A text-to-voice audio output unit includes a storage section for storing analyzed information pertaining to words, boundaries between articulations, and accents obtained by analyzing an input character list, a voice synthesis rule section for changing a reduction or damping characteristic of a phrase component of a fundamental frequency of an output voice, and a voice synthesizing section for generating a composite tone based on the analyzed information from the storage section. The reduction or damping characteristic, calculated for each phrase component, is overdamped, critically damped, or underdamped and is based on speech rate, syntactic information, number of articulations, and positional information. When a prosodic phrase is short, the reduction or damping characteristic causes a decrease in the fundamental frequency for a meaningfully-delimited portion, and when a prosodic phrase is long, the reduction or damping characteristic is controlled over the entire prosodic phrase.
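The fundamental-frequency model recited in claim 1 (a phrase component plus an accent component, summed on a logarithmic scale, with a selectable damping characteristic for the phrase component) can be illustrated with a minimal sketch. The closed-form responses below are the textbook impulse and step responses of a second-order linear system with unity DC gain; they are an assumption made for illustration, not equations taken from this patent. The function names, the parameters `omega`, `zeta`, and `beta`, and the command tuples are likewise hypothetical.

```python
import math


def phrase_response(t, omega, zeta):
    """Impulse response of omega^2 / (s^2 + 2*zeta*omega*s + omega^2).

    zeta < 1: underdamped, zeta == 1: critically damped, zeta > 1: overdamped.
    Textbook forms, assumed here; not taken from the patent text.
    """
    if t < 0:
        return 0.0
    if abs(zeta - 1.0) < 1e-9:
        # Critically damped: repeated real pole at -omega.
        return omega * omega * t * math.exp(-omega * t)
    if zeta < 1.0:
        # Underdamped: decaying oscillation at the damped frequency wd.
        wd = omega * math.sqrt(1.0 - zeta * zeta)
        return (omega * omega / wd) * math.exp(-zeta * omega * t) * math.sin(wd * t)
    # Overdamped: two distinct real poles at -(zeta*omega - r) and -(zeta*omega + r).
    r = omega * math.sqrt(zeta * zeta - 1.0)
    slow = math.exp(-(zeta * omega - r) * t)
    fast = math.exp(-(zeta * omega + r) * t)
    return (omega * omega / (2.0 * r)) * (slow - fast)


def accent_response(t, beta):
    """Step response of a critically damped second-order system (assumed form)."""
    if t < 0:
        return 0.0
    return 1.0 - (1.0 + beta * t) * math.exp(-beta * t)


def log_f0(t, base_hz, phrase_cmds, accent_cmds, omega, zeta, beta):
    """Log-scale F0 at time t: baseline + phrase components + accent components.

    phrase_cmds: list of (onset_time, magnitude) impulsive commands.
    accent_cmds: list of (start_time, end_time, amplitude) step commands.
    """
    value = math.log(base_hz)
    for onset, magnitude in phrase_cmds:
        value += magnitude * phrase_response(t - onset, omega, zeta)
    for start, end, amplitude in accent_cmds:
        value += amplitude * (accent_response(t - start, beta)
                              - accent_response(t - end, beta))
    return value


if __name__ == "__main__":
    # Compare how the phrase component alone decays under each damping setting.
    for zeta in (0.6, 1.0, 2.0):
        hz = [math.exp(log_f0(t / 10.0, 120.0, [(0.0, 0.5)], [], 3.0, zeta, 20.0))
              for t in range(0, 21, 5)]
        print(zeta, [round(v, 1) for v in hz])
```

In this sketch a larger `zeta` makes the phrase component decay more slowly after its peak, while `zeta` below 1 lets it swing below the baseline before settling. How the damping characteristic is actually selected from speech rate, syntactic information, number of articulations, and positional information is not modeled here.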
64 Citations
4 Claims
1. An audio output unit for expressing a temporal change pattern of a fundamental frequency of an output voice using a sum of a phrase component corresponding to an intonation of the output voice and an accent component corresponding to a basic accent of the output voice, wherein the temporal change pattern of the fundamental frequency includes linguistic information such as basic accent, emphasis, intonation, and syntax, the phrase component is approximated by a response characteristic of a first secondary linear system to an impulsive phrase command, the accent component is approximated by a response characteristic of a second secondary linear system to a step accent command, and the temporal change pattern of the fundamental frequency is expressed on a logarithmic scale, the audio output unit comprising:
a storage section for storing analyzed information pertaining to an input character list, the analyzed information including a word, a boundary between articulations, and a basic accent;
a voice synthesis rule section including a phrase component characteristic control section for controlling a reduction or damping characteristic of a phrase component of a fundamental frequency in order to control a response characteristic of a first secondary linear system to a phrase command used in calculating the phrase component, the reduction or damping characteristic being any of an underdamped characteristic, a critically-damped characteristic, and an overdamped characteristic, and for generating a temporal change pattern of the fundamental frequency in accordance with the calculated phrase component; and
a voice synthesizing section for generating a composite tone using synthesized waveform data generated in accordance with predetermined phonemic rules from the voice synthesis rule section and the temporal change pattern of the fundamental frequency from the voice synthesis rule section based on the analyzed information from the storage section.
3. A method for outputting a composite tone by expressing a temporal change pattern of a fundamental frequency of an output voice using a sum of a phrase component corresponding to an intonation of the output voice and an accent component corresponding to a basic accent of the output voice, wherein the temporal change pattern of the fundamental frequency includes linguistic information such as basic accent, emphasis, intonation, and syntax, the phrase component is approximated by a response characteristic of a first secondary linear system to an impulsive phrase command, the accent component is approximated by a response characteristic of a second secondary linear system to a step accent command, and the temporal change pattern of the fundamental frequency is expressed on a logarithmic scale, the method comprising the steps of:
storing analyzed information including a word, a boundary between articulations, and a basic accent, wherein the analyzed information is obtained by analyzing an input character list;
changing a reduction or damping characteristic of a phrase component of a fundamental frequency in order to control a response characteristic of a first secondary linear system to a phrase command used in calculating the phrase component, the reduction or damping characteristic being any of an underdamped characteristic, a critically-damped characteristic, and an overdamped characteristic;
generating a temporal change pattern of the fundamental frequency in accordance with the calculated phrase components; and
generating a composite tone using synthesized waveform data generated in accordance with predetermined phonemic rules and the temporal change pattern of the fundamental frequency based on the analyzed information.
Specification