System and method for synthesizing dialog-style speech using speech-act information
Abstract
A system and a method for synthesizing dialog-style speech using speech-act information are provided. For expressions in a dialog text whose intonation must be realized differently depending on the dialog context, intonation-discriminating tags are assigned using speech-act information extracted from the sentences uttered by the two speakers in the dialog. When speech is synthesized, a speech signal whose intonation matches the tag is extracted from a speech DB and used, so that natural and varied intonations appropriate for the dialog flow can be realized. The interactive aspect of a dialog can therefore be expressed well, and the naturalness of the synthesized dialog speech can be expected to improve.
20 Citations
6 Claims
1. A system for synthesizing a dialog-style speech using speech-act information, comprising:
a preprocessing module for performing a normalization of an input sentence in order to preprocess the input sentence;
a linguistic module for performing a morphological tagging operation and a speech-act tagging operation for the preprocess-completed input sentence, discriminating whether a predetermined expression whose intonation should be selectively realized is included in the speech-act tagging-completed input sentence, and performing a tagging operation for the predetermined expression using an intonation tagging table where intonation tags are set so as to correspond to linguistic information extracted from a dialog context including a preceding sentence and a following sentence if the predetermined expression is included in the input sentence;
a prosodic module for giving an intonation;
a unit selector for extracting a marked relevant speech segment appropriate for an intonation tag of the expression in the input sentence; and
a speech generator for connecting a speech segment and another speech segment to generate and output a dialog-style synthesized speech.
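The unit selector and speech generator recited above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the database layout (`UNIT_DB`), the tag names, the fallback to an unmarked segment, and the use of short lists of numbers in place of real waveforms are all hypothetical, and a real unit selector would typically also weigh target and join costs.

```python
# Hypothetical synthesis-unit database: each segment may be marked with an
# intonation tag. Short lists of floats stand in for real waveform samples.
UNIT_DB = [
    {"text": "yes", "tag": "affirm-fall", "samples": [0.1, 0.2]},
    {"text": "yes", "tag": "echo-rise", "samples": [0.3, 0.4]},
    {"text": "yes", "tag": None, "samples": [0.0, 0.0]},
    {"text": "sir", "tag": None, "samples": [0.5]},
]


def select_unit(text, tag=None):
    """Return the segment for `text` whose mark matches `tag`.

    Falling back to an unmarked segment (then to any segment) is an
    assumption of this sketch, not something the claim specifies.
    """
    candidates = [unit for unit in UNIT_DB if unit["text"] == text]
    if not candidates:
        raise KeyError(f"no segment for {text!r}")
    for unit in candidates:
        if unit["tag"] == tag:
            return unit
    for unit in candidates:
        if unit["tag"] is None:
            return unit
    return candidates[0]


def synthesize(plan):
    """Connect the selected segments, in order, into one output sequence."""
    waveform = []
    for text, tag in plan:
        waveform.extend(select_unit(text, tag)["samples"])
    return waveform


output = synthesize([("yes", "echo-rise"), ("sir", None)])
```

Here the same word "yes" yields different audio depending on the intonation tag it carries, which is the behavior the unit selector is claimed to provide.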
3. A method for synthesizing a dialog-style speech using speech-act information, wherein intonation tagging is performed by rules extracted in a statistical way using context information consisting of speech-act information, which is an analysis unit of a dialog, represented in preceding and following utterances, for predetermined words or sentences that have the same form but whose intonations need to be realized differently depending on their meaning, and an intonation appropriate for a meaning and a dialog context is realized using a speech segment appropriate for a relevant tag when a speech is synthesized.
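One simple way to read "rules extracted in a statistical way" is a majority vote over a labeled dialog corpus. The sketch below assumes this reading; the claim does not say which statistical procedure is used. The sample format, the speech-act labels, the tag names, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def build_intonation_table(samples, min_count=1):
    """Map (expression, preceding act, following act) to its most frequent tag.

    `samples` is an iterable of (expression, prev_act, next_act, tag) tuples
    taken from a hand-labeled corpus (hypothetical format).
    """
    counts = defaultdict(Counter)
    for expression, prev_act, next_act, tag in samples:
        counts[(expression, prev_act, next_act)][tag] += 1

    table = {}
    for context, tag_counts in counts.items():
        tag, freq = tag_counts.most_common(1)[0]
        if freq >= min_count:  # drop contexts seen too rarely to trust
            table[context] = tag
    return table


# Hypothetical labeled observations of one expression in different contexts.
samples = [
    ("yes", "ask-confirm", "inform", "affirm-fall"),
    ("yes", "ask-confirm", "inform", "affirm-fall"),
    ("yes", "ask-confirm", "inform", "backchannel-flat"),
    ("yes", "inform", "inform", "backchannel-flat"),
    ("yes", "inform", "ask-ref", "echo-rise"),
]
table = build_intonation_table(samples)
```

The resulting table plays the role of the intonation tagging table: the same surface form maps to different intonation tags depending on the speech acts of the surrounding utterances.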
4. A method for synthesizing a dialog-style speech using speech-act information, comprising the steps of:
(a) performing a morphological tagging operation and a speech-act tagging operation for a preprocess-completed input sentence;
(b) discriminating whether a predetermined expression whose intonation should be selectively realized is included in the speech-act tagging-completed input sentence;
(c) if the predetermined expression is included in the input sentence, performing a tagging operation for the predetermined expression using an intonation tagging table where intonation tags are set so as to correspond to linguistic information extracted from a dialog context including a preceding sentence and a following sentence;
(d) extracting a relevant speech segment from a synthesis unit database (DB) where a speech segment appropriate for an intonation of the tagging-completed predetermined expression is marked; and
(e) connecting a speech segment and another speech segment to generate a dialog-style synthesized speech.
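Steps (b) and (c) above can be sketched as a table lookup keyed on the dialog context. This is a minimal illustration under stated assumptions: the input is assumed to be already morphologically and speech-act tagged (step (a)), the "linguistic information" is assumed to be just the speech acts of the preceding and following sentences, and the table contents, labels, and function names are hypothetical.

```python
# Hypothetical intonation tagging table:
# (expression, preceding speech act, following speech act) -> intonation tag.
INTONATION_TABLE = {
    ("yes", "ask-confirm", "inform"): "affirm-fall",
    ("yes", "inform", "inform"): "backchannel-flat",
}
# The predetermined expressions whose intonation is selectively realized.
TARGET_EXPRESSIONS = {expression for expression, _, _ in INTONATION_TABLE}


def tag_intonation(utterances):
    """Attach intonation tags to target expressions in a tagged dialog.

    `utterances` is a list of (tokens, speech_act) pairs, i.e. the output of
    step (a). Returns a list of (tokens, speech_act, tags) triples.
    """
    tagged = []
    for i, (tokens, act) in enumerate(utterances):
        prev_act = utterances[i - 1][1] if i > 0 else None
        next_act = utterances[i + 1][1] if i + 1 < len(utterances) else None
        tags = {}
        for token in tokens:
            if token in TARGET_EXPRESSIONS:  # step (b): expression present?
                tag = INTONATION_TABLE.get((token, prev_act, next_act))
                if tag is not None:  # step (c): context found in the table
                    tags[token] = tag
        tagged.append((tokens, act, tags))
    return tagged


dialog = [
    (["is", "this", "the", "station"], "ask-confirm"),
    (["yes"], "response"),
    (["the", "train", "leaves", "at", "nine"], "inform"),
]
result = tag_intonation(dialog)
```

In this example the "yes" in the second utterance receives the tag stored for a preceding confirmation question and a following statement; steps (d) and (e) would then use that tag to pick a matching marked segment and connect it with its neighbors.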
Specification