Speech synthesis apparatus and method
Abstract
A speech synthesiser is provided with a dialog-style selection arrangement responsive to a factor affecting intelligibility of speech output by the apparatus to select a dialog style intended to provide at least a minimum level of intelligibility of speech output by the synthesiser. The selected dialog style is used by a speech-application text provider when generating text-form utterances for a current speech application, these text-form utterances then being converted into speech form by a text-to-speech converter. The factor affecting intelligibility may be a measure of the intelligibility of the speech-form output or an environmental factor such as background noise in the user's environment.
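The feedback behaviour summarised in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the three style names, the templates, the 0-to-1 intelligibility scale and the two threshold values are all assumptions chosen for illustration. A falling measure moves the style towards intelligibility, and a rising measure moves it back towards naturalness.

```python
# Hypothetical dialog styles, ordered from most natural to most intelligible.
STYLES = ["natural", "plain", "terse"]

# Hypothetical templates standing in for the speech-application text provider.
TEMPLATES = {
    "natural": "Sure thing, your next meeting kicks off at {time}.",
    "plain": "Your next meeting is at {time}.",
    "terse": "Meeting at {time}.",
}


def select_style(current: int, intelligibility: float,
                 minimum: float = 0.6, comfortable: float = 0.8) -> int:
    """Return the index of the next dialog style.

    `minimum` and `comfortable` are assumed thresholds on an assumed
    0..1 intelligibility measure; they are not values from the patent.
    """
    if intelligibility < minimum and current < len(STYLES) - 1:
        return current + 1  # reduced intelligibility: favour intelligibility
    if intelligibility > comfortable and current > 0:
        return current - 1  # improved intelligibility: favour naturalness
    return current


def provide_text(style_index: int, **slots: str) -> str:
    """Render an utterance in the selected dialog style."""
    return TEMPLATES[STYLES[style_index]].format(**slots)


def run_dialog(measures: list[float], start: int = 0) -> list[str]:
    """Apply the selection rule to a sequence of intelligibility measures."""
    style = start
    history = []
    for measure in measures:
        style = select_style(style, measure)
        history.append(STYLES[style])
    return history


if __name__ == "__main__":
    print(run_dialog([0.9, 0.5, 0.4, 0.7, 0.9, 0.95]))
    print(provide_text(2, time="10:30"))
```

The gap between the two assumed thresholds acts as a dead band, so a measure hovering near the minimum does not cause the style to flip on every utterance.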
18 Claims
1. Speech synthesis apparatus comprising:
a dialog-style selection arrangement responsive to at least one factor affecting intelligibility of speech output as heard by a user, to select a dialog style intended to provide at least a minimum level of intelligibility;
a speech-application text provider arranged to provide text-form utterances for a current speech application in the dialog style selected by the selection arrangement;
a text-to-speech converter arranged to convert text-form utterances received from the speech-application text provider into speech form and arranged to generate the said at least one factor; and
wherein the selection arrangement is operative to select a dialog style intended to balance intelligibility and naturalness whilst maintaining said minimum level of intelligibility whereby changes in said at least one factor indicating improved intelligibility of speech output lead to changes in dialog style in favor of naturalness whilst changes in said at least one factor indicating reduced intelligibility of speech output lead to changes in dialog style in favor of intelligibility.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8)
9. A method of generating speech output for a current speech application comprising the steps of:
(a) in dependence on at least one factor affecting intelligibility of speech output as heard by a user, dynamically selecting a dialog style intended to provide at least a minimum level of intelligibility;
(b) providing text-form utterances for a current speech application in the dialog style selected in step (a); and
(c) converting the text-form utterances into speech form and generating the said at least one factor based on converting the text-form utterances into speech form; and
wherein step (a) is effected in a manner so as to balance intelligibility and naturalness whilst maintaining said minimum level of intelligibility whereby changes in said at least one factor indicating improved intelligibility of speech output lead to changes in dialog style in favor of naturalness whilst changes in said at least one factor indicating reduced intelligibility of speech output lead to changes in dialog style in favor of intelligibility.
(Dependent claims: 10, 11, 12, 13, 14, 15, 16)
17. Speech synthesis apparatus comprising:
a dialog-style selection arrangement responsive to at least one factor affecting intelligibility of speech output as heard by a user, to select a dialog style intended to provide at least a minimum level of intelligibility;
a speech-application text provider arranged to provide text-form utterances for a current speech application in the dialog style selected by the selection arrangement;
a text-to-speech converter arranged to convert text-form utterances received from the speech-application text provider into speech form and arranged to generate the said at least one factor; and
wherein the said at least one factor is a measure of the intelligibility of the speech form actually produced by the text-to-speech converter,
wherein the text-to-speech converter is arranged to generate, in the course of converting a text-form utterance into speech form, values of predetermined features that are indicative of the intelligibility of the speech form of the utterance, the selection arrangement comprising:
a classifier responsive to the feature values generated by the text-to-speech converter to provide a measure of the intelligibility of the speech form of the utterance concerned; and
a comparator for comparing the measure produced by the classifier against one or more stored threshold values, in order to select the dialog style.
18. A method of generating speech output for a current speech application comprising the steps of:
(a) in dependence on at least one factor affecting intelligibility of speech output as heard by a user, dynamically selecting a dialog style intended to provide at least a minimum level of intelligibility;
(b) providing text-form utterances for a current speech application in the dialog style selected in step (a); and
(c) converting the text-form utterances into speech form and generating the said at least one factor based on converting the text-form utterances into speech form; and
wherein step (a) is effected in a manner so as to balance intelligibility and naturalness whilst maintaining said minimum level of intelligibility whereby changes in said at least one factor indicating improved intelligibility of speech output lead to changes in dialog style in favor of naturalness whilst changes in said at least one factor indicating reduced intelligibility of speech output lead to changes in dialog style in favor of intelligibility;
wherein the said at least one factor is a measure of the intelligibility of the speech form actually produced by the text-to-speech conversion;
wherein step (c) involves generating, in the course of converting a text-form utterance into speech form, values of predetermined features that are indicative of the intelligibility of the speech form of the utterance, step (a) involving:
using a classifier responsive to the said values of predetermined features to provide a measure of the intelligibility of the speech form of the utterance concerned; and
comparing the measure produced by the classifier against one or more stored threshold values, in order to select the dialog style.
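The classifier and comparator recited in claims 17 and 18 might be sketched as follows. This is a hedged illustration only: the feature names, weights, bias, threshold values and style names are hypothetical, and a simple logistic function stands in for whatever classifier an implementation would actually train. The claims do not specify any of these details.

```python
import math
from bisect import bisect_right

# Hypothetical features a text-to-speech converter might report while
# converting an utterance, with hypothetical hand-picked weights.
WEIGHTS = {
    "join_cost": -1.5,           # assumed: rougher unit joins reduce intelligibility
    "out_of_vocabulary": -2.0,   # assumed: guessed pronunciations reduce it
    "prosody_confidence": 1.0,   # assumed: confident prosody improves it
}
BIAS = 1.0

# Stored threshold values (ascending) and the style chosen for each band.
# Both lists are assumptions for illustration.
THRESHOLDS = [0.4, 0.7]
STYLE_BY_BAND = ["terse", "plain", "natural"]


def classify(features: dict[str, float]) -> float:
    """Map feature values to an intelligibility measure in (0, 1)."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def compare(measure: float) -> str:
    """Compare the measure against the stored thresholds to select a style."""
    return STYLE_BY_BAND[bisect_right(THRESHOLDS, measure)]


if __name__ == "__main__":
    clean = {"join_cost": 0.0, "out_of_vocabulary": 0.0, "prosody_confidence": 0.0}
    rough = {"join_cost": 1.0, "out_of_vocabulary": 1.0, "prosody_confidence": 0.0}
    for name, feats in (("clean", clean), ("rough", rough)):
        measure = classify(feats)
        print(name, round(measure, 3), compare(measure))
```

Keeping the thresholds in a sorted list means that adding a further dialog style only requires one more threshold and one more entry in the band table.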
Specification