System-effected text annotation for expressive prosody in speech synthesis and recognition
Abstract
The inventive system can automatically annotate the relationship of text and acoustic units for the purposes of: (a) predicting how the text is to be pronounced as expressively synthesized speech, and (b) improving the proportion of expressively uttered speech as correctly identified text representing the speaker's message. The system can automatically annotate text corpora for relationships of uttered speech for a particular speaking style and for acoustic units in terms of context and content of the text to the utterances. The inventive system can use kinesthetically defined expressive speech production phonetics that are recognizable and controllable according to kinesensic feedback principles. In speech synthesis embodiments of the invention, the text annotations can specify how the text is to be expressively pronounced as synthesized speech. Also, acoustically-identifying features for dialects or mispronunciations can be identified so as to expressively synthesize alternative dialects or stylistic mispronunciations for a speaker from a given text. In speech recognition embodiments of the invention, each text annotation can be uniquely identified from the corresponding acoustic features of a unit of uttered speech to correctly identify the corresponding text. By employing a method of rules-based text annotation, the invention enables expressiveness to be altered to reflect syntactic, semantic, and/or discourse circumstances found in text to be synthesized or in an uttered message.
20 Claims
1. A computer-implemented rule-based method of synthesizing speech from text, the method comprising:
- inputting text to be machine spoken to a computerized system;
- system identifying of segmental units and suprasegmental units in the text;
- system annotating of the text to indicate the system-identified segmental units; and
- system generation of synthesized speech modulated according to the annotations in the text;
wherein the system annotating of the text comprises identifying of discourse-givenness, contrastiveness, and/or cue phrase lookups to identify and annotate text with discourse prominence or discourse non-prominence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
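For illustration only, and not as part of the claim language, the discourse annotation named in claim 1 might be sketched as below. The cue phrase list, the function-word list, the tag names, and the rule that a repeated content word counts as "given" are all assumptions made for this sketch; the source does not enumerate any of them, and contrastiveness is not modeled here.

```python
import re

# Hypothetical cue-phrase list; the source does not enumerate one.
CUE_PHRASES = {"however", "but", "instead"}
# Hypothetical function words that are excluded from givenness tracking.
FUNCTION_WORDS = {"a", "an", "the", "is", "was", "of", "to", "and", "in", "it"}


def annotate_discourse(text):
    """Tag each word as 'cue', 'neutral', 'prominent' (new) or 'non-prominent' (given)."""
    seen = set()          # content words already mentioned in the discourse
    annotated = []
    for token in re.findall(r"[A-Za-z']+", text):
        word = token.lower()
        if word in CUE_PHRASES:
            tag = "cue"                # found by cue phrase lookup
        elif word in FUNCTION_WORDS:
            tag = "neutral"
        elif word in seen:
            tag = "non-prominent"      # discourse-given: mentioned before
        else:
            tag = "prominent"          # discourse-new: first mention
            seen.add(word)
        annotated.append((token, tag))
    return annotated
```

Under these assumptions, the second mention of "dog" in "The dog barked. However, the dog slept." is tagged non-prominent, and "However" is tagged as a cue phrase.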
12. A computer-implemented rule-based method of synthesizing speech from text, the method comprising:
- inputting text to be machine spoken to a computerized system;
- system identifying of segmental units and suprasegmental units in the text;
- system annotating of the text to indicate the system-identified segmental units; and
- system generation of synthesized speech modulated according to the annotations in the text;
wherein the system annotating of the text comprises dividing of a text sentence into groups of meanings and indicating the locations of long and short pauses, employing syntactic constituency, and balance.
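As a rough illustration of the pause annotation in claim 12, the sketch below marks short and long pauses in a sentence. It is not the claimed method: punctuation is used as a crude proxy for syntactic constituency, "balance" is approximated by splitting any overlong group near its midpoint, and the marker strings and the `max_group_words` parameter are hypothetical.

```python
import re

SHORT_PAUSE = "|"    # hypothetical marker for a short pause
LONG_PAUSE = "||"    # hypothetical marker for a long pause


def mark_pauses(sentence, max_group_words=6):
    """Insert a short pause between word groups and a long pause at the sentence end."""
    # Commas and semicolons approximate constituent boundaries (an assumption).
    core = sentence.strip().rstrip(".!?")
    groups = [part.split() for part in re.split(r"[,;]", core) if part.strip()]

    balanced = []
    for words in groups:
        if len(words) > max_group_words:
            # Crude stand-in for balance: split an overlong group near its midpoint.
            mid = len(words) // 2
            balanced.extend([words[:mid], words[mid:]])
        else:
            balanced.append(words)

    body = f" {SHORT_PAUSE} ".join(" ".join(words) for words in balanced)
    return f"{body} {LONG_PAUSE}"
```

For example, `mark_pauses("When the rain stopped, we walked home.")` yields `When the rain stopped | we walked home ||`. A real system would presumably derive the groups from a syntactic parse rather than from punctuation.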
13. A computer-implemented rule-based method of synthesizing speech from text, the method comprising:
- inputting text to be machine spoken to a computerized system;
- system identifying of segmental units and suprasegmental units in the text;
- system annotating of the text to indicate the system-identified segmental units; and
- system generation of synthesized speech modulated according to the annotations in the text;
wherein the system annotating of the text comprises identifying, for each phrase in the text an operative word introducing a new idea to carry the argument forward as the sentences progress, the method employing discourse properties, semantic properties, and/or syntactic properties of relevant words.
Dependent claims: 14
15. A computer-implemented rule-based method of synthesizing speech from text, the method comprising:
- inputting text to be machine spoken to a computerized system;
- system identifying of segmental units and suprasegmental units in the text;
- system annotating of the text to indicate the system-identified segmental units; and
- system generation of synthesized speech modulated according to the annotations in the text;
wherein the system annotating of the text comprises representing intonation contours on a pitch change scale encompassing the pitch range of the speech to be synthesized or recognized.
Dependent claims: 16
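Claim 15 does not say how the pitch change scale is constructed. One possible reading, sketched here purely as an assumption, maps fundamental-frequency values onto a small number of integer levels spaced logarithmically across the speaker's pitch range. The function name, the five-level default, and the logarithmic spacing are all hypothetical choices.

```python
import math


def to_pitch_scale(f0_hz, range_min_hz, range_max_hz, levels=5):
    """Map F0 values (Hz) to integer levels 1..levels spanning the given pitch range."""
    # Width of the pitch range in octaves.
    span = math.log2(range_max_hz / range_min_hz)
    contour = []
    for f in f0_hz:
        # Clamp values that fall outside the stated range.
        f = min(max(f, range_min_hz), range_max_hz)
        # Position within the range on a log scale, from 0.0 to 1.0.
        position = math.log2(f / range_min_hz) / span
        contour.append(1 + round(position * (levels - 1)))
    return contour
```

With a range of 100 Hz to 400 Hz and five levels, the contour `[100, 200, 400]` maps to `[1, 3, 5]`, so a rise of one octave moves two steps on this assumed scale.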
17. A computer-implemented rule-based method of recognizing speech, the method comprising:
- inputting uttered speech to be recognized to a computerized system;
- system comparison of the uttered speech with acoustic units corresponding with annotated text to facilitate identification of text units corresponding with the uttered speech wherein the annotated text comprises the product of system identifying of segmental units and suprasegmental units in the text and system annotating of the text to indicate the system-identified segmental units; and
- outputting text recognized as corresponding with the uttered speech.
Dependent claims: 18, 19, 20
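The comparison step in claim 17 can be pictured, under heavy simplification, as a nearest-match lookup. In the sketch below each annotated text unit is paired with a reference feature vector, and the utterance is assigned to the closest one. The feature vectors, the Euclidean distance measure, and the inventory format are all assumptions; the source does not specify how acoustic units are represented or compared.

```python
import math


def recognize(utterance_features, annotated_units):
    """Return the annotated text unit whose reference features are closest to the utterance.

    annotated_units is a list of (annotated_text, feature_vector) pairs; this
    format is hypothetical and chosen only for the sketch.
    """
    best_text, _ = min(
        annotated_units,
        key=lambda unit: math.dist(unit[1], utterance_features),
    )
    return best_text


# Toy inventory: the two annotations differ only in prominence, so their
# (made-up) features differ in mean pitch and duration.
inventory = [
    ("really [prominent]", (220.0, 0.40)),
    ("really [non-prominent]", (150.0, 0.20)),
]
```

Calling `recognize((210.0, 0.35), inventory)` selects the prominent annotation, since that reference vector lies nearer to the observed features.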
Specification