Training and applying prosody models
First Claim
1. A computer-implementable method for synthesizing audible speech, with varying prosody, from textual content, the method comprising:
- maintaining an inventory of prosody models with lexicons, selecting a subset of multiple prosody models from the inventory of prosody models;
- associating prosody models in the subset of multiple prosody models with different segments of a text based on phrases in the text statistically associated with the lexicons of the prosody models;
- applying the associated prosody models to the different segments of the text to produce prosody annotations of the text;
- considering annotations of the prosody annotations to reconcile conflicting prosody annotations of the text previously produced by multiple prosody models associated with a segment of the text; and
- synthesizing audible speech from the text and the reconciled prosody annotations.
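The claimed steps can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `ProsodyModel` and `Annotation` types, the word-overlap score used as a stand-in for "statistically associated", the highest-score-wins reconciliation rule, and the use of SSML `<prosody>` markup as a stand-in for the final synthesis step.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class ProsodyModel:
    name: str
    lexicon: frozenset  # words associated with this prosody style
    pitch_pct: int      # relative pitch change, in percent
    rate_pct: int       # speaking rate, in percent of normal


@dataclass(frozen=True)
class Annotation:
    model: str
    pitch_pct: int
    rate_pct: int
    score: float        # metadata about the annotation, used to reconcile


NEUTRAL = Annotation("neutral", 0, 100, 0.0)


def select_subset(inventory, names):
    """Select a subset of models from the inventory by name."""
    return [m for m in inventory if m.name in names]


def lexicon_score(segment, model):
    """Assumed association measure: fraction of words found in the lexicon."""
    words = [w.strip(".,!?;:").lower() for w in segment.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in model.lexicon for w in words) / len(words)


def annotate(segment, subset):
    """Apply every associated model; several may annotate one segment."""
    out = []
    for m in subset:
        score = lexicon_score(segment, m)
        if score > 0:
            out.append(Annotation(m.name, m.pitch_pct, m.rate_pct, score))
    return out


def reconcile(candidates):
    """Assumed rule: keep the annotation with the highest score."""
    return max(candidates, key=lambda a: a.score, default=NEUTRAL)


def to_ssml(segments, subset):
    """Render reconciled annotations as SSML for a downstream synthesizer."""
    parts = []
    for seg in segments:
        a = reconcile(annotate(seg, subset))
        parts.append(
            f'<prosody pitch="{a.pitch_pct:+d}%" rate="{a.rate_pct}%">'
            f"{escape(seg)}</prosody>"
        )
    return "<speak>" + "".join(parts) + "</speak>"
```

A segment such as "Wow, great, but sorry." would match both an "excited" and a "somber" lexicon, which is the conflict the reconciliation step resolves. A real system would likely use a trained statistical association and a richer reconciliation rule than this sketch.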
Abstract
Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
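The train-then-apply flow in the abstract can be sketched as follows. This is an illustrative assumption, not the patent's method: the recognizer output is modeled as hypothetical `(word, pitch_hz, duration_ms)` tuples, and the "model" is simply a per-word average with a global fallback for unseen words.

```python
from collections import defaultdict
from statistics import mean


def train(annotated):
    """Build a per-word prosody table from recognizer-annotated text.

    `annotated` is a non-empty list of (word, pitch_hz, duration_ms) tuples,
    a hypothetical format for the recognizer's prosody-annotated output.
    """
    by_word = defaultdict(list)
    for word, pitch_hz, duration_ms in annotated:
        by_word[word.lower()].append((pitch_hz, duration_ms))

    # Average the observed pitch and duration for each word.
    model = {
        w: (mean(p for p, _ in obs), mean(d for _, d in obs))
        for w, obs in by_word.items()
    }

    # Global averages serve as the fallback for words never seen in training.
    all_obs = [o for obs in by_word.values() for o in obs]
    fallback = (mean(p for p, _ in all_obs), mean(d for _, d in all_obs))
    return model, fallback


def predict_prosody(model, fallback, text):
    """Annotate new input text with the trained prosody values."""
    return [(w, *model.get(w.lower(), fallback)) for w in text.split()]
```

Training one such table per speaking style would give the multiple prosody models the abstract mentions, each representing a different style.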
6 Claims
1. A computer-implementable method for synthesizing audible speech, with varying prosody, from textual content, the method comprising:
- maintaining an inventory of prosody models with lexicons, selecting a subset of multiple prosody models from the inventory of prosody models;
- associating prosody models in the subset of multiple prosody models with different segments of a text based on phrases in the text statistically associated with the lexicons of the prosody models;
- applying the associated prosody models to the different segments of the text to produce prosody annotations of the text;
- considering annotations of the prosody annotations to reconcile conflicting prosody annotations of the text previously produced by multiple prosody models associated with a segment of the text; and
- synthesizing audible speech from the text and the reconciled prosody annotations.
Dependent claims: 2, 3, 4, 5, 6
Specification