Training and applying prosody models

US 8,374,873 B2
Filed: 08/11/2009
Issued: 02/12/2013
Est. Priority Date: 08/12/2008
Status: Active Grant

Abstract

Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
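
The train-then-apply flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `AnnotatedWord` fields, the `ProsodyModel` class, and the per-word averaging are hypothetical and are not taken from the patent, which does not prescribe a particular model form.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AnnotatedWord:
    """One recognized word plus prosody values (hypothetical fields)."""
    text: str
    pitch_hz: float
    duration_ms: float


@dataclass
class ProsodyModel:
    """Toy model: remembers the mean pitch and duration seen for each word."""
    style: str
    # word -> [pitch sum, duration sum, occurrence count]
    sums: dict = field(default_factory=lambda: defaultdict(lambda: [0.0, 0.0, 0]))

    def train(self, annotated_text):
        """Accumulate prosody statistics from recognizer output."""
        for word in annotated_text:
            entry = self.sums[word.text.lower()]
            entry[0] += word.pitch_hz
            entry[1] += word.duration_ms
            entry[2] += 1

    def annotate(self, text, default=(120.0, 200.0)):
        """Return prosody annotations for plain input text.

        Words never seen in training fall back to the assumed default values.
        """
        result = []
        for token in text.split():
            pitch_sum, dur_sum, count = self.sums.get(token.lower(), (0.0, 0.0, 0))
            if count:
                result.append(AnnotatedWord(token, pitch_sum / count, dur_sum / count))
            else:
                result.append(AnnotatedWord(token, *default))
        return result
```

A model trained on recognizer output for one speaking style could then annotate new text with `ProsodyModel(style="excited").annotate("Really good")`, and a synthesizer would consume the resulting annotations.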

12 Claims

1. A computer-implementable method for synthesizing audible speech, with varying prosody, from textual content, the method comprising:
- generating first texts annotated with prosody information, the first texts and prosody information generated by a speech recognition engine applied to speech inputs and parameters;
  
  training an inventory of prosody models with the first texts annotated with prosody information, wherein the prosody models are associated with the parameters;
  
  selecting a subset of multiple prosody models from the inventory of prosody models;
  
  associating prosody models in the subset of multiple prosody models with different segments of a second text;
  
  applying the associated prosody models to the different segments of the second text to produce prosody annotations for the second text;
  
  reconciling conflicting prosody annotations from multiple prosody models associated with a segment of the second text; and
  
  synthesizing audible speech from the second text and the reconciled prosody annotations.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The method of claim 1, wherein the parameters comprise the roles and geographical locations of multiple speakers who uttered the speech inputs.
  - 3. The method of claim 2, wherein the parameters further comprise emotional designators.
  - 4. The method of claim 2, wherein the parameters further comprise relative weights.
  - 5. The method of claim 4, wherein the parameters comprise keywords and model selection is based on the keywords in the second text.
  - 6. The method of claim 5, wherein each prosody model is associated with a confidence level.
  - 7. The method of claim 6, wherein the reconciliation is based on a reconciliation policy.
  - 8. The method of claim 7, wherein the reconciliation policy selects annotations based on the confidence levels of the models that produced conflicting annotations for the segment of the second text.
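
One way to picture the confidence-based reconciliation policy of claims 6 through 8 is the sketch below. It is only an illustration under stated assumptions: the `Annotation` fields, the `pitch_shift` value, and the tie-breaking rule (the first annotation seen wins) are hypothetical choices, not details given in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    segment_id: int      # which segment of the second text this applies to
    model_name: str      # hypothetical identifier of the producing model
    pitch_shift: float   # illustrative prosody value
    confidence: float    # confidence level of the producing model


def reconcile_by_confidence(annotations):
    """Keep, per segment, the annotation from the most confident model.

    On equal confidence the earlier annotation is kept (an assumed tie-break).
    """
    best = {}
    for ann in annotations:
        current = best.get(ann.segment_id)
        if current is None or ann.confidence > current.confidence:
            best[ann.segment_id] = ann
    return best
```

Segments annotated by a single model pass through unchanged, so the same function can be applied to every segment of the second text before synthesis.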

9. A computer-implementable method for synthesizing audible speech, with varying prosody, from textual content, the method comprising:
- selecting multiple prosody models from a prosody model inventory based on keywords in a first text annotated with prosody information and parameters;
  
  training the multiple prosody models with the first text annotated with prosody information;
  
  choosing a subset of multiple prosody models from the prosody model inventory;
  
  associating prosody models in the subset of multiple prosody models with different segments of a second text;
  
  applying the associated prosody models to the different segments of the second text to produce prosody annotations for the second text;
  
  reconciling conflicting prosody annotations for a segment of the second text based on confidence levels and prosody model weights; and
  
  synthesizing audible speech from the second text and the reconciled prosody annotations.
- Dependent claims (10, 11, 12)
  - 10. The method of claim 9, wherein a model manager chooses the subset of multiple prosody models and the model manager applies the associated prosody models to the different segments of the second text.
  - 11. The method of claim 10 further comprising:
    - building lexicons of words that are statistically associated with the prosody models and the parameters and analyzing the second text using the lexicons to inform the association of prosody models with the different segments of the second text.
  - 12. The method of claim 11 wherein the model manager uses the lexicons and input text to select prosody models for model training.
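
The lexicon step of claim 11 could be sketched roughly as follows. The function names, the input shape (a dict from model name to training text), and the scoring rule (the share of a word's occurrences that come from one model's text) are all assumptions made for illustration; the claim does not specify how the statistical association is computed. Tokenization is a plain whitespace split, so punctuation is not handled.

```python
from collections import Counter


def build_lexicons(training_texts, top_n=3):
    """Map each model name to the words most over-represented in its training text.

    training_texts: dict of model name -> training text (hypothetical input shape).
    """
    counts = {name: Counter(text.lower().split()) for name, text in training_texts.items()}
    total = Counter()
    for c in counts.values():
        total.update(c)
    lexicons = {}
    for name, c in counts.items():
        # Score: share of a word's occurrences that come from this model's text.
        # Ties are broken alphabetically so the result is deterministic.
        ranked = sorted(c, key=lambda w: (-c[w] / total[w], w))
        lexicons[name] = set(ranked[:top_n])
    return lexicons


def associate_models(segments, lexicons):
    """Associate each segment with every model whose lexicon it overlaps."""
    result = []
    for segment in segments:
        words = set(segment.lower().split())
        matches = sorted(name for name, lex in lexicons.items() if words & lex)
        result.append((segment, matches))
    return result
```

A segment that matches several lexicons is associated with several models, which is the situation where conflicting annotations would later need to be reconciled.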

Current Assignee: Morphism LLC
Original Assignee: Morphism LLC
Inventors: Stephens, Jr., James H.
Primary Examiner(s): GUERRA-ERAZO, EDGAR X
Application Number: US12/538,970
Publication Number: US 20100042410A1
Time in Patent Office: 1,281 Days
Field of Search: 704/260, 704/261, 704/235, 704/258, 704/254, 704/267, 704/266, 704/268, 704/269, 704/247, 704/270, 704/275, 704/276, 704/249, 704/231, 704/246, 704/234
US Class Current: 704/260
CPC Class Codes:
G10L 13/08   Text analysis or generation...
G10L 13/10   Prosody rules derived from ...
G10L 15/063   Training
G10L 15/1807   using prosody or stress
