Text selection and recording by feedback and adaptation for development of personalized text-to-speech systems

US 20020193994A1
Filed: 03/30/2001
Published: 12/19/2002
Est. Priority Date: 03/30/2001
Status: Active Grant

Abstract

A new speaker provides speech from which comparison snippets are extracted. The comparison snippets are compared with initial snippets stored in a recorded snippet database that is associated with a concatenative synthesizer. The comparison of the snippets to the initial snippets produces required sound units. A greedy selection algorithm is performed with the required sound units for identifying the smallest subset of the input text that contains all of the text for the new speaker to read. The new speaker then reads the optimally selected text and sound units are extracted from the human speech such that the recorded snippet database is modified and the speech synthesized adopts the voice quality and characteristics of the new speaker.
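The greedy selection step named in the abstract can be illustrated with a minimal sketch. It assumes each candidate sentence has already been mapped to the set of sound units it contains; the function name `greedy_select`, the sentence ids, and the single-letter unit labels are hypothetical and do not come from the patent. Greedy selection of this kind yields a small covering set, though not necessarily the provably smallest one.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], required: Set[str]) -> List[str]:
    """Pick sentences until every coverable required unit is covered."""
    remaining = set(required)
    chosen: List[str] = []
    while remaining and candidates:
        # Take the sentence that covers the most still-missing units.
        best = max(candidates, key=lambda s: len(candidates[s] & remaining))
        gained = candidates[best] & remaining
        if not gained:
            break  # no candidate covers what is left
        chosen.append(best)
        remaining -= gained
    return chosen


# Hypothetical toy data: sentence id -> sound units it contains.
sentences = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"d", "e", "f"},
    "s4": {"a", "f"},
}
selected = greedy_select(sentences, {"a", "b", "c", "d", "e", "f"})
```

With this toy data the sketch picks `s1` and then `s3`, which together cover all six required units, so the new speaker would read two sentences instead of four.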

17 Claims

1. A voice adaptation system for use with a text-to-speech synthesizer, comprising:
- a recorded snippet database having initial snippets;
  
  a comparison snippets set based on speech from a new speaker;
  
  wherein the comparison snippets are used to provide a comparison with current snippets, the comparison is based on evaluating voice quality; and
  
  new speaker text for adapting the voice quality of the text-to-speech synthesizer, the new speaker text based on the comparison.
- Dependent claims (2, 3, 4, 5):
  - 2. The system of claim 1 wherein the new speaker text is characterized as the smallest subset of text representative of the required sound units.
  - 3. The system of claim 1 wherein the new speaker text is produced by greedy selection.
  - 4. The system of claim 1 wherein the comparison snippet set includes allophones.
  - 5. The system of claim 1 further includes a microphone for inputting new speaker text.

6. A voice adaptation system for use with a text-to-speech synthesizer, comprising:
- a recorded snippet database having initial snippets;
  
  a comparison snippet set based on speech from a new speaker;
  
  required sound units for forming new speaker text;
  
  wherein the required sound units are generated from a comparison of the snippet set with the recorded snippet; and
  
  text for adapting the recorded snippet database so that synthesized speech has a voice quality of the new speaker, the text provided by an optimal selection algorithm for selecting a limited amount of text representative of the required sound units.
- Dependent claims (7, 8, 9, 10):
  - 7. The system of claim 6 wherein the initial snippets are replaced with extracted snippets obtained from the text.
  - 8. The system of claim 6 wherein the optimal selection algorithm is greedy selection.
  - 9. The system of claim 6 wherein the comparison snippet set includes allophones.
  - 10. The system of claim 6 further includes a microphone for inputting new speaker text.
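Claims 1 and 6 derive the required sound units from a comparison between the stored snippets and the new speaker's comparison snippets, without fixing a particular measure in the text shown here. The sketch below is one possible reading, under loud assumptions: each snippet is reduced to a small feature vector, the comparison is a Euclidean distance, a unit is required when the distance exceeds a threshold, and a unit with no sample from the new speaker is also treated as required. The names `distance`, `required_units`, the unit labels, and the threshold value are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Features = Sequence[float]


def distance(a: Features, b: Features) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def required_units(database: Dict[str, Features],
                   comparison: Dict[str, Features],
                   threshold: float) -> List[str]:
    """List the units whose stored snippet should be re-recorded."""
    needed = []
    for unit, stored in database.items():
        sample = comparison.get(unit)
        # Assumption: a missing sample, or one too far from the stored
        # snippet, means the unit must be recorded by the new speaker.
        if sample is None or distance(stored, sample) > threshold:
            needed.append(unit)
    return needed


# Hypothetical toy data: unit label -> feature vector.
database = {"aa": (1.0, 2.0), "iy": (0.0, 0.0), "s": (3.0, 4.0)}
comparison = {"aa": (1.0, 2.1), "iy": (3.0, 4.0)}
needed = required_units(database, comparison, threshold=1.0)
```

Here `aa` is close enough to keep, `iy` differs too much, and `s` has no sample from the new speaker, so the required units are `iy` and `s`.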

11. A method for adapting the voice quality of a text-to-speech synthesizer having a recorded snippet database, comprising:
- obtaining a comparison snippets set based on speech from a new speaker;
  
  retrieving initial snippets from the recorded snippet database;
  
  providing required sound units for generating text;
  
  wherein the required sound units are based on a comparison of the initial snippets to the comparison snippet set; and
  
  generating text for the new speaker to read, the text is a smallest subset that contains the required sound units.
- Dependent claims (12, 13, 14, 15):
  - 12. The method of claim 11 wherein the new speaker text is produced by greedy selection.
  - 13. The method of claim 11 wherein the comparison snippet set includes allophones.
  - 14. The method of claim 11 further includes the steps of:
    - obtaining new speech from the new speaker, the new speech based on the text;
      
      extracting new snippets from the new speech; and
      
      modifying the recorded snippet database with the new snippets.
  - 15. The method of claim 14 wherein the initial snippets are based on text optimally selected to represent sound units.

16. A method of constructing a speech synthesizer comprising the steps of:
- obtaining a corpus labeled recorded speech containing a plurality of allophones in a plurality of contexts;
  
  performing greedy selection on said corpus to extract a portion of said plurality of allophones based on contextual information;
  
  using said portion of said plurality of allophones to generate synthesis model components of a speech synthesizer.
- Dependent claims (17):
  - 17. The method of claim 16 further comprising analyzing said plurality of allophones from said portion to construct source-filter model components used to construct said speech synthesizer.
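Claim 16 selects allophones "based on contextual information" from a labeled corpus. One way to picture the units such a selection could operate on is sketched below. It assumes, purely for illustration, that the context of an allophone is its immediate left and right neighbor in the labeled sequence, with a boundary marker at utterance edges; the patent text shown here does not define context this narrowly. The name `context_units`, the `#` boundary symbol, and the phone labels are hypothetical.

```python
from typing import List, Set, Tuple

# (left neighbor, allophone, right neighbor)
Unit = Tuple[str, str, str]


def context_units(phones: List[str], boundary: str = "#") -> Set[Unit]:
    """Turn a labeled phone sequence into context-dependent units."""
    padded = [boundary] + phones + [boundary]
    return {
        (padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    }


# Hypothetical labeled utterance.
units = context_units(["k", "ae", "t"])
```

Each utterance in a corpus could be mapped to such a set, and those sets would then serve as the per-utterance coverage that a greedy selection compares against the units still missing.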

Current Assignee: Sovereign Peak Ventures, LLC (Dominion Harbor Enterprises, LLC)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Junqua, Jean-Claude; Hanson, Brian; Pearson, Steven; Kibre, Nicholas
Granted Patent: US 6,792,407 B2
US Class Current: 704/260
CPC Class Codes:
G10L 13/04 Details of speech synthesis...
G10L 13/047 Architecture of speech synt...
