System and method for repairing a TTS voice database

US 7,742,919 B1
Filed: 09/27/2005
Issued: 06/22/2010
Est. Priority Date: 09/27/2005
Status: Expired due to Fees

Abstract

The present invention provides various elements of a toolkit used for generating a TTS voice for use in a spoken dialog system. The embodiments in each case may be in the form of the system, a computer-readable medium or a method for generating the TTS voice. One embodiment of the invention relates to a method of correcting a database associated with the development of a text-to-speech (TTS) voice. The method comprises: generating a pronunciation dictionary for use with a TTS voice; generating a TTS voice to a stage wherein it is prepared to be tested before being deployed; identifying mislabeled phonetic units associated with the TTS voice; for each identified mislabeled phonetic unit, linking to an entry within the pronunciation dictionary to correct the entry; and deleting utterances and all associated data for unacceptable utterances.
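The correction pass summarized in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the `Utterance` and `VoiceDatabase` types, the `repair` function, the `acceptable` flag, and the phoneme symbols are all hypothetical names chosen for the example, and the record does not say how mislabeled units or unacceptable utterances are detected.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Utterance:
    """One recorded utterance plus its associated data (text, audio, labels)."""
    utt_id: str
    text: str
    audio_path: str
    labels: list[str]
    acceptable: bool = True  # hypothetical flag set by an earlier review step


@dataclass
class VoiceDatabase:
    """Hypothetical container: pronunciation dictionary plus utterances."""
    lexicon: dict[str, list[str]]
    utterances: dict[str, Utterance] = field(default_factory=dict)


def repair(db: VoiceDatabase,
           mislabeled: list[tuple[str, str]],
           corrections: dict[str, list[str]]) -> tuple[list[str], list[str]]:
    """Correct dictionary entries for mislabeled units, then drop bad utterances.

    mislabeled  -- (utterance id, word) pairs flagged as mislabeled
    corrections -- word -> corrected phoneme sequence
    Returns (corrected words, removed utterance ids), both sorted.
    """
    corrected = set()
    for utt_id, word in mislabeled:
        # "Link" the flagged unit to its dictionary entry and correct that entry.
        if utt_id in db.utterances and word in corrections:
            db.lexicon[word] = corrections[word]
            corrected.add(word)

    # Delete unacceptable utterances together with all their associated data.
    removed = sorted(uid for uid, utt in db.utterances.items() if not utt.acceptable)
    for uid in removed:
        del db.utterances[uid]
    return sorted(corrected), removed


db = VoiceDatabase(lexicon={"tomato": ["t", "ah", "m", "ey", "t", "ow"]})
db.utterances["u1"] = Utterance("u1", "a ripe tomato", "u1.wav", ["t", "ah", "m", "ey", "t", "ow"])
db.utterances["u2"] = Utterance("u2", "noisy take", "u2.wav", [], acceptable=False)

corrected, removed = repair(
    db,
    mislabeled=[("u1", "tomato")],
    corrections={"tomato": ["t", "ah", "m", "aa", "t", "ow"]},
)
```

In this sketch a single call to `repair` stands in for the single user input: every flagged unit is resolved to its dictionary entry and every unacceptable utterance is removed without further interaction.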

18 Claims

1. A computer implemented method of correcting a database associated with the development of a text-to-speech (TTS) voice, the method comprising:
- generating via a processor a pronunciation dictionary for use with a TTS voice;
  
  generating via the processor a TTS voice to a stage wherein it is prepared to be tested before being deployed;
  
  receiving a single user input to identify all mislabeled phonetic units associated with the TTS voice at the stage wherein it is prepared to be tested before being deployed;
  
  for each identified mislabeled phonetic unit, linking without additional user input to an entry within the pronunciation dictionary to correct the entry; and
  
  deleting, without additional user input, from the pronunciation dictionary utterances and all associated data for unacceptable utterances.
  - 2. The method of claim 1, wherein the associated data is at least one of text, audio and labels.
  - 3. The method of claim 1, wherein deleting utterances and all associated data is performed automatically.
  - 4. The method of claim 1, wherein deleting utterances and all associated data occurs further for those utterances that cannot be successfully aligned by automatic speech recognition (ASR).
  - 5. The method of claim 1, further comprising:
    - correcting speaker dependent entries in the pronunciation database; and
      
      rerunning ASR on all utterances containing the offending word.
  - 6. The method of claim 5, wherein a voice-builder module can automatically review only utterances that contain the offending word.
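Dependent claims 4 to 6 describe rerunning ASR only on utterances that contain a corrected ("offending") word, and deleting utterances that cannot be aligned. A minimal sketch of that selection step is given below, under stated assumptions: `utterances_containing`, `realign_after_fix`, and the `align` callable are hypothetical names, `align` is a stand-in for whatever ASR forced-alignment tool is used, and whole-word, case-insensitive matching is a choice made for this example rather than something the claims specify.

```python
from __future__ import annotations

import re
from typing import Callable, Optional


def utterances_containing(texts: dict[str, str], word: str) -> list[str]:
    """Return sorted ids of utterances whose text contains `word` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sorted(uid for uid, text in texts.items() if pattern.search(text))


def realign_after_fix(
    texts: dict[str, str],
    word: str,
    align: Callable[[str, str], Optional[list[str]]],
) -> tuple[dict[str, list[str]], list[str]]:
    """Rerun alignment only on utterances containing the corrected word.

    `align(utt_id, text)` is assumed to return a label sequence, or None when
    the utterance cannot be aligned. Utterances that fail are removed from
    `texts`. Returns (new labels by utterance id, ids of dropped utterances).
    """
    labels: dict[str, list[str]] = {}
    dropped: list[str] = []
    for uid in utterances_containing(texts, word):
        result = align(uid, texts[uid])
        if result is None:
            dropped.append(uid)
            del texts[uid]  # safe: we iterate over a separate list of ids
        else:
            labels[uid] = result
    return labels, dropped


texts = {
    "u1": "Read the book",
    "u2": "I have read it",
    "u3": "Reading aloud",
    "u4": "Hello",
}


def fake_align(utt_id: str, text: str) -> Optional[list[str]]:
    """Stand-in aligner for the example: fails on u2, otherwise splits words."""
    return None if utt_id == "u2" else text.lower().split()


labels, dropped = realign_after_fix(texts, "read", fake_align)
```

Restricting the rerun to the affected utterances keeps the repair cost proportional to how often the corrected word occurs, rather than to the size of the whole recording set.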

7. A computing device for correcting a database associated with the development of a text-to-speech (TTS) voice, the computing device comprising:
- a processor;
  
  a module configured to control the processor to generate a pronunciation dictionary for use with a TTS voice;
  
  a module configured to control the processor to generate a TTS voice to a stage wherein it is prepared to be tested before being deployed;
  
  a module configured to control the processor to receive a single user input to identify all mislabeled phonetic units associated with the TTS voice at the stage wherein it is prepared to be tested before being deployed;
  
  a module configured to control the processor, for each identified mislabeled phonetic unit, to link to without additional user input an entry within the pronunciation dictionary to correct the entry; and
  
  a module configured to control the processor to delete, without additional user input, from the pronunciation dictionary utterances and all associated data for unacceptable utterances.
  - 8. The computing device of claim 7, wherein the associated data is at least one of text, audio and labels.
  - 9. The computing device of claim 7, wherein deleting utterances and all associated data is performed automatically.
  - 10. The computing device of claim 7, wherein deleting utterances and all associated data occurs further for those utterances that cannot be successfully aligned by automatic speech recognition (ASR).
  - 11. The computing device of claim 7, further comprising:
    - a module configured to control the processor to correct speaker dependent entries in the pronunciation database; and
      
      a module configured to control the processor to rerun ASR on all utterances containing the offending word.
  - 12. The computing device of claim 11, wherein a voice-builder module can automatically review only utterances that contain the offending word.

13. A non-transitory computer-readable storage medium storing instructions for controlling a computing device to correct a database associated with the development of a text-to-speech (TTS) voice, the instructions comprising:
- generating via a processor a pronunciation dictionary for use with a TTS voice;
  
  generating via a processor a TTS voice to a stage wherein it is prepared to be tested before being deployed;
  
  receiving a single user input to identify all mislabeled phonetic units associated with the TTS voice at the stage wherein it is prepared to be tested before being deployed;
  
  for each identified mislabeled phonetic unit, linking without additional user input to an entry within the pronunciation dictionary to correct the entry; and
  
  deleting, without additional user input, from the pronunciation dictionary utterances and all associated data for unacceptable utterances.
  - 14. The non-transitory computer-readable storage medium of claim 13, wherein the associated data is at least one of text, audio and labels.
  - 15. The non-transitory computer-readable storage medium of claim 13, wherein deleting utterances and all associated data is performed automatically.
  - 16. The non-transitory computer-readable storage medium of claim 13, wherein deleting utterances and all associated data occurs further for those utterances that cannot be successfully aligned by automatic speech recognition (ASR).
  - 17. The non-transitory computer-readable storage medium of claim 13, the instructions further comprising:
    - correcting speaker dependent entries in the pronunciation database; and
      
      rerunning ASR on all utterances containing the offending word.
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein a voice-builder module can automatically review only utterances that contain the offending word.

Current Assignee: Cerence Inc., Cerence Operating Company (Cerence Inc.)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Davis, Steven Lawrence; Schulz, David Eugene; Loney, Louise; Fetters, Shane; Gustafson, Beverly
Primary Examiner(s): Vo, Huyen X.
Application Number: US11/235,857
Time in Patent Office: 1,729 Days
Field of Search: 704/220, 704/251, 704/231, 704/235, 704/244, 704/243, 704/258, 704/266, 704/245, 704/246, 704/260, 704/255, 704/261, 704/270
US Class Current: 704/260
CPC Class Codes:
G10L 13/06 Elementary speech units use...
G10L 15/187 Phonemic context, e.g. pron...
