System and method for preparing a pronunciation dictionary for a text-to-speech voice

US 7,630,898 B1
Filed: 09/27/2005
Issued: 12/08/2009
Est. Priority Date: 09/27/2005
Status: Active Grant

Abstract

Disclosed are various elements of a toolkit used for generating a TTS voice for use in a spoken dialog system. The embodiments in each case may be in the form of the system, a computer-readable medium or a method for generating the TTS voice. One embodiment of the invention relates to a method of generating a database for a TTS voice. The method comprises matching every spoken word associated with a TTS voice database with a smallest set of possible pronunciations for each word. The smallest set is generated by automatically determining a dialect and linguistic context using linguistic rules, empirically determining idiosyncratic speaker characteristics and determining a subject domain. The method further comprises dynamically generating a pronunciation dictionary on a word-by-word basis using the smallest set.
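The narrowing step described in the abstract can be illustrated with a minimal sketch. It assumes that candidate pronunciations are already tagged with the dialects and subject domains in which they are plausible, and that speaker idiosyncrasies are available as a table of observed preferred variants. The names `Pronunciation`, `smallest_set`, `build_dictionary`, the phone notation, and the sample lexicon are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pronunciation:
    phones: str           # space-separated phone string (illustrative notation)
    dialects: frozenset   # dialects in which this variant is plausible
    domains: frozenset    # subject domains; an empty set means "any domain"


def _narrow(items, keep):
    """Filter items, but fall back to the input if nothing survives."""
    kept = [item for item in items if keep(item)]
    return kept or list(items)


def smallest_set(word, candidates, dialect, domain, speaker_prefs):
    """Narrow candidates by dialect, then subject domain, then speaker habit."""
    kept = _narrow(candidates, lambda p: dialect in p.dialects)
    kept = _narrow(kept, lambda p: not p.domains or domain in p.domains)
    preferred = speaker_prefs.get(word)  # empirically observed variant, if any
    if preferred is not None:
        kept = _narrow(kept, lambda p: p.phones == preferred)
    return kept


def build_dictionary(words, lexicon, dialect, domain, speaker_prefs):
    """Generate the dictionary one word at a time from the narrowed sets."""
    return {
        w: smallest_set(w, lexicon.get(w, []), dialect, domain, speaker_prefs)
        for w in words
    }


# Hypothetical sample data, for illustration only.
US, GB = frozenset({"en-US"}), frozenset({"en-GB"})
ANY = frozenset()
lexicon = {
    "tomato": [Pronunciation("t ah m ey t ow", US, ANY),
               Pronunciation("t ah m aa t ow", GB, ANY)],
    "data": [Pronunciation("d ey t ah", US, ANY),
             Pronunciation("d ae t ah", US, ANY)],
}
voice_dict = build_dictionary(["tomato", "data"], lexicon, "en-US", "general",
                              {"data": "d ae t ah"})
```

In this sketch each filter falls back to its input when it would otherwise leave a word with no pronunciation at all. That fallback is a design choice of the example, not something the abstract specifies.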

20 Claims

1. A computer-implemented method of generating a database for a text-to-speech (TTS) voice, the method comprising:
- matching via a processor every spoken word associated with a TTS voice database with a smallest set of possible pronunciations for each word, the smallest set being generated by:
  - automatically via the processor determining a dialect and linguistic context using linguistic rules;
  - empirically determining idiosyncratic speaker characteristics; and
  - determining a subject domain; and
- dynamically generating a pronunciation dictionary on a word-by-word basis using the smallest set.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 15):
  - 2. The computer-implemented method of claim 1, wherein the dynamically generated pronunciation dictionary accounts for reading errors and idiosyncrasies of speakers.
  - 3. The computer-implemented method of claim 1, further comprising:
    - forcing an automatic speech recognition (ASR) module to choose from a subset of one or more variants of a word when more than one pronunciation variants exists for a given word.
  - 4. The computer-implemented method of claim 1, further comprising:
    - automatically generating phonetic variant pronunciations for the pronunciation dictionary for any given word.
  - 5. The computer-implemented method of claim 4, wherein generating phonetic variant pronunciations is based on a surrounding linguistic context for each word.
  - 6. The computer-implemented method of claim 1, wherein the linguistic contexts are associated with foreign languages.
  - 7. The computer-implemented method of claim 1, further comprising:
    - tracking whether each lexical pronunciation is either machine generated or human entered.
  - 8. The computer-implemented method of claim 7, further comprising:
    - flagging machine generated lexical pronunciations for human inspection.
  - 9. The computer-implemented method of claim 1, further comprising adding default pronunciations to the pronunciation dictionary based on TTS letter-to-sound rules.
  - 15. The computing device of claim 9, wherein the linguistic contexts are associated with foreign languages.
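
The provenance tracking, review flagging, and default-pronunciation steps recited in claims 7 through 9 could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Source` enum, the `Entry` record, the `add_default` and `review_queue` helpers, and especially the toy one-letter-per-phone table `TOY_LTS` are hypothetical and stand in for real TTS letter-to-sound rules, which the claims do not detail.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    HUMAN = "human"      # pronunciation entered by a person
    MACHINE = "machine"  # pronunciation generated automatically


@dataclass
class Entry:
    phones: str
    source: Source
    needs_review: bool = False


# Toy letter-to-sound table (hypothetical); real rules are context dependent.
TOY_LTS = {"a": "ae", "b": "b", "c": "k", "d": "d", "n": "n", "t": "t"}


def letter_to_sound(word):
    """Map each known letter to one phone; unknown letters are skipped."""
    return " ".join(TOY_LTS[ch] for ch in word.lower() if ch in TOY_LTS)


def add_default(dictionary, word):
    """Add a machine-generated default only if the word has no entry yet."""
    if word not in dictionary:
        dictionary[word] = [
            Entry(letter_to_sound(word), Source.MACHINE, needs_review=True)
        ]
    return dictionary[word]


def review_queue(dictionary):
    """List the words that have at least one entry flagged for inspection."""
    return sorted(
        word for word, entries in dictionary.items()
        if any(entry.needs_review for entry in entries)
    )


# Hypothetical usage: "cat" was entered by hand, "bat" gets a default.
pron_dict = {"cat": [Entry("k ae t", Source.HUMAN)]}
add_default(pron_dict, "bat")
add_default(pron_dict, "cat")  # existing human entry is left untouched
```

Keeping the source on every entry means the review queue can be rebuilt at any time, and a human correction can simply replace the entry with one marked `Source.HUMAN`.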

10. A computing device for generating a database for a text-to-speech (TTS) voice, the computing device comprising:
- a module configured to control the processor to match every spoken word associated with a TTS voice database with a smallest set of possible pronunciations for each word, the smallest set being generated by:
  - automatically via the processor determining a dialect and linguistic context using linguistic rules;
  - empirically determining idiosyncratic speaker characteristics; and
  - determining a subject domain; and
- a module configured to control the processor to dynamically generate a pronunciation dictionary on a word-by-word basis using the smallest set.
- Dependent claims (11, 12, 13, 14, 16, 17):
  - 11. The computing device of claim 10, wherein the dynamically generated pronunciation dictionary accounts for reading errors and idiosyncrasies of speakers.
  - 12. The computing device of claim 10, further comprising:
    - a module configured to force an automatic speech recognition (ASR) module to choose from a subset of one or more variants of a word when more than one pronunciation variants exist for a given word.
  - 13. The computing device of claim 10, further comprising:
    - a module configured to automatically generate phonetic variant pronunciations for the pronunciation dictionary for any given word.
  - 14. The computing device of claim 13, wherein generating phonetic variant pronunciations is based on a surrounding linguistic context for each word.
  - 16. The computing device of claim 10, further comprising:
    - a module configured to track whether each lexical pronunciation is either machine generated or human entered; and
    - a module configured to flag machine generated lexical pronunciations for human inspection.
  - 17. The computing device of claim 10, further comprising a module configured to add default pronunciations to the pronunciation dictionary based on TTS letter-to-sound rules.

18. A tangible computer-readable storage medium storing instructions for controlling a computing device for generating a database for a text-to-speech (TTS) voice, the instructions comprising:
- matching via a processor every spoken word associated with a TTS voice database with a smallest set of possible pronunciations for each word, the smallest set being generated by:
  - automatically determining a dialect and linguistic context using linguistic rules;
  - empirically determining idiosyncratic speaker characteristics; and
  - determining a subject domain; and
- dynamically generating a pronunciation dictionary on a word-by-word basis using the smallest set.
- Dependent claims (19, 20):
  - 19. The computer-readable medium of claim 18, wherein the dynamically generated pronunciation dictionary accounts for reading errors and idiosyncrasies of speakers.
  - 20. The computer-readable medium of claim 19, the instructions further comprising:
    - forcing an automatic speech recognition (ASR) module to choose from a subset of one or more variants of a word when more than one pronunciation variants exists for a given word.

Current Assignee: Cerence Inc., Cerence Operating Company (Cerence Inc.)
Original Assignee: AT&T Intellectual Property II LP (AT&T, Inc.)
Inventors: Davis, Steven Lawrence; Fetters, Shane; Schulz, David Eugene; Gustafson, Beverly; Loney, Louise
Primary Examiner(s): Hudspeth, David R
Assistant Examiner(s): Albertalli, Brian Louis
Application Number: US11/235,817
Time in Patent Office: 1,533 Days
Field of Search: None
US Class Current: 704/266
CPC Class Codes:
G06F 40/242   Dictionaries
G10L 13/08   Text analysis or generation...
G10L 15/187   Phonemic context, e.g. pron...
