Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word

US 6,016,471 A
Filed: 04/29/1998
Issued: 01/18/2000
Est. Priority Date: 04/29/1998
Status: Expired due to Fees

1 Assignment

0 Petitions

Abstract

The mixed decision tree includes a network of yes-no questions about adjacent letters in a spelled word sequence and also about adjacent phonemes in the phoneme sequence corresponding to the spelled word sequence. Leaf nodes of the mixed decision tree provide information about which phonetic transcriptions are most probable. Using the mixed trees, scores are developed for each of a plurality of possible pronunciations, and these scores can be used to select the best pronunciation as well as to rank pronunciations in order of probability. The pronunciations generated by the system can be used in speech synthesis and speech recognition applications as well as lexicography applications.
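The scoring idea in the abstract can be illustrated with a minimal sketch. It is not taken from the patent specification: the class names (`Node`, `Leaf`), the toy trees, the probabilities, and the one-phoneme-per-letter alignment with `-` for a silent letter are all assumptions made for illustration. Each internal node asks a yes-no question about the letters or the phonemes around a position, each leaf holds a probability distribution over phonemes, and a candidate pronunciation is scored by summing the log probabilities found at the leaves.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

SILENT = "-"  # assumed placeholder for a letter that maps to no phoneme


@dataclass
class Leaf:
    probs: Dict[str, float]  # phoneme -> probability for the given letter


@dataclass
class Node:
    # Yes-no question over (letters, phonemes, position); hypothetical signature.
    question: Callable[[List[str], List[str], int], bool]
    yes: "Tree"
    no: "Tree"


Tree = Union[Node, Leaf]


def find_leaf(tree: Tree, letters: List[str], phonemes: List[str], i: int) -> Leaf:
    """Follow yes/no answers from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.question(letters, phonemes, i) else tree.no
    return tree


def score(trees: Dict[str, Tree], letters: List[str], phonemes: List[str],
          floor: float = 1e-6) -> float:
    """Log-probability of one candidate pronunciation (higher is better)."""
    total = 0.0
    for i, (letter, phoneme) in enumerate(zip(letters, phonemes)):
        leaf = find_leaf(trees[letter], letters, phonemes, i)
        total += math.log(leaf.probs.get(phoneme, floor))
    return total


# Toy trees with invented probabilities, one tree per letter of the alphabet used.
TOY_TREES: Dict[str, Tree] = {
    "a": Leaf({"ey": 0.6, "ae": 0.4}),
    # Letter question: is the next letter e, i or y?
    "c": Node(lambda L, P, i: i + 1 < len(L) and L[i + 1] in "eiy",
              Leaf({"s": 0.9, "k": 0.1}), Leaf({"k": 0.95, "s": 0.05})),
    # Phoneme question: was the previous phoneme "s"?
    "e": Node(lambda L, P, i: i > 0 and P[i - 1] == "s",
              Leaf({SILENT: 0.8, "iy": 0.2}), Leaf({SILENT: 0.5, "iy": 0.5})),
}

word = list("ace")
soft_c = score(TOY_TREES, word, ["ey", "s", SILENT])
hard_c = score(TOY_TREES, word, ["ey", "k", SILENT])
```

With these toy numbers the soft-c candidate scores higher than the hard-c candidate, so sorting candidates by `score` ranks them in order of probability, as the abstract describes.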

292 Citations

13 Claims

1. An apparatus for generating at least one phonetic pronunciation for an input sequence of letters selected from a predetermined alphabet, comprising:
- a memory for storing a plurality of letter-only decision trees corresponding to said alphabet, said letter-only decision trees having internal nodes representing yes-no questions about a given letter and its neighboring letters in a given sequence;
  
  said memory further storing a plurality of mixed decision trees corresponding to said alphabet, said mixed decision trees having a first plurality of internal nodes representing yes-no questions about a given letter and its neighboring letters in said given sequence and having a second plurality of internal nodes representing yes-no questions about a phoneme and its neighboring phonemes in said given sequence, said letter-only decision trees and said mixed decision trees further having leaf nodes representing probability data that associates said given letter with a plurality of phoneme pronunciations;
  
  a phoneme sequence generator coupled to said letter-only decision tree for processing an input sequence of letters and generating a first set of phonetic pronunciations corresponding to said input sequence of letters;
  
  a score estimator coupled to said mixed decision tree for processing said first set to generate a second set of scored phonetic pronunciations, the scored phonetic pronunciations representing at least one phonetic pronunciation of said input sequence.
  - 2. The apparatus of claim 1 wherein said second set comprises a plurality of pronunciations each with an associated score derived from said probability data and further comprising a pronunciation selector receptive of said second set and operable to select one pronunciation from said second set based on said associated score.
  - 3. The apparatus of claim 1 wherein said phoneme sequence generator produces a predetermined number of different pronunciations corresponding to a given input sequence.
  - 4. The apparatus of claim 1 wherein said phoneme sequence generator produces a predetermined number of different pronunciations corresponding to a given input sequence and representing the n-best pronunciations according to said probability data.
  - 5. The apparatus of claim 4 wherein said score estimator rescores said n-best pronunciations based on said mixed decision trees.
  - 6. The apparatus of claim 1 wherein said sequence generator constructs a matrix of possible phoneme combinations representing different pronunciations.
  - 7. The apparatus of claim 6 wherein sequence generator selects the n-best phoneme combinations from said matrix using dynamic programming.
  - 8. The apparatus of claim 6 wherein sequence generator selects the n-best phoneme combinations from said matrix by iterative substitution.
  - 9. The apparatus of claim 1 further comprising a speech recognition system having a pronunciation dictionary used for recognizer training and wherein at least a portion of said second set populates said dictionary to supply pronunciations for words based on their spelling.
  - 10. The apparatus of claim 1 further comprising a speech synthesis system receptive of at least a portion of said second set for generating an audible synthesized pronunciation of words based on their spelling.
  - 11. The apparatus of claim 10 wherein said speech synthesis system is incorporated into an e-mail reader.
  - 12. The apparatus of claim 10 wherein said speech synthesis system is incorporated into a dictionary for providing a list of possible pronunciations in order of probability.
  - 13. The apparatus of claim 1 further comprising a language learning system that displays a spelled word and analyzes a speaker's attempt at pronouncing that word using at least one of said letter-only decision tree and said mixed decision tree to tell the speaker how probable his or her pronunciation was for that word.
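
Claims 6 and 7 refer to a matrix of possible phoneme combinations and to selecting the n-best combinations by dynamic programming. The sketch below is one possible reading, not the procedure from the specification (which is not reproduced on this page). It assumes the matrix has one column per letter, that each column lists `(phoneme, probability)` pairs, and that column probabilities are independent; the function name `n_best` and the toy numbers are hypothetical. Under that independence assumption, keeping only the n best partial sequences after each column is enough to recover the n best full sequences.

```python
import heapq
import math
from typing import List, Tuple

Column = List[Tuple[str, float]]       # (phoneme, probability) pairs for one letter
Hypothesis = Tuple[float, List[str]]   # (log-probability, phoneme sequence)


def n_best(matrix: List[Column], n: int) -> List[Hypothesis]:
    """Return the n most probable phoneme sequences, best first."""
    hyps: List[Hypothesis] = [(0.0, [])]
    for column in matrix:
        # Extend every surviving partial sequence with every phoneme in the column.
        extended = [
            (logp + math.log(p), seq + [phoneme])
            for logp, seq in hyps
            for phoneme, p in column
        ]
        # Keep only the n best partial sequences before moving to the next letter.
        hyps = heapq.nlargest(n, extended, key=lambda h: h[0])
    return hyps


# Toy matrix for the letters a, c, e with invented probabilities.
toy_matrix: List[Column] = [
    [("ey", 0.6), ("ae", 0.4)],
    [("s", 0.7), ("k", 0.3)],
    [("-", 0.9), ("iy", 0.1)],
]

top3 = n_best(toy_matrix, 3)
top3_sequences = [seq for _, seq in top3]
```

In the terms of claim 5, a list like `top3` would then be handed to a rescoring step based on the mixed decision trees, which may reorder the candidates.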

Current Assignee
Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Original Assignee
Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors
Junqua, Jean-Claude, Contolini, Matteo, Kuhn, Roland
Primary Examiner(s)
Hudspeth, David R.
Assistant Examiner(s)
Smits, Talivaldis Ivars

Application Number
US09/067,764
Time in Patent Office
629 Days
Field of Search
704/266, 704/267, 704/270
US Class Current
704/266
CPC Class Codes
G10L 13/08 Text analysis or generation...
