Method for generating spelling-to-pronunciation decision tree

US 6,230,131 B1
Filed: 04/29/1998
Issued: 05/08/2001
Est. Priority Date: 04/29/1998
Status: Expired due to Fees

Abstract

Decision trees are used to store a series of yes-no questions that can be used to convert spelled-word letter sequences into pronunciations. Letter-only trees, having internal nodes populated with questions about letters in the input sequence, generate one or more pronunciations based on probability data stored in the leaf nodes of the tree. The pronunciations may then be improved by processing them using mixed trees which are populated with questions about letters in the sequence and also questions about phonemes associated with those letters. The mixed tree screens out pronunciations that would not occur in natural speech, thereby greatly improving the results of the letter-to-pronunciation transformation.
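
The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `letter_only_probs` and `mixed_prob` are hypothetical stand-ins for lookups in trained letter-only and mixed trees, and the toy probabilities and phoneme symbols are invented for the example.

```python
from itertools import product
from math import prod

def letter_only_probs(word, i):
    """Hypothetical letter-only tree lookup: P(phoneme | letter context)."""
    table = {  # invented toy values, one entry per letter
        "b": {"b": 1.0},
        "o": {"ow": 0.6, "aa": 0.4},
        "x": {"ks": 0.7, "z": 0.3},
    }
    return table[word[i]]

def mixed_prob(word, phonemes, i):
    """Hypothetical mixed-tree lookup: may also ask about neighboring phonemes."""
    # Toy phoneme question: is the phoneme after this 'o' equal to 'ks'?
    if word[i] == "o" and i + 1 < len(phonemes) and phonemes[i + 1] == "ks":
        return 0.9 if phonemes[i] == "aa" else 0.1
    return letter_only_probs(word, i).get(phonemes[i], 0.0)

def n_best(word, n):
    """Stage 1: enumerate candidates from the letter-only probabilities."""
    per_letter = [letter_only_probs(word, i).items() for i in range(len(word))]
    candidates = []
    for combo in product(*per_letter):
        phonemes = tuple(p for p, _ in combo)
        score = prod(q for _, q in combo)
        candidates.append((score, phonemes))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [phonemes for _, phonemes in candidates[:n]]

def rescore(word, candidates):
    """Stage 2: rescore each candidate with the mixed lookup and keep the best."""
    scored = [
        (prod(mixed_prob(word, ph, i) for i in range(len(word))), ph)
        for ph in candidates
    ]
    return max(scored, key=lambda c: c[0])[1]

candidates = n_best("box", 3)
best = rescore("box", candidates)
```

In this toy setup the letter-only stage ranks `("b", "ow", "ks")` first, while the mixed stage, which can see the neighboring phoneme, prefers `("b", "aa", "ks")`.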

14 Claims

1. A memory for storing spelling-to-pronunciation data for use in analyzing an input sequence, comprising:
- a decision tree data structure stored in said memory that defines a plurality of internal nodes and a plurality of leaf nodes, said internal nodes adapted for storing yes-no questions and said leaf nodes adapted for storing probability data;
  
  a first plurality of said internal nodes being populated with letter questions about a given letter in an input sequence and its neighboring letters in said input sequence;
  
  a second plurality of said internal nodes being populated with phoneme questions about a given phoneme in said input sequence and its neighboring phonemes in said input sequence;
  
  said leaf nodes being populated with probability data that associates said given letter with a plurality of phoneme pronunciations such that said phoneme questions ultimately result in said phoneme pronunciations.
  - 2. The memory of claim 1 further comprising a plurality of said decision tree data structures each being associated with a different one of a plurality of letters.
  - 3. The memory of claim 1 wherein said internal nodes are populated based on a predetermined set of training data that includes a plurality of spelled words with associated phoneme pronunciations.
  - 4. The memory of claim 1 wherein said leaf nodes are populated based on a predetermined set of training data that includes a plurality of spelled words with associated phoneme pronunciations.
  - 5. The memory of claim 1 further comprising a dictionary for storing relations between phoneme sequences and words, said dictionary being adapted for coupling to a speech recognizer, and wherein said dictionary is populated at least in part based upon said decision tree.
  - 6. A speech synthesizer incorporating the memory of claim 1 and adapted to receive as input a spelled word defined by a sequence of letters, and wherein said speech synthesizer uses said decision tree to convert at least a portion of said sequence of letters into a phonetic transcription for speech synthesis.
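
The data structure recited in claims 1 and 2 could be laid out roughly as below. This is only a sketch: the class names (`Question`, `Internal`, `Leaf`), the `lookup` helper, the `"-"` marker for a silent letter, and all probabilities are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Sequence, Union

@dataclass(frozen=True)
class Question:
    kind: str                # "letter" or "phoneme"
    offset: int              # position relative to the given letter
    members: FrozenSet[str]  # symbols that yield a "yes" answer

    def ask(self, letters: Sequence[str], phonemes: Sequence[str], i: int) -> bool:
        seq = letters if self.kind == "letter" else phonemes
        j = i + self.offset
        return 0 <= j < len(seq) and seq[j] in self.members

@dataclass
class Leaf:
    probs: Dict[str, float]  # probability data: phoneme -> probability

@dataclass
class Internal:
    question: Question       # yes-no question stored at this node
    yes: "Node"
    no: "Node"

Node = Union[Leaf, Internal]

def lookup(node: Node, letters, phonemes, i) -> Dict[str, float]:
    """Walk from the root to a leaf by answering each yes-no question."""
    while isinstance(node, Internal):
        node = node.yes if node.question.ask(letters, phonemes, i) else node.no
    return node.probs

# One tree per letter (claim 2); only a toy tree for "c" is shown.
trees = {
    "c": Internal(
        Question("letter", +1, frozenset("eiy")),       # letter question
        yes=Internal(
            Question("phoneme", -1, frozenset({"s"})),  # phoneme question
            yes=Leaf({"-": 0.8, "s": 0.2}),
            no=Leaf({"s": 0.9, "k": 0.1}),
        ),
        no=Leaf({"k": 0.95, "s": 0.05}),
    )
}

probs_cat = lookup(trees["c"], "cat", ["k", "ae", "t"], 0)
probs_scene = lookup(trees["c"], "scene", ["s", "-", "iy", "n", "-"], 1)
```

Keeping the question kind inside each node lets a single tree mix letter questions and phoneme questions, which is the property claim 1 requires of the first and second pluralities of internal nodes.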

7. A method for processing spelling-to-pronunciation data, comprising the steps of:
- providing a first set of yes-no questions about letters in an input sequence and their relationship to neighboring letters in said input sequence;
  
  providing a second set of yes-no questions about phonemes in said input sequence and their relationship to neighboring phonemes in said input sequence;
  
  providing a corpus of training data representing a plurality of different sets of pairs each pair containing a letter sequence and a phoneme sequence, said letter sequence selected from an alphabet;
  
  using said first and second sets and said training data to generate decision trees for at least a portion of said alphabet, said decision trees each having a plurality of internal nodes and a plurality of leaf nodes;
  
  populating said internal nodes with questions selected from said first and second sets; and
  
  populating said leaf nodes with the probability data that associates said portion of said alphabet with a plurality of phoneme pronunciations based on said training data, such that said phoneme pronunciations result from internal nodes populated with questions selected from both said first and second sets.
  - 8. The method of claim 7 further comprising providing said corpus of training data as aligned letter sequence-phoneme sequence pairs.
  - 9. The method of claim 7 wherein said step of providing a corpus of training data further comprises providing a plurality of input sequences containing sequences of phonemes representing pronunciation of words formed by said sequences of letters;
    - and aligning selected ones of said phonemes with selected ones of said letters to define aligned letter-phoneme pairs.
  - 10. The method of claim 7 further comprising supplying an input string of letters with at least one associated phoneme pronunciation and using said decision trees to score said pronunciation based on said probability data.
  - 11. The method of claim 7 further comprising supplying an input string of letters with a plurality of associated phoneme pronunciations and using said decision trees to select one of said plurality of pronunciations based on said probability data.
  - 12. The method of claim 7 further comprising supplying an input string of letters representing a word with a plurality of associated phoneme pronunciations and using said decision trees to generate a phonetic transcription of said word based on said probability data.
  - 13. The method of claim 12 further comprising using said phonetic transcription to populate a dictionary associated with a speech recognizer.
  - 14. The method of claim 7 further comprising supplying an input string of letters representing a word with a plurality of associated phoneme pronunciations and using said decision trees to assign a numerical score to each one of said plurality of pronunciations.
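
One way to picture the tree-generation step of claim 7 is a greedy choice of the question that best separates the aligned training samples for a letter. The claims do not name a splitting criterion, so the sketch below assumes entropy reduction; the function names, the tuple encoding of questions, the `"-"` silent-letter marker, and the four training samples are all hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (bits) of a list of phoneme labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def answer(question, letters, phonemes, i):
    """Evaluate a yes-no question encoded as (kind, offset, members)."""
    kind, offset, members = question
    seq = letters if kind == "letter" else phonemes
    j = i + offset
    return 0 <= j < len(seq) and seq[j] in members

def best_question(samples, questions):
    """Pick the question with the largest entropy reduction.

    samples: list of (letters, aligned_phonemes, i); the label is phonemes[i].
    """
    labels = [ph[i] for _, ph, i in samples]
    base = entropy(labels)
    best, best_gain = None, 0.0
    for q in questions:
        yes = [ph[i] for le, ph, i in samples if answer(q, le, ph, i)]
        no = [ph[i] for le, ph, i in samples if not answer(q, le, ph, i)]
        if not yes or not no:
            continue  # this question does not split the samples
        remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        if base - remainder > best_gain:
            best, best_gain = q, base - remainder
    return best, best_gain

def leaf_probs(labels):
    """Probability data for a leaf: relative frequency of each phoneme."""
    total = len(labels)
    return {p: c / total for p, c in Counter(labels).items()}

# Aligned letter-phoneme pairs for the letter "c" (invented toy corpus).
samples = [
    ("cat", ["k", "ae", "t"], 0),
    ("city", ["s", "ih", "t", "iy"], 0),
    ("cot", ["k", "aa", "t"], 0),
    ("cell", ["s", "eh", "l", "-"], 0),
]
q_letter = ("letter", 1, frozenset("eiy"))       # from the first question set
q_phoneme = ("phoneme", 1, frozenset({"ae"}))    # from the second question set
chosen, gain = best_question(samples, [q_letter, q_phoneme])
```

Applying `best_question` recursively to the yes and no subsets, and calling `leaf_probs` when no question helps, would grow one tree per letter; the resulting leaf probabilities are what claims 10, 11, and 14 use to score or select candidate pronunciations.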

Current Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Junqua, Jean-Claude; Contolini, Matteo; Kuhn, Roland
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Armstrong, Angela
Application Number: US09/069,308
Time in Patent Office: 1,105 Days
Field of Search: 704/10, 704/243, 704/245, 704/254, 704/255, 704/266, 704/260, 704/267, 707/100, 707/102
US Class Current: 704/266
CPC Class Codes: G10L 13/08 Text analysis or generation...
