TECHNIQUE FOR TRAINING A PHONETIC DECISION TREE WITH LIMITED PHONETIC EXCEPTIONAL TERMS

US 20080319753A1
Filed: 06/25/2007
Published: 12/25/2008
Est. Priority Date: 06/25/2007
Status: Active Grant

8 Assignments

0 Petitions

Abstract

The present invention discloses a method for training an exception-limited phonetic decision tree. An initial subset of data can be selected and used to create an initial phonetic decision tree. Additional terms can then be incorporated into the subset. The enlarged subset can be used to evaluate the phonetic decision tree, with each result categorized as either correctly or incorrectly phonetized. An exception-limited phonetic decision tree can be generated from the set of correctly phonetized terms. If the termination conditions for the method are not satisfied, the steps of the method can be repeated.

231 Citations

20 Claims

1. A semi-automated method for generating a phonetic decision tree with limited phonetic exceptions for a text-to-speech system comprising:
- selecting an initial subset of a set of input data;
  
  creating an initial phonetic decision tree from the selected subset;
  
  incorporating a predetermined set of terms from the input data to the selected subset;
  
  testing the phonetic decision tree with the increased subset, wherein each term of the subset is phonetized using the phonetic decision tree;
  
  categorizing a result of the testing step into a set of correctly phonetized terms and a set of incorrectly phonetized terms;
  
  generating an exception-limited phonetic decision tree with the set of correctly phonetized terms;
  
  determining if one or more termination conditions are satisfied; and
  
  when the one or more termination conditions are unsatisfactorily met, automatically repeating the incorporating, testing, categorizing, generating, and determining steps.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The method of claim 1, wherein the categorizing step further comprises:
    - generating a speech output corresponding to a term in the increased subset;
      
      comparing the generated speech output against a standard pronunciation of the term;
      
      when the generated speech output is equivalent to the standard pronunciation, classifying the term as correctly phonetized; and
      
      when the generated speech output is unequal to the standard pronunciation, classifying the term as incorrectly phonetized.
  - 3. The method of claim 2, wherein the steps of claim 2 are repeated for each term contained within the subset.
  - 4. The method of claim 1, wherein the predetermined set of terms used in the incorporating step is designated within a training interface.
  - 5. The method of claim 1, wherein the steps of claim 1 are performed in a development environment of a text-to-speech (TTS) system.
  - 6. The method of claim 1, wherein the exception-limited phonetic tree is transferred to a runtime environment of the TTS system.
  - 7. The method of claim 1, wherein the testing and categorizing steps are performed by a training engine contained within a development environment of a TTS system.
  - 8. The method of claim 1, further comprising:
    - creating an exception dictionary from the set of incorrectly phonetized terms, wherein the exception dictionary is used by a speech synthesis engine in a runtime environment of a TTS system.
  - 9. The method of claim 8, wherein the speech synthesis engine utilizes the exception dictionary to phonetize words identified as containing a phonetic exception within an input text string.
  - 10. The method of claim 1, further comprising prior to executing the repeating step:
    - removing the terms contained within the set of incorrectly phonetized words from the subset.
  - 11. The method of claim 1, wherein said steps of claim 1 are performed by at least one machine in accordance with at least one computer program stored in a computer readable medium, said computer program having a plurality of code sections that are executable by the at least one machine.
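
The loop recited in claims 1, 8, and 10 can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the "tree" is a toy majority-phoneme-per-letter table standing in for a real phonetic decision tree, pronunciations are assumed to be aligned one symbol per letter, and the names `train_tree`, `phonetize`, `train_exception_limited_tree`, `initial_size`, and `batch_size` are hypothetical. The termination condition used here (input data exhausted) is also an assumption; the claims leave the conditions open.

```python
from collections import Counter, defaultdict


def train_tree(lexicon):
    """Toy stand-in for a phonetic decision tree: majority phoneme per letter."""
    votes = defaultdict(Counter)
    for word, phones in lexicon.items():
        for letter, phone in zip(word, phones):
            votes[letter][phone] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in votes.items()}


def phonetize(tree, word):
    """Convert a word letter by letter; unseen letters map to '?'."""
    return tuple(tree.get(letter, "?") for letter in word)


def train_exception_limited_tree(lexicon, initial_size, batch_size):
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    words = list(lexicon)
    subset = {w: lexicon[w] for w in words[:initial_size]}  # selecting
    pending = words[initial_size:]
    tree = train_tree(subset)                               # creating
    exceptions = {}
    while pending:  # assumed termination condition: input data exhausted
        batch, pending = pending[:batch_size], pending[batch_size:]
        subset.update({w: lexicon[w] for w in batch})       # incorporating
        correct, incorrect = {}, {}
        for word, standard in subset.items():               # testing, categorizing
            bucket = correct if phonetize(tree, word) == standard else incorrect
            bucket[word] = standard
        tree = train_tree(correct)                          # generating
        exceptions.update(incorrect)                        # claim 8: exception dictionary
        subset = correct                                    # claim 10: drop incorrect terms
    return tree, exceptions


# Hypothetical demo data; "_" marks a silent letter.
LEXICON = {
    "cat": ("k", "a", "t"), "top": ("t", "o", "p"), "cop": ("k", "o", "p"),
    "pat": ("p", "a", "t"), "ace": ("e", "s", "_"), "act": ("a", "k", "t"),
    "tap": ("t", "a", "p"),
}
tree, exceptions = train_exception_limited_tree(LEXICON, initial_size=2, batch_size=2)
```

With this demo data, the irregular word "ace" ends up in `exceptions` while the remaining words train the final table. A real system would replace the toy table with an actual decision-tree learner over letter contexts.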

12. A system for generating a phonetic decision tree with limited exceptions for text-to-speech processing comprising:
- a training data set containing terms for evaluating a phonetic decision tree;
  
  a training engine configured to evaluate the phonetic decision tree using the training data set and a set of standard pronunciations, wherein the training engine categorizes the training data set into a set of correctly phonetized terms and a set of incorrectly phonetized terms; and
  
  a phonetic tree generation engine configured to create an exception-limited phonetic decision tree from the set of correctly phonetized terms.
- Dependent Claims (13, 14)
  - 13. The system of claim 12, wherein the training engine further comprises:
    - a training interface configured to provide user-configuration of the training data set and one or more termination conditions.
  - 14. The system of claim 12, further comprising:
    - an exception dictionary containing the set of incorrectly phonetized terms.
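
One way to picture the division of labor among the components in claims 12 and 14 is the sketch below. It is an assumption-laden illustration rather than the claimed system: a "tree" is modeled as any callable that maps a term to a pronunciation, and the class, field, and function names (`TrainingEngine`, `TreeGenerationEngine`, `build`, `exception_dictionary`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Pronunciation = Tuple[str, ...]
Tree = Callable[[str], Pronunciation]  # assumption: a tree is anything that phonetizes a term


@dataclass
class TrainingEngine:
    """Evaluates a tree against a set of standard pronunciations."""
    standard_pronunciations: Dict[str, Pronunciation]

    def categorize(self, tree: Tree, terms: Iterable[str]) -> Tuple[List[str], List[str]]:
        correct, incorrect = [], []
        for term in terms:
            ok = tree(term) == self.standard_pronunciations[term]
            (correct if ok else incorrect).append(term)
        return correct, incorrect


@dataclass
class TreeGenerationEngine:
    """Builds a new tree from the correctly phonetized terms only."""
    build: Callable[[Dict[str, Pronunciation]], Tree]  # hypothetical tree-building routine

    def generate(self, engine: TrainingEngine, correct: List[str]) -> Tree:
        return self.build({t: engine.standard_pronunciations[t] for t in correct})


def exception_dictionary(engine: TrainingEngine, incorrect: List[str]) -> Dict[str, Pronunciation]:
    """Claim 14: the incorrectly phonetized terms, stored with their standard pronunciations."""
    return {t: engine.standard_pronunciations[t] for t in incorrect}
```

Passing the tree and the build routine in as callables keeps the evaluation logic independent of how the tree itself is represented.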

15. A method for creating a phonetic tree for speech synthesis comprising:
- generating an initial phonetic tree from a training data set of words and corresponding word pronunciations;
  
  converting each word in the data set using the phonetic tree;
  
  comparing a text-to-speech converted word against a corresponding word pronunciation from the data set;
  
  removing from the training data set those words that were not correctly text-to-speech converted using the phonetic tree; and
  
  creating a new phonetic tree using the modified training data set resulting from the removing step, wherein the new phonetic tree is at least one of an intermediate tree used to produce a production tree and a production tree, wherein a production tree is a phonetic tree used by a speech synthesis engine to generate speech output from text input in a runtime environment.
- Dependent Claims (16, 17, 18, 19, 20)
  - 16. The method of claim 15, further comprising:
    - converting each word in the modified training data set using the new phonetic tree;
      
      determining whether a termination condition has been reached, wherein the termination condition is based at least in part upon a number of words that were incorrectly text-to-speech converted by the new phonetic tree;
      
      when the termination condition has been reached, the new phonetic tree is a production tree; and
      
      when the termination condition has not been reached, the comparing and removing steps are repeated to generate a different phonetic tree that is also tested by the determining step and wherein the steps of claim 16 are repeated until a production tree is created.
  - 17. The method of claim 15, further comprising:
    - establishing a frequency list of words in a language sorted by frequency of use, wherein the training set is created from N percentage of words in the frequency list, wherein N is a configurable percentage.
  - 18. The method of claim 15, further comprising:
    - creating an exception dictionary of words that is used at runtime by the speech synthesis engine in conjunction with the production tree.
  - 19. The method of claim 18, said creating step further comprising:
    - utilizing a set of words removed from the training data set by the removing step when creating the exception dictionary.
  - 20. The method of claim 15, wherein said steps of claim 15 are steps performed automatically by at least one machine in accordance with at least one computer program having a plurality of code sections that are executable by the at least one machine, said at least one computer program being stored in a machine readable medium.
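
Claims 15 through 19 describe a remove-and-retrain loop, which might look something like the sketch below. Everything here is illustrative: the letter-to-phoneme table is a toy substitute for a phonetic tree, pronunciations are assumed to be aligned one symbol per letter, and the names `build_tree`, `convert`, `select_training_set`, `build_production_tree`, `speak`, `n_percent`, and `max_errors` are hypothetical.

```python
from collections import Counter, defaultdict


def build_tree(training):
    """Toy stand-in for a phonetic tree: majority phoneme per letter."""
    votes = defaultdict(Counter)
    for word, phones in training.items():
        for letter, phone in zip(word, phones):
            votes[letter][phone] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in votes.items()}


def convert(tree, word):
    return tuple(tree.get(letter, "?") for letter in word)


def select_training_set(frequency_list, pronunciations, n_percent):
    """Claim 17: take the top N percent of a frequency-sorted word list."""
    count = len(frequency_list) * n_percent // 100
    return {w: pronunciations[w] for w in frequency_list[:count]}


def build_production_tree(training, max_errors=0):
    """Claims 15-16: retrain after removing misconverted words until few enough remain."""
    training = dict(training)  # work on a copy
    removed = {}
    tree = build_tree(training)
    while True:
        wrong = [w for w, ref in training.items() if convert(tree, w) != ref]
        if len(wrong) <= max_errors:  # assumed form of the termination condition
            return tree, removed
        for w in wrong:
            removed[w] = training.pop(w)  # claim 19: material for the exception dictionary
        tree = build_tree(training)


def speak(word, tree, exception_dictionary):
    """Claim 18: consult the exception dictionary first, then the production tree."""
    if word in exception_dictionary:
        return exception_dictionary[word]
    return convert(tree, word)


# Hypothetical demo data, most frequent word first; "_" marks a silent letter.
FREQUENCY_LIST = ["cat", "top", "cop", "pat", "ace", "act", "tap", "cap", "pot", "opt"]
PRONUNCIATIONS = {
    "cat": ("k", "a", "t"), "top": ("t", "o", "p"), "cop": ("k", "o", "p"),
    "pat": ("p", "a", "t"), "ace": ("e", "s", "_"), "act": ("a", "k", "t"),
    "tap": ("t", "a", "p"), "cap": ("k", "a", "p"), "pot": ("p", "o", "t"),
    "opt": ("o", "p", "t"),
}
training_set = select_training_set(FREQUENCY_LIST, PRONUNCIATIONS, n_percent=70)
production_tree, exception_dict = build_production_tree(training_set)
```

Each pass removes at least one word whenever the error count exceeds `max_errors`, so the loop always terminates. With `max_errors` above zero, the returned tree may still misconvert a few words that are not in the exception dictionary.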

Current Assignee: Cerence Operating Company (Cerence Inc.)
Original Assignee: International Business Machines Corporation
Inventors: HANCOCK, Steven M.
Granted Patent: US 8,027,834 B2
US Class Current: 704/260
CPC Class Codes: G10L 13/04 Details of speech synthesis...
