HMM-based text-to-phoneme parser and method for training same

US 20030088416A1
Filed: 11/06/2001
Published: 05/08/2003
Est. Priority Date: 11/06/2001
Status: Abandoned Application

Abstract

An HMM-based text-to-phoneme parser uses probability information within a probability database to generate one or more phoneme strings for a written input word. Techniques for training the text-to-phoneme parser are provided.

35 Claims

1. A method for training a text-to-phoneme parser system, comprising:
- generating first information based on pronunciations within a phonetic dictionary, said first information identifying a plurality of potential diphones;
  
  pruning said plurality of potential diphones based on frequency of occurrence information to produce pruned diphones;
  
  forming an extended set of phonemes that includes said pruned diphones as legal phonemes; and
  
  generating second information, based on said extended set of phonemes, for use in performing text-to-phoneme parsing.
- - 2. The method of claim 1, wherein:
    - said first information includes diphone emission information.
  - 3. The method of claim 1, wherein:
    - said first information includes phoneme emission information.
  - 4. The method of claim 1, wherein:
    - generating first information includes performing supervised segmentation of words within said phonetic dictionary.
  - 5. The method of claim 4, wherein:
    - performing supervised segmentation includes performing a Viterbi search to identify an optimal segmentation for a first word based on a set of phonemes identified for said first word within said phonetic dictionary.
  - 6. The method of claim 1, wherein:
    - generating first information includes performing cycles of supervised segmentation and probability generation for words within said phonetic dictionary.
  - 7. The method of claim 1, wherein:
    - pruning said plurality of potential diphones includes selecting diphones from said plurality of potential diphones that have a highest number of occurrences.
  - 8. The method of claim 1, wherein:
    - said phonetic dictionary identifies an initial set of phonemes; and
      
      forming an extended set of phonemes includes adding said pruned diphones to said initial set of phonemes.
  - 9. The method of claim 1, wherein:
    - generating second information includes generating phoneme emission probabilities for phonemes within said extended set of phonemes.
  - 10. The method of claim 1, wherein:
    - generating second information includes generating phoneme transition probabilities for phonemes within said extended set of phonemes.
  - 11. The method of claim 1, wherein:
    - generating second information includes generating a probability that a specific letter string will be induced given a present phoneme and a previous phoneme.
  - 12. The method of claim 1, wherein:
    - generating second information includes generating a probability that a specific phoneme will be induced given a previous phoneme and a letter string emitted by said previous phoneme.
  - 13. The method of claim 1, wherein:
    - generating second information includes performing supervised segmentation of words within said phonetic dictionary.
  - 14. The method of claim 1, wherein:
    - generating second information includes performing cycles of supervised segmentation and probability generation for words within said phonetic dictionary.
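
The pruning and set-extension steps of claims 1, 7, and 8 can be illustrated with a minimal sketch. It assumes, as a simplification, that every adjacent phoneme pair in a dictionary pronunciation counts as a potential diphone; the claims derive this information from supervised segmentation instead. The function names, the toy dictionary, the phoneme symbols, and the `+` joiner are all hypothetical.

```python
from collections import Counter


def count_adjacent_pairs(dictionary):
    """Count every adjacent phoneme pair (a potential diphone) in the pronunciations."""
    counts = Counter()
    for phonemes in dictionary.values():
        for left, right in zip(phonemes, phonemes[1:]):
            counts[(left, right)] += 1
    return counts


def prune_diphones(counts, keep):
    """Keep only the `keep` pairs with the highest number of occurrences."""
    return [pair for pair, _ in counts.most_common(keep)]


def extend_phoneme_set(dictionary, pruned):
    """Add the pruned diphones to the initial phoneme set as legal phonemes."""
    initial = {p for phonemes in dictionary.values() for p in phonemes}
    return initial | {left + "+" + right for left, right in pruned}


# Toy phonetic dictionary (hypothetical entries, illustrative symbols only).
toy = {
    "box": ["b", "aa", "k", "s"],
    "six": ["s", "ih", "k", "s"],
    "tax": ["t", "ae", "k", "s"],
    "bit": ["b", "ih", "t"],
}

counts = count_adjacent_pairs(toy)
pruned = prune_diphones(counts, keep=1)
extended = extend_phoneme_set(toy, pruned)
```

In this toy data the pair `("k", "s")` occurs three times and every other pair once, so it is the only diphone that survives pruning and is added to the seven initial phonemes.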

15. A method for use in training a text-to-phoneme parser system, comprising:
- segmenting words based on known word pronunciations to generate segmentation results;
  
  generating probability information using said segmentation results, said probability information including a plurality of probability values;
  
  identifying probability values within said probability information that are below a first threshold value; and
  
  changing said identified probability values to a predetermined value.
- - 16. The method of claim 15, wherein:
    - said predetermined value is said first threshold value.
  - 17. The method of claim 15, further comprising:
    - re-segmenting said words, after changing said identified probability values, based on said probability information to generate new segmentation results.
  - 18. The method of claim 17, further comprising:
    - generating new probability information using said new segmentation results, said new probability information including a plurality of probability values;
      
      detecting probability values within said new probability information that are below a second threshold value; and
      
      changing said detected probability values to a second predetermined value.
  - 19. The method of claim 18, wherein:
    - said second threshold value is less than said first threshold value.
  - 20. The method of claim 15, wherein:
    - said probability information includes phoneme emission probabilities.
  - 21. The method of claim 15, wherein:
    - said probability information includes a probability that a specific letter string will be induced given a present phoneme and a previous phoneme.
  - 22. The method of claim 15, wherein:
    - said probability information includes diphone emission probabilities, said diphone emission probabilities including a probability that a specific letter will be emitted by a given phoneme pair.
  - 23. The method of claim 15, wherein:
    - said probability information includes phoneme transition probabilities.
  - 24. The method of claim 23, wherein:
    - said phoneme transition probabilities include a probability that a specific phoneme will be induced given a previous phoneme.
  - 25. The method of claim 23, wherein:
    - said phoneme transition probabilities include a probability that a specific phoneme will be induced given a previous phoneme and a letter string emitted by said previous phoneme.
  - 26. The method of claim 23, wherein:
    - segmenting words includes segmenting words based on corresponding pronunciations within a phonetic dictionary.
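
The thresholding step of claims 15, 16, and 19 can be sketched as follows. This is an illustration under assumptions, not the applicant's implementation: the function name, the table layout (keys are `(phoneme, letter string)` pairs), and the numeric values are hypothetical, and the sketch does not renormalize after flooring because the claims do not say whether that is done.

```python
def floor_probabilities(probs, threshold, floor_value=None):
    """Return a copy of `probs` with every value below `threshold` replaced.

    When `floor_value` is omitted, the threshold itself is used as the
    predetermined value (the variant described in claim 16).
    """
    if floor_value is None:
        floor_value = threshold
    return {key: (floor_value if p < threshold else p) for key, p in probs.items()}


# Hypothetical emission table: P(letter string | phoneme), keyed by (phoneme, letters).
emission = {
    ("k", "c"): 0.62,
    ("k", "k"): 0.37,
    ("k", "q"): 0.0098,
    ("k", "x"): 0.0002,
}

# First cycle: a relatively high threshold.
first_pass = floor_probabilities(emission, threshold=0.01)

# Later cycle: a lower threshold (claim 19). In the claimed method this would
# run on re-estimated probabilities; the same toy table is reused here.
second_pass = floor_probabilities(emission, threshold=0.001)
```

With the higher threshold both rare entries are raised to 0.01; with the lower threshold only the `("k", "x")` entry is changed.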

27. A method for use in training a text-to-phoneme parser system, comprising:
- segmenting words based on known word pronunciations to generate segmentation results; and
  
  generating probability information using said segmentation results, said probability information including generalized transition probability information, said generalized transition probability information including a probability that a specific phoneme will be induced given a previous phoneme and a letter string emitted by said previous phoneme.
- - 28. The method of claim 27, wherein:
    - said probability information includes generalized emission probability information, said generalized emission probability information including a probability that a specific letter string will be induced given a present phoneme and a previous phoneme.
  - 29. The method of claim 27, wherein:
    - segmenting words includes segmenting words based on corresponding pronunciations within a phonetic dictionary.
  - 30. The method of claim 27, wherein:
    - segmenting words includes identifying a best path through a Viterbi search table for a first word.
  - 31. The method of claim 27, further comprising:
    - repeating segmenting words and generating probability information until a predetermined condition has been satisfied.
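
The generalized transition and emission probabilities of claims 27 and 28 can be estimated from segmentation results by relative-frequency counting. The sketch below assumes each segmentation is a list of `(phoneme, letter string)` pairs and uses a hypothetical `<s>` start symbol; the representation, the function name, and the toy segmentations are assumptions made for illustration.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical word-start marker


def estimate(segmentations):
    """Estimate generalized transition and emission tables from segmentations.

    transition[(prev_phoneme, prev_letters)][phoneme]
        ~ P(phoneme | previous phoneme, letter string it emitted)
    emission[(prev_phoneme, phoneme)][letters]
        ~ P(letter string | present phoneme, previous phoneme)
    """
    trans_counts = defaultdict(Counter)
    emit_counts = defaultdict(Counter)
    for segmentation in segmentations:
        prev_ph, prev_letters = START, ""
        for ph, letters in segmentation:
            trans_counts[(prev_ph, prev_letters)][ph] += 1
            emit_counts[(prev_ph, ph)][letters] += 1
            prev_ph, prev_letters = ph, letters

    def normalize(table):
        # Convert each context's counts into relative frequencies.
        return {
            ctx: {key: n / sum(c.values()) for key, n in c.items()}
            for ctx, c in table.items()
        }

    return normalize(trans_counts), normalize(emit_counts)


# Hypothetical segmentation results for "cat", "kit", "cell", "cot".
segs = [
    [("k", "c"), ("ae", "a"), ("t", "t")],
    [("k", "k"), ("ih", "i"), ("t", "t")],
    [("s", "c"), ("eh", "e"), ("l", "ll")],
    [("k", "c"), ("aa", "o"), ("t", "t")],
]

transition, emission = estimate(segs)
```

Conditioning the transition on the emitted letters lets the table distinguish, for example, what follows a `k` written as "c" from what follows a `k` written as "k".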

32. A text-to-phoneme parsing system, comprising:
- a probability database including generalized transition probability information, said generalized transition probability information including a probability that a specific phoneme will occur given a previous phoneme and a letter string emitted by said previous phoneme; and
  
  a text-to-phoneme parser to generate at least one phoneme string for a written input word based on information within said probability database.
- - 33. The text-to-phoneme parsing system of claim 32, wherein:
    - said probability database includes generalized emission probability information, said generalized emission probability information including a probability that a specific letter string will be induced given a present phoneme and a previous phoneme.
  - 34. The text-to-phoneme parsing system of claim 32, wherein:
    - said probability database includes probability information that was generated based upon word pronunciations within a phonetic dictionary.
  - 35. The text-to-phoneme parsing system of claim 32, wherein:
    - said text-to-phoneme parser generates the N best phoneme strings for said written input word, where N is an integer greater than 1.
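
An N-best parser in the spirit of claims 32 and 35 might look like the sketch below. It is a toy under stated assumptions: it enumerates every segmentation exhaustively instead of using an HMM search table, it uses the generalized transition probability of claim 32 together with a plain `P(letters | phoneme)` emission table (not the generalized emission of claim 33), and all names and probability values are hypothetical.

```python
import heapq

START = "<s>"  # hypothetical word-start marker


def n_best(word, trans, emit, n=2, max_chunk=2):
    """Return up to `n` (probability, phoneme list) pairs, best first."""
    results = []

    def expand(pos, prev_ph, prev_letters, prob, phonemes):
        if pos == len(word):
            results.append((prob, phonemes))
            return
        # Candidate next phonemes depend on the previous phoneme and its letters.
        for ph, p_trans in trans.get((prev_ph, prev_letters), {}).items():
            for size in range(1, max_chunk + 1):
                letters = word[pos:pos + size]
                if len(letters) < size:
                    break  # ran past the end of the word
                p_emit = emit.get(ph, {}).get(letters, 0.0)
                if p_emit > 0.0:
                    expand(pos + size, ph, letters,
                           prob * p_trans * p_emit, phonemes + [ph])

    expand(0, START, "", 1.0, [])
    return heapq.nlargest(n, results, key=lambda r: r[0])


# Hypothetical probability database for the input word "cat".
trans = {
    (START, ""): {"k": 0.7, "s": 0.3},
    ("k", "c"): {"ae": 0.6, "ey": 0.4},
    ("s", "c"): {"ae": 0.5, "ey": 0.5},
    ("ae", "a"): {"t": 1.0},
    ("ey", "a"): {"t": 1.0},
}
emit = {
    "k": {"c": 0.6, "k": 0.4},
    "s": {"c": 0.2, "s": 0.8},
    "ae": {"a": 0.9},
    "ey": {"a": 0.5},
    "t": {"t": 1.0},
}

best = n_best("cat", trans, emit, n=2)
```

Exhaustive enumeration grows exponentially with word length, so a practical system would presumably use a dynamic-programming search that keeps several hypotheses per state.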

Current Assignee: Intel Corporation
Original Assignee: DSPC Technologies Limited
Inventors: Griniasty, Meir
Application Number: US10/013,239
Publication Number: US 20030088416A1
US Class Current: 704/256
CPC Class Codes:
G10L 13/08 Text analysis or generation...
G10L 15/144 Training of HMMs
