Defining atom units between phone and syllable for TTS systems

US 20060155544A1
Filed: 01/11/2005
Published: 07/13/2006
Est. Priority Date: 01/11/2005
Status: Active Grant


Abstract

A method for identifying common multiphone units to add to a unit inventory for a text-to-speech generator is disclosed. The common multiphone units are units that are larger than a phone, but smaller than a syllable. The method slices each syllable into a plurality of slices. These slices are then sorted and the frequency of each slice is determined. Those slices whose frequencies exceed a threshold are added to the unit inventory. The remaining slices are decomposed according to a predetermined set of rules to determine if they contain slices that should be added to the unit inventory.
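The slicing and frequency-counting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The vowel set, the toy lexicon, the threshold of 1, and the function names `slice_syllable`, `count_slices`, and `split_by_threshold` are all assumptions made for illustration. The onset/nucleus/coda split follows the three-slice decomposition recited in the claims. Slices that exceed the threshold are treated as common here; the source does not say how a slice exactly at the threshold is handled.

```python
from collections import Counter

# Hypothetical vowel subset; a real system would use the target language's phone list.
VOWELS = {"aa", "ae", "ah", "eh", "ih", "iy", "ow", "uw"}


def slice_syllable(phones):
    """Split one syllable (a list of phones) into onset, nucleus, and coda slices."""
    first = next((i for i, p in enumerate(phones) if p in VOWELS), None)
    if first is None:
        return [tuple(phones)]  # no vowel found: keep the syllable as one slice
    last = first
    while last + 1 < len(phones) and phones[last + 1] in VOWELS:
        last += 1
    parts = (phones[:first], phones[first:last + 1], phones[last + 1:])
    return [tuple(p) for p in parts if p]  # drop empty onsets or codas


def count_slices(lexicon):
    """Count how often each slice occurs across every syllable in the lexicon."""
    counts = Counter()
    for syllables in lexicon.values():
        for syllable in syllables:
            counts.update(slice_syllable(syllable))
    return counts


def split_by_threshold(counts, threshold):
    """Return (common, non_common) slices, each sorted by descending frequency."""
    ranked = counts.most_common()
    common = [s for s, n in ranked if n > threshold]
    non_common = [s for s, n in ranked if n <= threshold]
    return common, non_common


# Toy lexicon (assumed data): word -> list of syllables, each a list of phones.
lexicon = {
    "street": [["s", "t", "r", "iy", "t"]],
    "string": [["s", "t", "r", "ih", "ng"]],
    "tree": [["t", "r", "iy"]],
}
counts = count_slices(lexicon)
common, non_common = split_by_threshold(counts, threshold=1)
```

With this toy data, the onset `("s", "t", "r")` occurs in two syllables and would be kept as a multiphone unit, while `("t", "r")` occurs once and would be passed on for rule-based decomposition.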

19 Claims

1. A method of developing a unit inventory for use by a text to speech system, comprising:
- identifying a list of phones for a target language;
- receiving a lexicon containing phonetic transcriptions of a plurality of words having a plurality of syllables;
- identifying a set of common multi-phone atom units for the lexicon; and
- adding the set of common multi-phone atom units to the unit inventory for the target language.

2. The method of claim 1 wherein identifying a set of common multi-phone atom units comprises:
- decomposing each syllable into a plurality of slices;
- identifying non-common slices within the plurality of slices; and
- decomposing the non-common slices according to a predetermined set of rules.

3. The method of claim 2 wherein identifying the non-common slices within the plurality of slices comprises:
- sorting the plurality of slices in order of frequency of occurrence;
- selecting as the non-common slices those slices in the plurality of slices having a frequency of occurrence in the lexicon below a threshold value.

4. The method of claim 3 wherein the threshold value is 12.

5. The method of claim 3 wherein decomposing the non-common slices comprises:
- removing at least one phone from the non-common slice to generate a first new slice; and
- determining if the first new slice matches one of an existing phone or common multi-phone in the unit inventory.

6. The method of claim 5 wherein if the first new slice does not match with an existing phone or common multi-phone in the unit inventory further executing the steps of:
- decomposing the first new slice according to the predetermined set of rules to generate a second new slice;
- determining if the second new slice is the same as the first new slice;
- if the second new slice is the same as the first new slice, then adding the second new slice to the unit inventory;
- if the second new slice is not the same as the first new slice, then determining whether the second new slice matches one of the existing phones or common multi-phones in the lexicon; and
- if the second new slice does not match one of the existing phones or common multi-phones in the lexicon, then repeating the decomposing step.

7. The method of claim 5 further comprising:
- after removing the phone from the slice, adding the removed phone to a neighboring slice.

8. The method of claim 2 wherein decomposing the syllable into a plurality of slices comprises:
- breaking the syllable into three slices.

9. The method of claim 8 wherein the three slices represent an onset slice, a nucleus slice and a coda slice, and wherein at least one of the three slices is a multiphone slice that is sized between a phone and a syllable.

10. The method of claim 2 wherein the predetermined rules are based upon phonetic and phonological statistics for the target language.

11. The method of claim 2 wherein if the predetermined rules are unable to decompose the non-common slice, then:
- adding the slice to the unit inventory.
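
The decomposition loop recited in claims 5, 6, and 11 can be sketched as follows. The claims leave the "predetermined set of rules" unspecified, so the rule used here, `strip_first_phone`, is a hypothetical stand-in, as are the function name `decompose` and the toy inventories. The sketch only shows the control flow: remove a phone, check the remainder against the inventory, repeat, and add the slice to the inventory once the rule can no longer change it.

```python
def strip_first_phone(slice_):
    """Hypothetical rule: drop the leading phone; single phones are left unchanged.

    Returns (removed_phones, remaining_slice).
    """
    if len(slice_) > 1:
        return slice_[:1], slice_[1:]
    return (), slice_


def decompose(slice_, inventory, rule=strip_first_phone):
    """Shrink a non-common slice until it matches the inventory or the rule stalls.

    Returns (final_slice, removed_phones). If the rule cannot change the slice
    any further, the slice is added to `inventory` (a set of phone tuples).
    """
    removed = []
    current = tuple(slice_)
    while True:
        dropped, new = rule(current)
        if new == current:
            # The rule cannot decompose further: keep the slice as a unit.
            inventory.add(new)
            return new, tuple(removed)
        removed.extend(dropped)
        if new in inventory:
            # The remainder matches an existing phone or multiphone unit.
            return new, tuple(removed)
        current = new


# Case 1: the remainder matches an existing multiphone unit.
inventory = {("s",), ("t",), ("r",), ("t", "r")}
matched, removed_phones = decompose(("s", "t", "r"), inventory)

# Case 2: the rule stalls on a phone missing from the inventory, so it is added.
sparse_inventory = {("t",)}
stalled, stalled_removed = decompose(("k", "zh"), sparse_inventory)
```

The removed phones are returned rather than discarded so that a caller could attach them to a neighboring slice, as claim 7 describes; that step is not implemented here.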

12. An apparatus for generating speech from text, comprising:
- a unit inventory for storing a set of phoneme-based atom units for at least one target speaker;
- a text analyzer for obtaining a string of phonetic symbols representative of a text to be converted to speech; and
- a concatenation module for selecting stored phoneme-based atom units from the unit inventory based on the context of the phonetic symbols for the text; and
- synthesizing the selected phoneme-based atom units to generate speech corresponding to the text.

13. The apparatus of claim 12 wherein the set of phoneme-based atom units includes units sized greater than a phone but less than a syllable.

14. The apparatus of claim 13 wherein the set of phoneme-based atom units includes a complete set of monophones for the target language.

15. The apparatus of claim 13 wherein the set of phoneme-based atom units sized between a phone and a syllable are representative of common multiphone units in the target language.
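
To make the role of the unit inventory in claims 12 to 15 concrete, the sketch below maps a phone string onto inventory units. The claims say only that units are selected "based on the context of the phonetic symbols"; the greedy longest-match strategy used here is an assumption swapped in for illustration, not the selection method of the patent. The function name `select_units`, the `max_len` parameter, and the toy inventory are likewise hypothetical, and waveform synthesis is out of scope.

```python
def select_units(phones, inventory, max_len=3):
    """Greedy longest-match segmentation of a phone sequence into inventory units.

    `inventory` is a set of phone tuples. A complete monophone set, as in
    claim 14, guarantees that every phone can be covered by some unit.
    """
    units = []
    i = 0
    while i < len(phones):
        # Try the longest candidate first, then fall back to shorter ones.
        for n in range(min(max_len, len(phones) - i), 0, -1):
            candidate = tuple(phones[i:i + n])
            if candidate in inventory:
                units.append(candidate)
                i += n
                break
        else:
            raise KeyError(f"no unit in the inventory covers phone {phones[i]!r}")
    return units


# Toy inventory (assumed data): monophones plus one multiphone atom unit.
inventory = {("s",), ("t",), ("r",), ("iy",), ("s", "t", "r")}
units = select_units(["s", "t", "r", "iy", "t"], inventory)
```

Preferring the longest matching unit reduces the number of concatenation points in the output, which is the usual motivation for units larger than a phone.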

16. A unit inventory for use in text-to-speech generation, comprising:
- a set of monophone units for a target language; and
- a set of atom units for the target language.

17. The unit inventory of claim 16, wherein the set of atom units are sized between a phone and a syllable.

18. The unit inventory of claim 17, wherein the set of atom units comprises atom units that are determined to be common multiphonal units for the target language.

19. The unit inventory of claim 18, wherein the set of atom units includes atom units that are not common to the target language, but were unable to be decomposed according to a predetermined set of rules to match an entry already in the unit inventory.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Zhao, Yong; Chu, Min
Granted Patent: US 7,418,389 B2
US Class Current: 704/267
CPC Class Codes: G10L 13/08 Text analysis or generation...
