Post processing timing of rhythm in synthetic speech

US 6,029,131 A
Filed: 06/28/1996
Issued: 02/22/2000
Est. Priority Date: 06/28/1996
Status: Expired due to Term

Abstract

A method for generating synthetic speech uses detection of natural timing boundaries in words to be spoken by the synthetic speech system, to produce natural timing intervals. Phonemes are identified in the natural timing intervals. Time durations are assigned for each of the phonemes. A time duration of a selected phoneme is changed to achieve a desired time duration for a selected natural timing interval containing the phoneme. The natural timing interval may be selected to be a syllable. The natural timing interval may be selected to be the interval between two stressed phonemes. The natural timing intervals may be set to substantially the same duration between timing boundaries by changing the phoneme durations in accordance with rhythm of the language of synthesized speech. Durations of preselected phonemes, however, may remain unchanged.
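The equalization step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `equalize_intervals`, the `(symbol, duration_ms)` tuple layout, and the uniform scaling rule are all hypothetical. Each natural timing interval is stretched or compressed to one common duration, while a preselected set of phonemes keeps its assigned duration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: (phoneme symbol, assigned duration in ms).
Phoneme = Tuple[str, float]


def equalize_intervals(
    intervals: List[List[Phoneme]],
    target_ms: float,
    fixed: FrozenSet[str] = frozenset(),
) -> List[List[Phoneme]]:
    """Scale the non-fixed phonemes of each interval so it lasts target_ms."""
    result = []
    for interval in intervals:
        fixed_total = sum(d for s, d in interval if s in fixed)
        flexible_total = sum(d for s, d in interval if s not in fixed)
        if flexible_total <= 0:
            # Nothing adjustable in this interval: leave it unchanged.
            result.append(list(interval))
            continue
        # Flexible phonemes absorb the whole difference; never scale below zero.
        scale = max(target_ms - fixed_total, 0.0) / flexible_total
        result.append(
            [(s, d if s in fixed else d * scale) for s, d in interval]
        )
    return result


# Two syllables with different assigned lengths, both set to 200 ms.
syllables = [
    [("k", 60.0), ("ae", 120.0), ("t", 60.0)],
    [("s", 90.0), ("ih", 70.0)],
]
equalized = equalize_intervals(syllables, 200.0, fixed=frozenset({"k", "t"}))
```

In this sketch the stop consonants "k" and "t" are the preselected phonemes, so only the vowel of the first syllable shrinks.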

17 Claims

1. A synthetic speech system comprising:
- means for detecting natural timing boundaries in words to be spoken by said synthetic speech system, to produce natural timing intervals;
  
  means for identifying phonemes in said natural timing intervals;
  
  means for assigning first time durations for each of said phonemes;
  
  means for changing a selected first time duration of a selected phoneme to achieve a desired time duration for a selected natural timing interval containing said selected phoneme; and
  
  means for setting a plurality of said natural timing intervals to substantially the same second time duration, a particular phoneme having a computed time duration in response to number of phonemes within said selected natural timing interval and said second time duration;
  
  wherein at least said selected first time duration is based upon an elasticity parameter indicative of degree to which said selected first time duration may be adjusted without undesirably degrading speech produced by said system.
- 2. The system as in claim 1 wherein each natural timing interval is a respective syllable.
- 3. The system as in claim 1 wherein each natural timing interval is a respective interval between two respective stressed phonemes.
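One way to read the elasticity parameter of claim 1 is as a per-phoneme weight that decides how much of an interval's required change each phoneme absorbs. The sketch below assumes that reading. The 0.0 to 1.0 elasticity scale, the `Phoneme` class, and the duration-times-elasticity weighting are assumptions made for illustration; the claim does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    symbol: str
    duration_ms: float  # the "first time duration" assigned to the phoneme
    elasticity: float   # hypothetical scale: 0.0 = rigid, 1.0 = freely adjustable


def fit_interval(phonemes: List[Phoneme], target_ms: float) -> List[float]:
    """Return new durations that sum to target_ms, weighted by elasticity."""
    delta = target_ms - sum(p.duration_ms for p in phonemes)
    # Longer and more elastic phonemes take a larger share of the change.
    weights = [p.duration_ms * p.elasticity for p in phonemes]
    total_weight = sum(weights)
    if total_weight == 0:
        # Every phoneme is rigid: the interval cannot be adjusted.
        return [p.duration_ms for p in phonemes]
    # Note: this sketch applies no minimum-duration floor when shrinking.
    return [
        p.duration_ms + delta * w / total_weight
        for p, w in zip(phonemes, weights)
    ]


syllable = [
    Phoneme("s", 100.0, 0.2),
    Phoneme("iy", 150.0, 0.8),
    Phoneme("t", 50.0, 0.0),
]
fitted = fit_interval(syllable, 350.0)
```

Here the vowel "iy" takes most of the extra 50 ms, the fricative "s" takes a little, and the rigid "t" keeps its 50 ms.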

4. A method for generating synthetic speech, comprising:
- detecting natural timing boundaries in words to be spoken by a synthetic speech system, to produce natural timing intervals;
  
  identifying phonemes in said natural timing intervals;
  
  assigning first time durations for each of said phonemes;
  
  changing a selected first time duration of a selected phoneme to achieve a desired time duration for a selected natural timing interval containing said selected phoneme; and
  
  setting a plurality of said natural timing intervals to substantially the same second time duration, a particular phoneme having a computed time duration in response to number of phonemes within said selected natural timing interval and said second time durations;
  
  wherein at least said selected first time duration is based upon a predetermined parameter indicative of degree to which said selected first time duration may be adjusted without undesirably degrading speech produced by said system.
- 5. The method of claim 4 further comprising:
  - selecting each natural timing interval to be a respective syllable.
- 6. The method of claim 4 further comprising:
  - selecting each natural timing interval to be a respective interval between two respective stressed phonemes.
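The two interval choices in claims 5 and 6 can be illustrated with a small segmentation sketch. It assumes the phoneme stream already carries syllable-start and stress flags; the `Phone` record and `split_intervals` function are hypothetical names, and how the boundaries are actually detected is not shown here.

```python
from typing import List, NamedTuple


class Phone(NamedTuple):
    symbol: str
    syllable_start: bool  # assumed to be marked by an earlier analysis step
    stressed: bool


def split_intervals(phones: List[Phone], mode: str = "syllable") -> List[List[str]]:
    """Group phonemes into natural timing intervals.

    mode="syllable": a new interval begins at each syllable start.
    mode="stress":   a new interval begins at each stressed phoneme.
    """
    if mode not in ("syllable", "stress"):
        raise ValueError("mode must be 'syllable' or 'stress'")
    intervals: List[List[str]] = []
    for p in phones:
        boundary = p.syllable_start if mode == "syllable" else p.stressed
        # Material before the first boundary forms its own leading interval.
        if boundary or not intervals:
            intervals.append([])
        intervals[-1].append(p.symbol)
    return intervals


# "banana" as three syllables with stress on the second vowel.
word = [
    Phone("b", True, False), Phone("ah", False, False),
    Phone("n", True, False), Phone("ae", False, True),
    Phone("n", True, False), Phone("ah", False, False),
]
by_syllable = split_intervals(word, "syllable")
by_stress = split_intervals(word, "stress")
```

The same phoneme sequence yields three intervals in syllable mode and two in stress mode, which is what lets one timing routine serve both rhythm types.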

7. A synthetic speech system comprising:
- means for storing speech to be synthesized in a computer memory;
  
  means for a processor to read said speech from said computer memory and for said processor to detect natural timing boundaries in words to be spoken by said synthetic speech system, to produce natural timing intervals;
  
  means for identifying phonemes in said natural timing intervals;
  
  means for assigning first time durations for each of said phonemes;
  
  means for changing a selected first time duration of a selected phoneme to achieve a desired time duration for a selected natural timing interval containing said selected phoneme;
  
  means for setting a plurality of said natural timing intervals to substantially the same second time duration, a particular phoneme having a computed time duration in response to number of phonemes within said selected natural timing interval and said second time duration; and
  
  means for applying said synthesized speech to an electromechanical acoustic coupler to make audible speech;
  
  wherein respective time durations of at least certain respective phonemes are based upon respective selectable parameters indicative of respective degrees to which said respective time durations may be adjusted without undesirably degrading speech produced by said system.
- 8. The system as in claim 7 wherein each said natural timing interval is a respective syllable.
- 9. The system as in claim 7 wherein each said natural timing interval is a respective interval between two respective stressed phonemes.
- 10. The system as in claim 7 wherein said computer memory is a read only memory (ROM).
- 11. The system as in claim 7 wherein said computer memory is a computer disk.

12. A synthetic speech system comprising:
- a computer process for detecting natural timing boundaries in words to be spoken by said synthetic speech system, to produce natural timing intervals, said words stored in a computer memory;
  
  a computer process for identifying phonemes in said natural timing intervals;
  
  a computer process for assigning first time durations for each of said phonemes;
  
  a computer process for changing a selected first time duration of a selected phoneme to achieve a desired time duration for a selected natural timing interval containing said selected phoneme; and
  
  a computer process for setting a plurality of said natural timing intervals to substantially the same second time duration, a particular phoneme having a computed time duration in response to number of phonemes within said selected natural timing interval and said second time duration;
  
  wherein at least one respective time duration of at least one respective phoneme is based upon a selectable parameter indicative of degree to which the at least one respective time duration is adjustable without undesirably degrading speech produced by the system.
- 13. The system as in claim 12 further comprising:
  - a computer process for dividing said phonemes into at least two groups, a first group of extensible phonemes and a second group of fixed phonemes, and to adjust respective time durations of said extensible phonemes while not adjusting respective time durations of said fixed phonemes.
- 14. The system as in claim 12 further comprising:
  - a computer process for adjusting a speaking rate by adjusting the respective time durations of extensible phonemes within each natural timing interval.
- 15. The system as in claim 12 wherein said system is configured to generate audible speech in at least one of a syllable timed rhythm language and a stress timed rhythm language.

16. A synthetic speech system comprising:
- a computer process for detecting natural timing boundaries in words including syllables to be spoken by said synthetic speech system, to produce natural timing intervals, said words being stored in a computer memory and each of said natural timing intervals involving a respective syllable;
  
  a computer process for identifying phonemes in each syllable, said phonemes including flexible and inflexible phonemes;
  
  a computer process for assigning first time durations for each of said phonemes; and
  
  a computer process for achieving a selected percentage of syllable timed rhythm in synthesized speech by adjusting a respective inherent time interval for each respective flexible phoneme to adjust a respective syllable time duration to be within said selected percentage of a reference duration, said reference duration being computed from a desired speaking rate, and respective time durations of said inflexible phonemes not being adjusted based upon said selected percentage.
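A literal reading of claim 16 is that each syllable is pulled into a tolerance band around a reference duration derived from the speaking rate, with only the flexible phonemes changing. The sketch below follows that reading and should be treated as an assumption: the syllables-per-minute rate unit, the band interpretation of "selected percentage", and both function names are hypothetical.

```python
from typing import List


def reference_duration_ms(syllables_per_minute: float) -> float:
    """Assumed mapping from a speaking rate to a per-syllable reference."""
    return 60000.0 / syllables_per_minute


def nudge_syllable(
    durations: List[float],
    flexible: List[bool],
    reference_ms: float,
    tolerance: float,
) -> List[float]:
    """Bring a syllable within +/- tolerance (a fraction) of reference_ms.

    Only phonemes flagged flexible are rescaled; inflexible ones are kept.
    """
    total = sum(durations)
    low = reference_ms * (1.0 - tolerance)
    high = reference_ms * (1.0 + tolerance)
    target = min(max(total, low), high)  # nearest point inside the band
    flex_total = sum(d for d, f in zip(durations, flexible) if f)
    if target == total or flex_total == 0:
        return list(durations)
    # Flexible phonemes absorb the full difference; never scale below zero.
    scale = max((flex_total + target - total) / flex_total, 0.0)
    return [d * scale if f else d for d, f in zip(durations, flexible)]


ref = reference_duration_ms(300.0)  # 200 ms per syllable
long_syllable = nudge_syllable([60.0, 200.0, 40.0], [False, True, False], ref, 0.10)
ok_syllable = nudge_syllable([50.0, 100.0, 40.0], [False, True, False], ref, 0.10)
```

With a 10% tolerance, the 300 ms syllable is shortened to the upper edge of the band by compressing its vowel, while the 190 ms syllable is already inside the band and is returned unchanged. A smaller tolerance gives speech closer to strict syllable timing.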

17. A synthetic speech system comprising:
- a computer process for detecting natural timing boundaries in words including syllables to be spoken by said synthetic speech system, to produce natural timing intervals, said words being stored in a computer memory, and each natural timing interval involving a respective stressed time interval between respective stressed syllables;
  
  a computer process for identifying phonemes in said stressed time interval, said phonemes including fixed and flexible phonemes;
  
  a computer process for assigning first time durations for each of said phonemes; and
  
  a computer process for achieving a selected percentage of stress timed rhythm in synthesized speech by adjusting an inherent time interval for at least one flexible phoneme in a respective stressed time interval to adjust a respective time duration of said respective stressed time interval to be within said selected percentage of a reference duration, said reference duration being computed from a desired speaking rate, and respective time durations of said fixed phonemes not being adjusted based upon said selected percentage.

Current Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Original Assignee: Digital Equipment Corporation (HP Inc.)
Inventors: Bruckert, Edward A.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Storm, Donald L.
Application Number: US08/670,856
Time in Patent Office: 1,334 Days
Field of Search: 395/2.28, 395/2.63, 704/254, 704/2, 704/200, 704/258, 704/260, 704/266, 704/267, 704/268, 704/270, 704/277
US Class Current: 704/260
CPC Class Codes: G10L 13/08 Text analysis or generation...
