Method and apparatus for speech generation from phonetic codes

US 5,463,715 A
Filed: 12/30/1992
Issued: 10/31/1995
Est. Priority Date: 12/30/1992
Status: Expired due to Fees

Abstract

Speech generation from phonetic code is carried out by a microcomputer based system which stores digitized waveform segments and appropriately joins the segments and outputs them to a digital to analog converter and then to a speaker. An allophone is generated for each phoneme designated by the phonetic codes according to the articulation type of each adjacent phoneme. Each phoneme is classified as neutral, labial, glottal, or medial according to its effect on the articulation of adjacent phonemes. Each phoneme is characterized by at least one center waveform dependent on the phonetic code, and an initial waveform and a final waveform, each of which depend on the phonetic code and the articulation type of the neighboring phoneme. Tables of waveform pointers are accessed according to phonetic code and articulation type, and other tables provide articulation types, times of each waveform portion, transition rate, fricative state, and pitch for each phonetic code. Adjacent waveforms are gradually blended together. Continuously varying center waveforms are afforded by indexing through successive waveform pointers at a given rate during the center phoneme period, the rate and the period being retrieved from the tables.
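
The table-driven selection described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the phonetic codes, the articulation type assigned to each code, the waveform names, and the use of a neutral silence code `_` at the utterance boundaries are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Articulation types named in the abstract.
NEUTRAL, LABIAL, GLOTTAL, MEDIAL = "neutral", "labial", "glottal", "medial"

# Hypothetical tables: codes, type assignments, and waveform names are invented.
ARTICULATION: Dict[str, str] = {"_": NEUTRAL, "M": LABIAL, "AE": NEUTRAL, "T": MEDIAL}
CENTER: Dict[str, List[str]] = {"M": ["m_c"], "AE": ["ae_c1", "ae_c2"], "T": ["t_c"]}
# Initial waveforms are keyed by (code, articulation type of the preceding phoneme).
INITIAL: Dict[Tuple[str, str], str] = {
    ("M", NEUTRAL): "m_i_neutral",
    ("AE", LABIAL): "ae_i_labial",
    ("T", NEUTRAL): "t_i_neutral",
}
# Final waveforms are keyed by (code, articulation type of the following phoneme).
FINAL: Dict[Tuple[str, str], str] = {
    ("M", NEUTRAL): "m_f_neutral",
    ("AE", MEDIAL): "ae_f_medial",
    ("T", NEUTRAL): "t_f_neutral",
}


def build_allophone(prev_code: str, code: str, next_code: str) -> List[str]:
    """Serially combine initial, center, and final waveform names for one code."""
    initial = INITIAL[(code, ARTICULATION[prev_code])]
    final = FINAL[(code, ARTICULATION[next_code])]
    return [initial, *CENTER[code], final]


def build_utterance(codes: List[str]) -> List[str]:
    """Concatenate the allophones for a series of phonetic codes."""
    padded = ["_", *codes, "_"]  # assumed neutral silence at both ends
    sequence: List[str] = []
    for i in range(1, len(padded) - 1):
        sequence.extend(build_allophone(padded[i - 1], padded[i], padded[i + 1]))
    return sequence


print(build_utterance(["M", "AE", "T"]))
```

In this sketch the vowel `AE` receives a labial-flavoured initial waveform because `M` precedes it, and a medial-flavoured final waveform because `T` follows it, which is the context sensitivity the abstract describes.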

29 Claims

1. The method of generating speech comprising the steps of:
- storing a plurality of digitized waveforms representing phonemes of voiced and fricative types;
  
  assigning an articulation type to each phoneme;
  
  inputting a series of phonetic codes representing speech wherein the series of phonetic codes identify a succession of phonemes;
  
  for each phonetic code, generating an allophone by selecting at least one stored digitized center waveform corresponding to such phonetic code, selecting a stored digitized initial waveform according to the articulation type of the preceding phoneme in the succession, selecting a stored digitized final waveform according to the articulation type of the following phoneme in the succession, and serially combining the selected initial, center, and final waveforms; and
  
  concatenating a series of allophones corresponding to the series of phonetic codes for producing a digital representation of the speech; and
  
  producing audible speech from the digital representation of speech.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
- - 2. The method as defined in claim 1 wherein the step of assigning an articulation type to each phoneme comprises identifying each phoneme as labial, glottal, medial or neutral in accordance with each phoneme's effect on the articulation of adjacent phonemes.
  - 3. The method as defined in claim 2 wherein the step of assigning an articulation type includes assigning an initial articulation parameter and a final articulation parameter in accordance with each phoneme's effect on the articulation of adjacent preceding and following phonemes respectively.
  - 4. The method as defined in claim 1 including the steps of:
    - storing tables of waveform pointers for each phoneme according to articulation type and for final and initial waveforms; and
      
      the steps of selecting each final waveform and each initial waveform comprise looking up the corresponding waveform pointers in accordance with the articulation type and the phonetic code.
  - 5. The method as defined in claim 1 including the steps of:
    - storing tables of waveform pointers for each phoneme according to articulation type and for final and initial waveforms, and tables of waveform duration parameters for each phoneme; and
      
      wherein the step of producing a digital representation of the speech comprises incorporating each selected waveform for a period indicated by the tables of waveform duration parameters.
  - 6. The method as defined in claim 1 wherein given phonemes have a plurality of center waveforms for sequential use and other phonemes have only a single center waveform, including the steps of:
    - storing tables of waveform pointers for center waveforms including for the given phonemes a plurality of pointers in sequential memory locations; and
      
      retrieving a sequence of center waveforms for such given phonemes by indexing through the sequential memory locations to recall the plurality of pointers.
  - 7. The method as defined in claim 6 including the steps of:
    - storing tables of waveform intervals for center waveforms and of center expiration times;
      
      retrieving center waveforms by selecting a corresponding stored waveform;
      
      comparing the waveform interval and the center expiration time; and
      
      indexing to another waveform pointer when the waveform interval elapses prior to the center expiration time.
  - 8. The method as defined in claim 1 including the steps of:
    - storing a transition rate parameter for initial, center and final waveforms for each phonetic code;
      
      determining a filter rate from the transition rate for the waveform being processed;
      
      and low pass byte filtering the waveforms to smoothly blend successive waveforms.
  - 9. The method as defined in claim 1 including the steps of:
    - storing pitch data for each phonetic code;
      
      further including pitch data as a part of each phonetic code;
      
      deriving a pitch parameter from the stored and the included pitch data; and
      
      repeating each selected waveform at a rate in accordance with the pitch parameter to thereby affect the pitch of the speech.
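
The center-waveform indexing of claims 6 and 7 can be sketched as follows. This is an illustrative reading only: the tick-based timing, the function name `center_schedule`, and the choice to hold the last pointer once the sequence is exhausted are assumptions, not details taken from the claims.

```python
from typing import List


def center_schedule(pointers: List[int], interval: int, center_expiration: int) -> List[int]:
    """Return the waveform pointer in use at each tick of the center period.

    pointers          -- pointers held in sequential table locations (claim 6)
    interval          -- ticks each center waveform is used before indexing (claim 7)
    center_expiration -- tick count at which the center period ends (claim 7)
    """
    schedule: List[int] = []
    index = 0    # position in the sequential pointer table
    elapsed = 0  # ticks since the current pointer was selected
    for _ in range(center_expiration):
        # Index to the next pointer when the interval elapses before expiration.
        if elapsed == interval and index + 1 < len(pointers):
            index += 1
            elapsed = 0
        schedule.append(pointers[index])
        elapsed += 1
    return schedule


# A phoneme with three center waveforms and one with a single center waveform.
print(center_schedule([10, 11, 12], interval=2, center_expiration=7))
print(center_schedule([5], interval=2, center_expiration=3))
```

A phoneme with a single pointer simply repeats it for the whole center period, while a phoneme with several pointers steps through them at the tabulated interval.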

10. In a speech generation apparatus having a memory containing tables of waveform pointers and tables of phoneme parameters, the method of generating speech comprising the steps of:
- storing in the memory digitized waveforms for use as initial, center and final waveforms;
  
  inputting a phonetic code sequence representing speech;
  
  repetitively and sequentially executing an input routine and an output routine for progressively processing phonetic codes;
  
  on successive passes through the input routine entering branches for selecting from the tables waveform pointers, the waveform pointers identifying initial, center and final waveforms for a current phoneme; and
  
  on successive passes through the output routine, retrieving stored digitized waveforms in accordance with the selected waveform pointers, and generating a digital representation of speech corresponding to the phonetic code sequence by combining the retrieved waveforms.
- Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
- - 11. The method as defined in claim 10 wherein the tables of phoneme parameters include the articulation type of each phoneme, the tables of waveform pointers include pointer addresses for different articulation types, and an input routine includes the steps of:
    - determining from the tables the articulation types of the phonemes preceding and following the current phoneme;
      
      determining from the tables the initial waveform pointer according to the articulation type of the preceding phoneme and the final waveform pointer according to the articulation type of the following phoneme.
  - 12. The method as defined in claim 10 wherein every pass through an input and a subsequent output routine requires the same total execution time, and wherein the output routine includes an adjustable time delay, the method including the step of:
    - adjusting the execution time by selecting a time delay in the output routine, whereby data sampling rate is adjusted.
  - 13. The method as defined in claim 10 comprising the steps of:
    - counting the passes through both routines and indexing a speed counter for each counted pass;
      
      setting a speed counter value and resetting the speed counter each time the number of counted passes equals the set speed counter value;
      
      indexing a phoneme counter each time the speed counter resets;
      
      entering each input routine's branch only once for each count of the phoneme counter;
      
      entering a time waste path for each pass in which an input routine is not entered, whereby every pass through the input routine requires the same time.
  - 14. The method as defined in claim 10 wherein one of the routines includes a speed counter indexed at each pass and being reset when a first count value is reached, and a phoneme counter indexed each time the speed counter is reset and being reset when a second count is reached, including the step of adjusting speech rate by selecting the first count value.
  - 15. The method as defined in claim 14 including tables based on the phoneme counter of final waveform initiation times and final expiration times for each phonetic code, and including the steps of:
    - during the time of the center waveform, selecting a waveform pointer corresponding to the current phoneme for retrieving final waveform initiation times and final expiration times;
      
      when the phoneme counter reaches the retrieved final waveform initiation time, selecting a final waveform pointer, and subsequently terminating the final waveform at the final expiration time.
  - 16. The method as defined in claim 14 including tables of times based on the phoneme counter for initiating the final waveform, for terminating the phoneme for each phonetic code, and of center waveform indexing intervals, and wherein the waveform pointer tables include for certain phonemes a succession of waveforms, including the steps of:
    - during the time of the center waveform, selecting a waveform pointer corresponding to the current phoneme and whenever the indexing interval occurs before the final waveform initiation time, indexing the waveform pointer to a successive pointer; and
      
      when the phoneme counter reaches the final waveform initiation time, selecting a final waveform pointer.
  - 17. The method as defined in claim 10 including the steps of:
    - for each waveform assigning a flag indicating whether the waveform is fricative or voiced;
      
      on successive passes through the output routine, entering either a fricative routine or a voiced routine according to the flag.
  - 18. The method as defined in claim 10 wherein the tables of phoneme parameters include transition rate parameters, including the steps of:
    - entering an input routine for determining a transition rate for each waveform from the tables of phoneme parameters; and
      
      in the output routine, determining filter characteristics from the transition rate for the new waveform and byte filtering adjacent waveforms using the filter characteristics.
  - 19. The method as defined in claim 10 wherein the tables of phoneme parameters include pitch parameters and the input phonetic codes incorporate input pitch parameters including the steps of:
    - entering an input routine for determining an output pitch parameter from the tables of phoneme parameters and the input pitch parameters; and
      
      in the output routine, repetitively retrieving waveforms at a rate determined by the output pitch parameter.
  - 20. The method as defined in claim 10 including the steps of:
    - storing a global pitch parameter; and
      
      in the output routine, repetitively retrieving waveforms at a rate determined by a function of the global pitch parameter.
  - 21. The method as defined in claim 10 wherein every pass through an input and a subsequent output routine is completed in a selected execution time, the method including the step of:
    - adjusting the selected execution time by selecting a software time delay, whereby pass repetition rate is selected to adjust the data sampling rate.
  - 22. The method as defined in claim 10 wherein one of the routines includes a speed counter indexed at each pass and being reset when accumulated counts equal a global speed parameter, including the steps of:
    - storing a global speed parameter; and
      
      adjusting speech rate by selecting the global speed parameter.
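
The two-level counting scheme of claims 13, 14 and 22 can be sketched as below. It is a simplified model under stated assumptions: the name `run_passes` is hypothetical, every phoneme is given the same length in phoneme-counter ticks, and the input and output routines themselves are omitted.

```python
from typing import List, Tuple


def run_passes(total_passes: int, speed_value: int, phoneme_length: int) -> Tuple[int, List[int]]:
    """Simulate the speed counter and phoneme counter over a number of passes.

    speed_value    -- passes per phoneme-counter tick (the "first count value")
    phoneme_length -- phoneme-counter ticks per phoneme (the "second count")
    Returns the number of phonemes completed and the phoneme counter after each pass.
    """
    speed_counter = 0
    phoneme_counter = 0
    phonemes_done = 0
    trace: List[int] = []
    for _ in range(total_passes):
        speed_counter += 1                    # indexed at each pass
        if speed_counter == speed_value:      # reset when the first count is reached
            speed_counter = 0
            phoneme_counter += 1              # indexed each time the speed counter resets
            if phoneme_counter == phoneme_length:
                phoneme_counter = 0           # reset when the second count is reached
                phonemes_done += 1
        trace.append(phoneme_counter)
    return phonemes_done, trace


fast = run_passes(12, speed_value=2, phoneme_length=3)
slow = run_passes(12, speed_value=4, phoneme_length=3)
print(fast[0], slow[0])
```

Because the pass rate is fixed, doubling the speed counter value halves the number of phonemes completed in the same number of passes, which is how selecting that value adjusts speech rate without changing the sampling rate.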

23. Apparatus for generating speech in response to input codes including:
- a memory for storing phoneme waveforms and phoneme articulation types;
  
  input means for receiving a string of phonetic codes representing speech;
  
  context sensitive means for generating allophones for respective received phonetic codes including means for selecting center waveforms as dictated by corresponding phonetic codes and for selecting initial and final waveforms for each allophone according to the respective articulation types of preceding and subsequent phonemes;
  
  waveform transition means for blending selected adjacent waveforms of each allophone and consecutive allophones; and
  
  output means responsive to the blended waveforms for producing audible speech corresponding to the input string.
- Dependent Claims (24, 25, 26)
- - 24. The apparatus as defined in claim 23, wherein the memory stores the waveforms as waveform segments and the memory further stores pitch data for each phonetic code and the phonetic codes carry further pitch data, including:
    - means responsive to the stored pitch data and the further pitch data carried by the code for calculating a pitch parameter; and
      
      the output means includes means responsive to the pitch parameter for governing the repetition rate of waveform segments thereby affecting the pitch.
  - 25. The apparatus as defined in claim 24 wherein the means for calculating a pitch parameter includes low pass filter means operating on the pitch parameter to smooth the transition of pitch parameter values between successive waveforms.
  - 26. The apparatus as defined in claim 23 wherein the memory stores a transition rate for each phonetic code, and wherein:
    - the waveform transition means includes a filter for gradually blending adjacent waveforms; and
      
      means responsive to the stored transition rates for determining filter characteristics for each waveform.
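
The gradual blending of claim 26 (and the low pass filtering of claim 8) can be illustrated with a first-order low-pass filter applied to each sample position. The claims do not give the filter's form, so this is an assumption: the power-of-two mapping in `filter_coefficient`, the floating-point samples, and both function names are hypothetical.

```python
from typing import List


def filter_coefficient(transition_rate: int) -> float:
    """Hypothetical mapping from a stored transition rate to a filter coefficient.

    A rate of 0 switches waveforms instantly; larger values blend more slowly.
    """
    return 1.0 / (1 << transition_rate)


def blend(current: List[float], target: List[float],
          transition_rate: int, passes: int) -> List[float]:
    """Move each output sample toward the new waveform over several passes."""
    k = filter_coefficient(transition_rate)
    out = list(current)
    for _ in range(passes):
        # First-order low pass: step a fraction k of the remaining distance.
        out = [c + (t - c) * k for c, t in zip(out, target)]
    return out


old_waveform = [0.0, 0.0]
new_waveform = [8.0, -4.0]
print(blend(old_waveform, new_waveform, transition_rate=1, passes=2))
```

Each pass closes a fixed fraction of the gap between the output and the incoming waveform, so the transition has no abrupt step, and a per-phoneme transition rate controls how quickly the new waveform takes over.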

27. Apparatus for generating speech in response to input codes comprising a microcomputer based apparatus including:
- a buffer for holding a string of phonetic codes representing a succession of phonemes for at least a portion of desired speech;
  
  a read only memory (ROM) containing operating code, a plurality of digitized waveforms, a table of articulation types for each phonetic code and addressable by the phonetic code, and tables of waveform pointers addressed by the phonetic codes and articulation types;
  
  pointer means for successively designating each phonetic code, in turn, as a current phonetic code and for each current phonetic code designating a center phoneme, a preceding phoneme and a following phoneme;
  
  means for looking up in the table of articulation types the types of the preceding and the following phonemes;
  
  means for looking up the waveform pointers for the initial, center, and final waveforms for each current phonetic code, in turn using the articulation types and the phonetic code to define a succession of waveforms; and
  
  means for retrieving waveforms identified by the waveform pointers and joining the retrieved waveforms for each phonetic code to generate a string of context sensitive allophones representing the desired speech.
- Dependent Claims (28, 29)
- - 28. The apparatus as defined in claim 27 wherein the ROM contains a table of fricative states addressed by the phonetic codes;
    - means for generating a fricative phoneme from waveforms associated with a fricative state; and
      
      means for generating a voiced phoneme from waveforms not associated with a fricative state.
  - 29. The apparatus as defined in claim 27 wherein the ROM contains a table of pitch modifiers addressed by the phonetic codes;
    - means for generating a pitch parameter for each phoneme;
      
      low pass filter means for filtering the pitch parameters to prevent sudden changes of parameter value; and
      
      means for controlling waveform repetition rate dependent on the filtered pitch parameters.
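
The pitch handling of claim 29 (see also claims 24 and 25) can be sketched in two steps: derive a pitch parameter per phoneme, then low-pass filter the sequence so the value never jumps. The additive combination of input pitch and table modifier, the table contents, the smoothing factor, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical table of pitch modifiers addressed by phonetic code.
PITCH_MODIFIER: Dict[str, int] = {"M": -1, "AE": 2}


def pitch_targets(codes_with_pitch: List[Tuple[str, int]]) -> List[int]:
    """Combine the pitch carried by each code with its stored modifier (assumed additive)."""
    return [pitch + PITCH_MODIFIER.get(code, 0) for code, pitch in codes_with_pitch]


def smooth(values: List[int], alpha: float = 0.5) -> List[float]:
    """First-order low-pass filter that prevents sudden changes in the pitch parameter."""
    smoothed: List[float] = []
    state = float(values[0])  # start at the first target so there is no initial jump
    for value in values:
        state += (value - state) * alpha
        smoothed.append(state)
    return smoothed


targets = pitch_targets([("M", 9), ("AE", 14), ("AE", 14)])
print(targets)
print(smooth(targets))
```

The smoothed value, rather than the raw target, would then govern how often each waveform segment is repeated, so pitch glides between phonemes instead of stepping.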

Current Assignee: Innovation Technologies Incorporated (Innovation Technology Group)
Original Assignee: Innovation Technologies (Amundi Asset Management SA (Investment Management))
Inventors: Gagnon, Richard T.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Doerrler, Michelle
Application Number: US07/998,459
Time in Patent Office: 1,035 Days
Field of Search: 381/51-53, 381/35, 381/43, 395/2.67, 395/2.7-2.78
US Class Current: 704/267
CPC Class Codes: G10L 13/07 Concatenation rules