Speech synthesizing method achieved by the segmentation of the linear Formant transition region

US 5,649,058 A
Filed: 05/02/1994
Issued: 07/15/1997
Est. Priority Date: 03/31/1990
Status: Expired due to Fees

Abstract

Speech is synthesized by combining a speech coding mode with a Formant analysis mode: a Formant transition region is segmented into portions according to the linear characteristics of its frequency curve, and the Formant information of each portion is stored. Frequency information of a sound is obtained from the stored data. The Formant information data of a Formant contour used to produce speech is calculated by linear interpolation. The frequency and the bandwidth, which are the elements of the interpolated Formant contour, are sequentially filtered to produce a digital speech signal. The digital speech signal is converted to an analog signal, amplified, and output through an external speaker.
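
The abstract does not say how the transition region is divided "according to the linear characteristics of a frequency curve". As a rough illustration only, the sketch below uses a greedy piecewise-linear split: a segment is extended until some sample deviates from the straight line between its end points by more than a tolerance. The function names, the tolerance `tol`, and the sample track are all hypothetical and are not taken from the patent.

```python
def fits_line(track, start, end, tol):
    """True if every sample strictly between start and end lies within
    tol Hz of the straight line joining track[start] and track[end]."""
    span = end - start
    for k in range(start + 1, end):
        expected = track[start] + (track[end] - track[start]) * (k - start) / span
        if abs(track[k] - expected) > tol:
            return False
    return True


def segment_linear(track, tol=20.0):
    """Return the sample indices of the transition points of a formant track.

    track holds one formant frequency (Hz) per sample. The greedy rule used
    here is an assumption made for illustration, not the patented procedure.
    """
    if len(track) < 2:
        raise ValueError("need at least two samples")
    points = [0]
    start = 0
    for end in range(2, len(track)):
        if not fits_line(track, start, end, tol):
            # The straight line broke at `end`, so the previous sample
            # closes the current segment and opens the next one.
            points.append(end - 1)
            start = end - 1
    points.append(len(track) - 1)
    return points


def segment_lengths(points):
    """Length of each segment, in samples, between consecutive points."""
    return [b - a for a, b in zip(points, points[1:])]


# Hypothetical track: a linear rise followed by a steady state.
track = [500, 600, 700, 800, 800, 800, 800]
points = segment_linear(track)
print(points, segment_lengths(points))
```

Only the frequencies at the returned points, the matching bandwidths, and the segment lengths would then need to be stored, which is the data reduction the abstract describes.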

16 Claims

1. A method for synthesizing speech through a synthesizer system including a personal computer (PC), a PC interface, a speech synthesizer, a digital-to-analog (D/A) converter, a key-board, a memory, and a speaker, the method comprising the steps of:
- (a) segmenting linear Formant information, corresponding to phoneme information, into linear Formant transition region segments;
  
  (b) storing Formant frequency information and Formant bandwidth information for points of transition between consecutive ones of the linear Formant transition region segments of step (a), and lengths of the linear Formant transition region segments established by the segmenting in step (a), into a data base in a memory, for each phoneme information;
  
  (c) inputting information subsequent to the storing in step (b), the input information designating speech sound to be synthesized;
  
  (d) reading out stored Formant frequency information, Formant bandwidth information and length of the linear Formant transition region segments corresponding to the input information of step (c), from the data base stored in the memory;
  
  (e) calculating a digital Formant contour, by linearly interpolating between the read out Formant frequency information and Formant bandwidth information corresponding to first and second consecutive points of transition corresponding to one of the linear Formant transition region segments of step (d), the interpolating being calculated over the read out length of the first linear Formant transition region segment;
  
  (f) filtering the digital Formant contour, through a plurality of bandpass filters classified by a characteristic Formant, to produce a digital speech signal representative of a filtered glottal pulse; and
  
  (g) converting the digital speech signal representative of the filtered glottal pulse into an analog speech signal through the D/A converter and outputting the analog speech signal.
- Dependent claims (2, 3)
- - 2. The method of claim 1, wherein the calculation of step (e) includes the steps of:
    - (e) (00) determining a number of samples to be calculated between the read out Formant frequency information of the first and second linear Formant transition region segments, and between the read out Formant bandwidth information of the first and second linear Formant transition region segments;
      
      (e) (0) assigning a sample index value to designate a first one of the samples, and making a first linear interpolation calculation for the first sample;
      
      (e) (i) determining whether, for the sample index value, the linear interpolation calculations have been completed for all Formants included in the read out frequency information and bandwidth information; and
      
      (e) (ii) if it is determined, in step (e) (i) that the linear interpolation calculations have been completed, then proceeding to filter, in step (f), the Formant contour and determining whether the sample index value, when incremented, is greater than the stored length of segmentation for the segmented linear Formant transition region.
  - 3. The method of claim 2, wherein the calculation of step (e) further includes the steps of:
    - (e)(iii) determining whether or not the present linear Formant transition region segment is a last linear Formant transition region segment stored corresponding to the input information of step (c);
      
      (e)(iv) returning to step (e)(00) to calculate the digital speech signal between a subsequent pair of points of transition corresponding to the next stored linear Formant transition region segment when the present linear Formant transition region segment is determined not to be the last linear Formant transition region segment in step (e)(iii); and
      
      (e)(v) completing the calculation of the digital speech signal corresponding to the input information of step (c) when the linear Formant transition region segment is determined to be the last stored linear Formant transition region segment in step (e) (iv).
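
The control flow of claims 2 and 3 can be read as three nested loops: over stored segments, over sample indices within a segment, and over the Formants at each sample. The sketch below is one possible reading, not the patented implementation. The name `interpolate_contour`, the `emit` callback standing in for the filtering of step (f), and the data layout are assumptions. It also assumes that each interpolated value is the first point's value plus the linear increment.

```python
def interpolate_contour(points, lengths, emit):
    """Walk the stored segments and hand one interpolated frame per sample
    to `emit` (a stand-in for the filtering of step (f)).

    points[i][j] is a (frequency_hz, bandwidth_hz) pair for Formant j at
    transition point i; lengths[i] is the number of samples in segment i.
    """
    if len(points) != len(lengths) + 1:
        raise ValueError("need one more transition point than segments")
    # (e)(iii)-(v): repeat for each stored segment until the last one is done.
    for i, length in enumerate(lengths):
        # (e)(00), (e)(0), (e)(ii): sample index runs from 1 to the length.
        for n in range(1, length + 1):
            frame = []
            # (e)(i): interpolate every Formant before filtering the sample.
            for (f0, b0), (f1, b1) in zip(points[i], points[i + 1]):
                freq = f0 + (f1 - f0) * n / length
                bw = b0 + (b1 - b0) * n / length
                frame.append((freq, bw))
            emit(frame)


# Hypothetical data: one Formant, two segments (a rise, then a steady state).
frames = []
interpolate_contour(
    points=[[(500, 60)], [(700, 80)], [(700, 80)]],
    lengths=[2, 1],
    emit=frames.append,
)
print(frames)
```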

4. A method of processing speech, comprising the steps of:
- (a) segmenting a speech frequency signal at points of transition into a plurality of time segments, each segment having a time length and each point of transition including at least one Formant of the speech frequency signal;
  
  (b) storing, for each Formant at each point of transition, one Formant frequency information and one bandwidth information; and
  
  (c) storing, for each segment, time length information corresponding to the time length of the segment obtained in said step (a).
- Dependent claims (5, 6, 7, 8, 9)
- - 5. The method of claim 4, wherein said step (a) determines respective time lengths according to points of linear characteristic change of the Formant's frequency, the points of linear characteristic change corresponding to the points of transition.
  - 6. The method of claim 4, further comprising the steps of:
    - (d) reading, as first data, the stored Formant frequency information and the bandwidth information corresponding to a first point of transition;
      
      (e) reading, as second data, the stored Formant frequency information and the bandwidth information corresponding to a second point of transition; and
      
      (f) calculating a plurality of frequency and bandwidth values based upon the first and second data.
  - 7. The method of claim 6, wherein said step (f) includes the sub-steps of:
    - (f-1) determining a number of samples, n, to be calculated between the first and second data, the determination being based upon the stored time length information, Li, of a first time segment, i=1;
      (f-2) for at least the one Formant, j=1, calculating the number, n, of Formant frequency values, each Formant frequency value, F, being calculated according to;
      
      F = (F_{i+1,j} - F_{i,j}) n / L_i
      for n=1 to n, where F_{i+1,j} and F_{i,j} correspond, at i=1 and j=1, to the Formant frequency information read in said steps (d) and (e); and
      (f-3) for at least the one Formant, j=1, calculating the number, n, of bandwidth values, each bandwidth value, BW, being calculated according to;
      
      BW = (BW_{i+1,j} - BW_{i,j}) n / L_i
      for n=1 to n, where BW_{i+1,j} and BW_{i,j} correspond, at i=1 and j=1, to the bandwidth information read in said steps (d) and (e).
  - 8. The method of claim 7, wherein said sub-steps (f-1) to (f-3) are performed for each Formant stored at the first and second transition points.
  - 9. The method of claim 7, wherein additional time segments consecutively follow the first time segment, said method further comprising the step of:
    - (g) repeating said step (f) for subsequent pairs of points of transition corresponding to the additional time segments.
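
Claims 4 to 9 describe what is stored (one frequency and one bandwidth per Formant at each transition point, plus one time length per segment) and how values are computed from it. The sketch below shows one hypothetical storage layout and evaluates the claim 7 expression exactly as written. The class names, the phoneme key `"a"`, and all numeric values are invented for illustration. Note that the expression as claimed yields the increment relative to the first point; adding the first point's value to obtain an absolute frequency is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionPoint:
    freqs_hz: tuple  # one Formant frequency per Formant
    bws_hz: tuple    # one bandwidth per Formant


@dataclass(frozen=True)
class PhonemeEntry:
    points: tuple    # TransitionPoint objects, one more than segments
    lengths: tuple   # segment lengths L_i, in samples

    def __post_init__(self):
        if len(self.points) != len(self.lengths) + 1:
            raise ValueError("need one more transition point than segments")


def claim7_increments(first, second, length):
    """Evaluate (second - first) * n / length for n = 1..length."""
    return [(second - first) * n / length for n in range(1, length + 1)]


# Hypothetical database entry: two Formants, one segment of four samples.
DATABASE = {
    "a": PhonemeEntry(
        points=(
            TransitionPoint(freqs_hz=(500, 1500), bws_hz=(60, 90)),
            TransitionPoint(freqs_hz=(700, 1200), bws_hz=(80, 100)),
        ),
        lengths=(4,),
    )
}

entry = DATABASE["a"]
p0, p1 = entry.points[0], entry.points[1]
steps = claim7_increments(p0.freqs_hz[0], p1.freqs_hz[0], entry.lengths[0])
# Assumption: absolute values are the first point's value plus the increment.
first_formant = [p0.freqs_hz[0] + s for s in steps]
print(steps, first_formant)
```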

10. A method of synthesizing speech, comprising the steps of:
- (a) storing Formant information data for each of a plurality of Formants of a speech frequency signal, the Formant information data characterizing discrete points of transition between consecutive time segments of the speech frequency signal, the Formant information data including, for each point of transition, a single Formant frequency information and a single bandwidth information;
  
  (b) reading, for a first Formant, the stored Formant frequency information for a first point of transition and for a second point of transition; and
  
  (c) interpolating a plurality of frequency values between the read Formant frequency information of the first point of transition and the read Formant frequency information of the second point of transition.
- Dependent claims (11, 12, 13, 14, 15, 16)
- - 11. The method of claim 10, wherein said step (c) includes the sub-steps of:
    - (c-1) storing, for each time segment, a time length;
      
      (c-2) reading the stored time length, Li, corresponding to the first time segment, i=1;
      
      (c-3) determining, based upon the time length read in said step (c-2), a number of frequency values, n, to be interpolated;
      (c-4) interpolating, for the first Formant, the number, n, of frequency values, each frequency value, F, being determined according to;
      
      F = (F_{i+1} - F_i) n / L_i
      where n=1 to n for respective ones of the frequency values, and F_{i+1} and F_i correspond to the frequency information for the second and first points of transition, respectively, read in said step (b).
  - 12. The method of claim 10, wherein the plurality of frequency values obtained in said step (c) together form a first digital signal, said method further comprising the steps of:
    - (d) reading, for the first Formant, the stored bandwidth information for the first point of transition and for the second point of transition; and
      
      (e) interpolating a plurality of bandwidth values between the bandwidth information of the first and second points of transition read in said step (d), thereby forming a second digital signal.
  - 13. The method of claim 12, wherein each of the frequency values obtained from said step (c) corresponds to a respective one of the bandwidth values obtained from said step (e), said method further comprising the steps of:
    - (f) for each frequency value and corresponding bandwidth value, filtering the frequency value and bandwidth value to produce a digital speech signal;
      
      (g) converting the digital speech signal to an analog speech signal; and
      
      (h) outputting the analog speech signal.
  - 14. The method of claim 13, wherein said step (h) includes the sub-step of:
    - (h-1) driving a speaker according to the analog speech signal.
  - 15. The method of claim 14, wherein said step (c) includes the sub-steps of:
    - (c-1) storing, for each time segment, a time length;
      
      (c-2) reading the stored time length, Li, corresponding to the first time segment, i=1;
      
      (c-3) determining, based upon the time length read in said sub-step (c-2), a number of frequency values, n, to be interpolated;
      (c-4) interpolating, for the first Formant, the number, n, of frequency values, each frequency value, F, being determined according to;
      
      F = (F_{i+1} - F_i) n / L_i
      where n=1 to n for respective ones of the frequency values, and F_{i+1} and F_i correspond to the frequency information for the second and first points of transition, respectively, read in said step (b); and
      
      said step (e) includes the sub-step of;
      (e-1) interpolating, for the first Formant, the number, n, of bandwidth values, each bandwidth value, BW, being determined according to;
      
      BW = (BW_{i+1} - BW_i) n / L_i
      where n=1 to n for respective ones of the bandwidth values, and BW_{i+1} and BW_i correspond to the bandwidth information for the second and first points of transition, respectively, read in said step (d).
  - 16. The method of claim 10, wherein the discrete time segments of said step (a) are segmented according to points of linear characteristic change of the Formants' frequencies.
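
Claim 13 filters each frequency value together with its bandwidth value but does not state the filter structure. One common choice in formant synthesizers is a second-order digital resonator whose coefficients are recomputed for every sample from the current frequency and bandwidth. The sketch below uses that well-known resonator form as a swapped-in technique; it is not confirmed by the claims. The 10 kHz sample rate, the function names, and the impulse input standing in for a glottal pulse are assumptions.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, sample_rate):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2] for a resonance
    centred at freq_hz with bandwidth bw_hz (unity gain at 0 Hz)."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def filter_formant(source, freqs_hz, bws_hz, sample_rate=10000):
    """Filter a source signal through one time-varying resonator.

    freqs_hz and bws_hz are the interpolated contours, one value per sample.
    """
    y1 = y2 = 0.0  # the two previous output samples
    out = []
    for x, f, bw in zip(source, freqs_hz, bws_hz):
        a, b, c = resonator_coeffs(f, bw, sample_rate)
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Hypothetical input: a single impulse, with a short rising contour.
source = [1.0, 0.0, 0.0, 0.0]
freqs = [550.0, 600.0, 650.0, 700.0]
bws = [65.0, 70.0, 75.0, 80.0]
print(filter_formant(source, freqs, bws))
```

With several Formants, one such resonator per Formant could be chained so that the output of one feeds the next; whether the patent's "plurality of bandpass filters" is arranged in cascade or in parallel is not stated in the claims.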

Current Assignee: Gold Star Co., Ltd.
Original Assignee: Gold Star Co., Ltd.
Inventors: Lee, Yoon-Keun
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): SMITS, TALIVALDIS IVARS
Application Number: US08/236,150
Time in Patent Office: 1,170 Days
Field of Search: 395/2, 395/2.67, 395/2.76, 395/2.77, 395/2.18, 395/2.74, 381/50-53
US Class Current: 704/268
CPC Class Codes:
G10L 13/02   Methods for producing synth...
G10L 21/0364   for improving intelligibility
G10L 25/15   the extracted parameters be...
