Detecting transients to emphasize formant peaks

US 5,953,696 A
Filed: 09/23/1997
Issued: 09/14/1999
Est. Priority Date: 03/10/1994
Status: Expired due to Term

Abstract

Nasalized sound effects during reproduction of low-pitch sounds are suppressed to produce playback sounds of high clarity. Amplitude data is first processed with high-range formant emphasis, which emphasizes the crests and valleys of the envelope of the frequency spectrum in the high frequency range, and with deepening of the valleys of the frequency spectrum over the entire frequency range, in particular over the low to mid frequency range. Next, the amplitude data is processed to emphasize the formant peak values of voiced frames in portions of the speech signal that are rising in magnitude, and to unconditionally emphasize the spectral envelope in the high frequency range. The voiced speech spectrum is generated by synthesizing cosine waves based on the emphasized amplitude data.

7 Claims

1. A speech signal processing method for decoding a speech signal encoded by a speech encoding method in which a speech signal is represented by parameters in at least a frequency domain, comprising the steps of:
- smoothing on the frequency axis a signal representing an intensity of the frequency spectrum;
- comparing a signal representing an intensity of the frequency spectrum with the smoothed version of the signal obtained in the smoothing step;
- taking the difference between the signal representing the intensity of the spectrum and the version of said signal obtained on smoothing on the frequency axis;
- performing a processing of deepening valley portions between formants of a transmitted frequency spectrum using the results of the comparing step;
- wherein said step of processing of deepening the valley portions between the formants of the frequency spectrum is performed using the result of the step of taking the difference between the signal representing the intensity of the spectrum and the version of said signal obtained on smoothing on the frequency axis.

2. The speech signal processing method as claimed in claim 1 wherein an amount of attenuation of deepening of said valley portions between the formants of the frequency spectrum is varied depending on the magnitude of said difference.

3. The speech signal processing method as claimed in claim 1 comprising the further steps of:
- discriminating whether the signal indicating the intensity of the transmitted frequency spectrum is of a voiced domain or an unvoiced domain; and
- performing said processing only when the signal is of the voiced domain.
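
The steps of claims 1 to 3 can be illustrated with a minimal sketch. It is not taken from the specification: the function name `deepen_valleys`, the use of decibel values, and the `slope` and `max_atten_db` parameters are all assumptions chosen for illustration. The smoothed spectrum is passed in as an input, so any smoothing method can be used to produce it.

```python
def deepen_valleys(spectrum_db, smoothed_db, voiced=True,
                   slope=0.5, max_atten_db=6.0):
    """Attenuate bins that lie below the smoothed spectrum.

    Names, units (dB) and default values are illustrative assumptions,
    not values stated in the patent.
    """
    if len(spectrum_db) != len(smoothed_db):
        raise ValueError("spectrum and smoothed spectrum must have equal length")
    if not voiced:
        # Claim 3: the processing is applied only in the voiced domain.
        return list(spectrum_db)

    out = []
    for value, smooth in zip(spectrum_db, smoothed_db):
        diff = value - smooth              # "taking the difference" step
        if diff < 0.0:                     # "comparing" step: bin is in a valley
            # Claim 2: the attenuation depends on the magnitude of the
            # difference. The cap is an added safeguard (assumption).
            atten = min(slope * -diff, max_atten_db)
            out.append(value - atten)
        else:
            out.append(value)              # bins at or above the smoothed curve are kept
    return out


spectrum = [60.0, 50.0, 40.0, 52.0, 58.0]
smoothed = [55.0, 50.0, 47.0, 50.0, 55.0]
deepened = deepen_valleys(spectrum, smoothed)
```

In this example only the third bin lies below the smoothed curve (by 7 dB), so it is the only one that is lowered; with `voiced=False` the spectrum is returned unchanged.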

4. A speech signal processing method employed in a speech synthesis system centered about processing in the frequency domain, comprising the steps of:
- dividing the speech signal into a plurality of frames;
- calculating an energy of the speech signal for each of the frames sequentially;
- comparing the calculated energy of the current frame with the calculated energy of the previous frame in order to detect a transient portion where speech energy rapidly increases in the time domain; and
- emphasizing formant peaks of the frequency spectrum in the detected transient portion by directly acting on frequency domain parameters when the transient portion is detected in the comparing step.

5. The speech signal processing method as claimed in claim 4, further comprising the steps of:
- discriminating whether the speech signal is of a voiced domain or an unvoiced domain; and
- carrying out said emphasizing of the formant peak only for a voiced domain.

6. The speech signal processing method as claimed in claim 4 wherein said emphasizing is carried out only on a low-range side of the frequency spectrum.
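
A rough sketch of how the steps of claims 4 to 6 might fit together is given below. Everything specific in it is an assumption made for illustration: the helper names, the energy-ratio threshold of 4.0, the gain of 1.25, and the use of local maxima of the amplitude data as a stand-in for formant peaks. The patent's own detection rule and emphasis amounts are not reproduced here.

```python
def frame_energies(signal, frame_len):
    """Divide a signal into non-overlapping frames and return each frame's energy."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def is_transient(prev_energy, cur_energy, ratio_threshold=4.0, floor=1e-9):
    """True when energy rises sharply from the previous frame to the current one.

    The ratio test and its threshold are assumptions, not the patent's rule.
    """
    return cur_energy > ratio_threshold * max(prev_energy, floor)


def emphasize_peaks(amplitudes, voiced, transient, low_band_bins, gain=1.25):
    """Boost local maxima among the first `low_band_bins` amplitude values.

    Local maxima are used here as a crude proxy for formant peaks (assumption).
    """
    out = list(amplitudes)
    if not (voiced and transient):        # claim 5: voiced domain only
        return out
    limit = min(low_band_bins, len(out))  # claim 6: low-range side only
    for k in range(1, limit):
        has_right = k + 1 < len(out)
        if (has_right and amplitudes[k] > amplitudes[k - 1]
                and amplitudes[k] >= amplitudes[k + 1]):
            out[k] = amplitudes[k] * gain  # act directly on the frequency-domain data
    return out


signal = [0.1] * 4 + [1.0] * 4
energies = frame_energies(signal, frame_len=4)
rising = is_transient(energies[0], energies[1])
boosted = emphasize_peaks([1.0, 3.0, 1.0, 2.0, 4.0, 2.0],
                          voiced=True, transient=rising, low_band_bins=3)
```

Here the second frame carries far more energy than the first, so it is flagged as a transient, and only the peak inside the assumed low band (index 1) is boosted; the larger peak at index 4 is left alone because it falls outside that band.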

7. A speech signal processing method for decoding a speech signal encoded by a speech encoding method in which a speech signal is represented by parameters in at least a frequency domain, comprising the steps of:
- smoothing on the frequency axis a signal representing an intensity of the frequency spectrum;
- comparing a signal representing an intensity of the frequency spectrum and the smoothed version of the signal obtained in the smoothing step; and
- performing a processing of deepening valley portions between formants of a transmitted frequency spectrum using the result of the comparing step,
- wherein said smoothing step is carried out by taking moving averages obtained by averaging spectrum intensity values in predetermined frequency windows successively defined in the frequency domain.
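
The moving-average smoothing named in claim 7 can be sketched as follows. The centered window, the `half_width` parameter, and the truncation of the window at the spectrum edges are assumptions; the claim only requires averaging intensity values in predetermined windows successively defined along the frequency axis.

```python
def moving_average(values, half_width=2):
    """Centered moving average along the frequency axis.

    The window covers up to 2 * half_width + 1 bins and is truncated at the
    edges of the spectrum (an illustrative choice, not taken from the patent).
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n = len(values)
    smoothed = []
    for k in range(n):
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


intensity = [0.0, 3.0, 6.0, 3.0, 0.0]
smooth = moving_average(intensity, half_width=1)
# Comparing step: bins lying below the smoothed curve are valley candidates.
below = [v < s for v, s in zip(intensity, smooth)]
```

A wider window gives a flatter reference curve, so more of the spectrum between formants falls below it; the window size is therefore the main tuning choice in this sketch.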

Current Assignee: Sony Corporation (Sony Group Corp.)
Original Assignee: Sony Corporation (Sony Group Corp.)
Inventors: Nishiguchi, Masayuki; Matsumoto, Jun
Primary Examiner(s): Knepper, David D.
Application Number: US08/935,695
Time in Patent Office: 721 Days
Field of Search: 704/207-209, 704/224-226
US Class Current: 704/209
CPC Class Codes:
- G10L 21/0264   characterised by the type o...
- G10L 21/0364   for improving intelligibility
- G10L 25/15   the extracted parameters be...
- H04R 2225/43   Signal processing in hearin...
