Frequency domain postfiltering for quality enhancement of coded speech

US 6,941,263 B2
Filed: 06/29/2001
Issued: 09/06/2005
Est. Priority Date: 06/29/2001
Status: Expired due to Fees

Abstract

A method and system for postfiltering in the frequency domain to improve the quality of a speech signal, especially synthesized speech produced by low bit-rate codecs, are provided. The method comprises linear predictive coefficients (LPC) tilt computation and compensation, formant filter gain computation, and anti-aliasing, each with a corresponding module. The formant filter gain computation employs an LPC representation, all-pole modeling, a non-linear transformation, and a phase computation. The LPC used to derive the postfilter may be transmitted from an encoder or estimated from a synthesized or other speech signal in a decoder or receiver. The invention may be implemented in a linked decoder and encoder. A separate LPC evaluation unit responsible for processing and/or deriving the LPC may be implemented within the invention.

18 Claims

1. A method of postfiltering a speech signal using linear predictive coefficients of the speech signal for enhancing human perceptual quality of the speech signal, the method comprising the steps of:
- generating a postfilter by performing a non-linear transformation of the linear predictive coefficients spectrum in the frequency domain;
  
  applying the generated postfilter to the synthesized speech signal in the frequency domain; and
  
  transforming the filtered frequency domain synthesized speech signal into a speech signal in the time domain;
  
  wherein the step of generating a postfilter further comprises the steps of:
  
  representing the linear predictive coefficients spectrum by a time domain vector;
  
  transforming the time domain vector into a frequency domain vector by a Fourier transformation;
  
  inverting the frequency domain vector; and
  
  calculating gains according to the magnitude of the all-pole model vector, wherein the gains include a magnitude and a phase response.
- - 2. The method of claim 1, wherein the step of calculating the gains further comprises the steps of:
    - normalizing the magnitude of the all-pole model vector;
      
      conducting a non-linear transformation for the normalized magnitude of the all-pole model vector to obtain the magnitude of the gains;
      
      estimating the phase response of the gains; and
      
      forming the gains by combining the magnitude and the estimated phase response of the gains.
  - 3. The method of claim 2, wherein the step of estimating the phase response further comprises executing a fast Fourier transformation based phase shifter on the gains.
  - 4. The method of claim 2, wherein the non-linear transformation function comprises a scaling function with a scaling factor between 0 and 1.
  - 5. The method of claim 1, wherein the step of generating a postfilter further comprises executing an anti-aliasing procedure in the time domain after the step of calculating the gains.
  - 6. The method of claim 1, wherein the all-pole model is represented by a logarithm of the inverse magnitude of the frequency domain linear predictive coefficients vector.
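
The gain-magnitude steps recited in claims 1, 2, 4 and 6 can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. The claims do not fix the details, so the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, the normalization (shifting the log magnitude so its peak is 0), the particular scaling function (multiplying the log magnitude by `gamma`), and the names `dft`, `formant_gain_magnitude`, `n_fft` and `gamma` are all assumptions. A naive DFT stands in for an FFT so that the block needs only the standard library.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def formant_gain_magnitude(lpc, n_fft=64, gamma=0.5):
    """Return one gain magnitude per DFT bin (hypothetical helper)."""
    # Represent the LPC polynomial as a zero-padded time domain vector.
    # Assumed convention: A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    a = [1.0] + list(lpc) + [0.0] * (n_fft - len(lpc) - 1)
    # Transform the time domain vector into a frequency domain vector.
    spectrum = dft(a)
    # All-pole model as the logarithm of the inverse magnitude (claim 6).
    # This would fail if A(z) had a zero exactly on a DFT bin.
    log_h = [-math.log(abs(v)) for v in spectrum]
    # Normalization (assumed form): shift so the largest value is 0.
    peak = max(log_h)
    normalized = [v - peak for v in log_h]
    # Non-linear transformation (assumed form): scale the log magnitude
    # by a factor between 0 and 1, i.e. a power law on the magnitude.
    return [math.exp(gamma * v) for v in normalized]


gains = formant_gain_magnitude([-0.9], n_fft=64, gamma=0.5)
print(gains[0], gains[32])
```

With the single coefficient `-0.9` the spectral peak sits at bin 0, so that bin keeps a gain of 1 while the bins far from the peak are attenuated. A smaller `gamma` flattens the gain curve and gives milder postfiltering.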

7. A computer-readable medium having computer-readable instructions for performing steps to postfilter a synthesized speech signal using the linear predictive coefficients spectrum of the speech signal comprising the steps of:
- computing the tilt of the linear predictive coefficients spectrum;
  
  compensating the linear predictive coefficients spectrum using the computed tilt;
  
  generating a postfilter by executing a non-linear transformation of the compensated linear predictive coefficients spectrum in the frequency domain; and
  
  applying the generated postfilter to the synthesized speech signal in the frequency domain;
  
  wherein the step of generating a postfilter further comprises the steps of:
  
  representing the linear predictive coefficients by a time domain vector;
  
  transforming the time domain vector into a frequency domain vector by a Fourier transformation;
  
  transferring the frequency domain vector into an all-pole model vector; and
  
  calculating gains according to the magnitude of the all-pole model vector, wherein the gains include a magnitude and phase response.
- - 8. The computer-readable medium of claim 7, wherein the step of calculating the gains further comprises the steps of:
    - normalizing the magnitude of the all-pole model vector;
      
      conducting a non-linear transformation for the normalized magnitude of the all-pole model vector to obtain the magnitude of the gains;
      
      estimating the phase response of the gains; and
      
      forming the gains by combining the magnitude and the estimated phase response of the gains.
  - 9. The computer-readable medium of claim 8, wherein the step of estimating the phase response further comprises executing a fast Fourier transformation based phase shifter.
  - 10. The computer-readable medium of claim 8, wherein the non-linear transformation function comprises a scaling function with a scaling factor between 0 and 1.
  - 11. The computer-readable medium of claim 7, wherein the all-pole model is represented by a logarithm of the inverse magnitude of the frequency domain vector.
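
The phase estimation and combination steps of claims 8 and 9 might look something like the sketch below. The claims name a fast Fourier transformation based phase shifter but do not spell out the computation, so this sketch assumes a minimum-phase relationship (the phase is derived from the log magnitude by folding the real cepstrum, a standard DFT-based Hilbert-transform technique). The function names `dft`, `idft` and `minimum_phase_gains` are hypothetical, and a naive DFT again stands in for an FFT.

```python
import cmath
import math


def dft(x, sign=-1):
    """Naive DFT; sign=+1 gives the unnormalized inverse transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(x):
    """Inverse DFT with 1/N normalization."""
    n = len(x)
    return [v / n for v in dft(x, sign=+1)]


def minimum_phase_gains(magnitude):
    """Attach an estimated phase to a gain magnitude (even length assumed)."""
    n = len(magnitude)
    log_mag = [math.log(m) for m in magnitude]
    # Real cepstrum of the (symmetric) log magnitude.
    cepstrum = [c.real for c in idft(log_mag)]
    # Fold the cepstrum onto its causal half; this is the phase shifter.
    folded = [0.0] * n
    folded[0] = cepstrum[0]
    for t in range(1, n // 2):
        folded[t] = 2.0 * cepstrum[t]
    folded[n // 2] = cepstrum[n // 2]
    # The imaginary part of the transformed folded cepstrum is the phase.
    phase = [v.imag for v in dft(folded)]
    # Form the gains by combining magnitude and estimated phase.
    return [m * cmath.exp(1j * p) for m, p in zip(magnitude, phase)]
```

As a sanity check, feeding in the magnitude of a known minimum-phase filter such as 1 / (1 - 0.5 z^-1) should recover that filter's complex response at each bin, up to numerical error.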

12. A computer-readable medium having computer-readable instructions for performing steps to postfilter a synthesized speech signal using the linear predictive coefficients spectrum of the speech signal comprising the steps of:
- computing the tilt of the linear predictive coefficients spectrum;
  
  compensating the linear predictive coefficients spectrum using the computed tilt;
  
  generating a postfilter by executing a non-linear transformation of the compensated linear predictive coefficients spectrum in the frequency domain and executing an anti-aliasing procedure in the time domain; and
  
  applying the generated postfilter to the synthesized speech signal in the frequency domain.
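
One plausible reading of the time-domain anti-aliasing procedure in claim 12 is sketched below. Multiplying by gains in the DFT domain corresponds to circular convolution, so a gain vector whose impulse response is too long wraps around within the frame. The sketch therefore transforms the gains to the time domain, truncates the impulse response, and transforms back. Plain truncation, the parameter `max_len`, and the names `dft`, `idft` and `anti_alias` are assumptions; the claim itself does not say how the procedure is carried out, and a tapered window could equally be used.

```python
import cmath
import math


def dft(x, sign=-1):
    """Naive DFT; sign=+1 gives the unnormalized inverse transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(x):
    """Inverse DFT with 1/N normalization."""
    n = len(x)
    return [v / n for v in dft(x, sign=+1)]


def anti_alias(gains, max_len):
    """Limit the impulse response of the gains to max_len samples."""
    # Impulse response of the postfilter; the gains are assumed to be
    # conjugate-symmetric, so only the real part is kept.
    impulse = [v.real for v in idft(gains)]
    # Zero everything beyond max_len (assumed: rectangular truncation).
    truncated = [impulse[t] if t < max_len else 0.0 for t in range(len(impulse))]
    # Back to the frequency domain.
    return dft(truncated)
```

If the gains have `n` bins and the speech frame has `m` samples, choosing `max_len <= n - m + 1` keeps the circular convolution equal to the linear one.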

13. An apparatus for postfiltering a speech signal using a plurality of linear predictive coefficients of the speech signal for enhancing human perceptual quality of the speech signal, the apparatus comprising:
- a Fourier transformation module operable for conducting a Fourier transformation;
  
  an inverse Fourier transformation module operable for conducting inverse Fourier transformation; and
  
  a formant filter comprising formant filter gains, wherein the gains are calculated in the frequency domain by performing a non-linear transformation of the linear predictive coefficients;
  
  wherein the formant filter further comprises:
  
  a linear predictive coefficients tilt computation module for computing the tilt of the linear predictive coefficients spectrum;
  
  a linear predictive coefficients tilt compensation module for compensating the linear predictive coefficients according to the computed tilt of the linear predictive coefficients spectrum;
  
  a formant gain calculation module for calculating formant filter gains in the frequency domain by performing a non-linear transformation of the linear predictive coefficients after tilt compensation, wherein the gains include a magnitude and phase response; and
  
  a gain application module for applying the formant filter gains to a speech signal by multiplying the gains and the speech signal in the frequency domain.
- - 14. The apparatus of claim 13, wherein the formant gain calculation module further comprises:
    - a linear predictive coefficients representation module for representing the linear predictive coefficients by a time domain vector;
      
      a modeling module for modeling a frequency domain vector according to a predefined model for generating a magnitude, wherein the frequency domain vector is transformed from the time domain vector representing the LPC coefficients;
      
      a linear predictive coefficients non-linear transformation module for performing a non-linear transformation on the magnitude and producing the magnitude of the formant filter gains;
      
      a phase computation module for computing a phase response of the formant filter gains according to the magnitude of the model after non-linear transformation;
      
      a formant filter gain combination module for combining the magnitude and the phase response of the formant filter gain; and
      
      an anti-aliasing module for preventing aliasing caused by application of the formant filter.
  - 15. The apparatus of claim 14, wherein the linear predictive coefficients representation module is adapted for representing the linear predictive coefficients by a zero-padding technique.
  - 16. The apparatus of claim 14, wherein the linear predictive coefficients non-linear transformation module further comprises a scaling function with a scaling factor of between 0 and 1.
  - 17. The apparatus of claim 14, wherein the phase computation module further comprises a Hilbert phase shifter in the time domain.
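
The gain application module of claim 13, together with the Fourier and inverse Fourier transformation modules, could be sketched as below. This is a minimal illustration under stated assumptions: the speech frame is zero-padded to the number of gain bins before transformation, and the names `dft`, `idft` and `apply_gains` are hypothetical. Framing, overlap and windowing of the speech signal are not covered by the claim and are left out.

```python
import cmath
import math


def dft(x, sign=-1):
    """Naive DFT; sign=+1 gives the unnormalized inverse transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(x):
    """Inverse DFT with 1/N normalization."""
    n = len(x)
    return [v / n for v in dft(x, sign=+1)]


def apply_gains(frame, gains):
    """Filter one speech frame by multiplying with the gains per bin."""
    n = len(gains)
    # Zero-pad the frame to the transform length (assumed).
    padded = list(frame) + [0.0] * (n - len(frame))
    # Fourier transformation module.
    spectrum = dft(padded)
    # Gain application: bin-by-bin complex multiplication.
    filtered = [s * g for s, g in zip(spectrum, gains)]
    # Inverse Fourier transformation module; output is real for
    # conjugate-symmetric gains, so the imaginary residue is dropped.
    return [v.real for v in idft(filtered)]


# Gains whose impulse response is [1, 0.5, 0.25], on an 8-point grid.
gains = dft([1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0])
print(apply_gains([1.0, 2.0, 3.0, 4.0], gains))
```

Because the 4-sample frame and the 3-sample impulse response fit within 8 points, the result matches ordinary linear convolution of the two sequences.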

18. An apparatus for use with a postfilter for processing linear predictive coefficients of a signal and providing a frequency domain formant filter gains for a formant filter, the apparatus comprising:
- a linear predictive coefficients tilt computation module for computing the tilt of the linear predictive coefficients;
  
  a linear predictive coefficients tilt compensation module for compensating the linear predictive coefficients spectrum according to the computed tilt of the linear predictive coefficients spectrum; and
  
  a formant filter gain computation module for calculating the frequency domain formant filter gains according to the linear predictive coefficients, wherein the gains include a magnitude and a phase response.
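
The tilt computation and tilt compensation modules of claim 18 are not detailed in the claims, so the sketch below is only one way they might be realized. It assumes the tilt is measured as the first normalized autocorrelation coefficient of the LPC power spectrum, and that compensation subtracts the log magnitude of the matching first-order all-pole filter. These choices, the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the names `lpc_log_spectrum`, `compute_tilt` and `compensate_tilt` are all assumptions.

```python
import cmath
import math


def lpc_log_spectrum(lpc, n_fft):
    """Log magnitude of 1/A at n_fft bins (would fail if A is zero on a bin)."""
    out = []
    for k in range(n_fft):
        w = 2.0 * math.pi * k / n_fft
        a_k = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(lpc))
        out.append(-math.log(abs(a_k)))
    return out


def compute_tilt(log_h):
    """Tilt as r(1)/r(0), computed from the sampled power spectrum."""
    n = len(log_h)
    power = [math.exp(2.0 * v) for v in log_h]
    r0 = sum(power)
    r1 = sum(p * math.cos(2.0 * math.pi * k / n) for k, p in enumerate(power))
    return r1 / r0


def compensate_tilt(log_h, mu):
    """Remove the spectrum of 1 / (1 - mu z^-1) from the log magnitude."""
    n = len(log_h)
    out = []
    for k, v in enumerate(log_h):
        w = 2.0 * math.pi * k / n
        tilt_log = -math.log(abs(1.0 - mu * cmath.exp(-1j * w)))
        out.append(v - tilt_log)
    return out


log_h = lpc_log_spectrum([-0.9], 64)
mu = compute_tilt(log_h)
flat = compensate_tilt(log_h, mu)
```

For a first-order LPC filter the whole spectrum is tilt, so `mu` comes out close to 0.9 and the compensated log spectrum `flat` is close to zero everywhere. For a real speech frame, what remains after compensation is mainly the formant structure that the gain computation then operates on.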

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Khalil, Hosam A.; Wang, Hong; Cuperman, Vladimir; Gersho, Allen
Primary Examiner(s): Lerner, Martin
Application Number: US09/896,062
Publication Number: US 20030009326A1
Time in Patent Office: 1,530 Days
Field of Search: 704/225, 704/203, 704/205, 704/206, 704/209, 704/219, 704/224
US Class Current: 704/219
CPC Class Codes:
G10L 19/26 Pre-filtering or post-filte...
G10L 21/0364 for improving intelligibility
