High frequency enhancement layer coding in wideband speech codec

US 6,615,169 B1
Filed: 10/18/2000
Issued: 09/02/2003
Est. Priority Date: 10/18/2000
Status: Expired due to Term

Abstract

A speech coding method and device for encoding and decoding an input signal and providing synthesized speech, wherein the higher frequency components of the synthesized speech are achieved by high-pass filtering and coloring an artificial signal to provide a processed artificial signal. The processed artificial signal is scaled by a first scaling factor during the active speech periods of the input signal and a second scaling factor during the non-active speech periods, wherein the first scaling factor is characteristic of the higher frequency band of the input signal and the second scaling factor is characteristic of the lower frequency band of the input signal. In particular, the second scaling factor is estimated based on the lower frequency components of the synthesized speech and the coloring of the artificial signal is based on the linear predictive coding coefficients characteristic of the lower frequency of the input signal.
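The decoder-side processing described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the first-order high-pass filter (a stand-in for a proper band-limiting filter), the uniform random noise source, and the example gains are all hypothetical.

```python
import random

def color_with_lpc(noise, lpc):
    """All-pole synthesis filter 1/A(z), with A(z) = 1 + sum_k lpc[k] * z^-(k+1)."""
    out = []
    for n, x in enumerate(noise):
        acc = x
        for k, a in enumerate(lpc):
            if n - k - 1 >= 0:
                acc -= a * out[n - k - 1]
        out.append(acc)
    return out

def high_pass(signal, alpha=0.9):
    """First-order high-pass stand-in: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in signal:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

def synthesize_high_band(lpc, active, gain_active, gain_inactive, n=160, seed=0):
    """Color and high-pass filter an artificial signal, then scale it per period type."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # artificial signal (assumed uniform noise)
    processed = high_pass(color_with_lpc(noise, lpc))
    # First scaling factor in active speech, second scaling factor otherwise.
    gain = gain_active if active else gain_inactive
    return [gain * s for s in processed]
```

The only difference between the active and non-active paths in this sketch is which gain is applied; the coloring and filtering of the artificial signal are shared.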

25 Claims

1. A method of speech coding for encoding and decoding an input signal having active speech periods and non-active speech periods, and for providing a synthesized speech signal having higher frequency components and lower frequency components, wherein the input signal is divided into a higher frequency band and lower frequency band in encoding and speech synthesizing processes, and wherein speech related parameters characteristic of the lower frequency band are used to process an artificial signal for providing the higher frequency components of the synthesized speech, said method comprising the steps of:
- scaling the processed artificial signal with a first scaling factor during the active speech periods, and scaling the processed artificial signal with a second scaling factor during the non-active speech periods, wherein the first scaling factor is characteristic of the higher frequency band of the input signal, and the second scaling factor is characteristic of the lower frequency band of the input signal.
  - 2. The method of claim 1, wherein the processed artificial signal is high-pass filtered for providing a filtered signal in a frequency range characteristic of the higher frequency components of the synthesized speech.
  - 3. The method of claim 2, wherein the frequency range is in the 6.0-7.0 kHz range.
  - 4. The method of claim 1, wherein the input signal is high-pass filtered for providing a filtered signal in a frequency range characteristic of the higher frequency components of the synthesized speech, and wherein the first scaling factor is estimated from the filtered signal.
  - 5. The method of claim 4, wherein the non-active speech periods include speech hangover periods and comfort noise periods, wherein the second scaling factor for scaling the processed artificial signal in the speech hangover periods is estimated from the filtered signal.
  - 6. The method of claim 5, wherein the lower frequency components of the synthesized speech are reconstructed from the encoded lower frequency band of the input signal, and wherein the second scaling factor for scaling the processed artificial signal in the speech hangover periods is also estimated from the lower frequency components of the synthesized speech.
  - 7. The method of claim 6, wherein the second scaling factor for scaling the processed artificial signal in the comfort noise periods is estimated from the lower frequency components of the synthesized speech.
  - 8. The method of claim 7, wherein the second scaling factor for scaling the processed artificial signal in the comfort noise periods is indicative of a spectral tilt factor determined from the lower frequency components of the synthesized speech.
  - 9. The method of claim 6, further comprising transmitting an encoded bit stream to a receiving end for decoding, wherein the encoded bit stream includes data indicative of the first scaling factor.
  - 10. The method of claim 9, wherein the encoded bit stream includes data indicative of the second scaling factor for scaling the processed artificial signal in the speech hangover periods.
  - 11. The method of claim 9, wherein the second scaling factor for scaling the processed artificial signal is provided in the receiving end.
  - 12. The method of claim 6, wherein the second scaling factor is indicative of a spectral tilt factor determined from the lower frequency components of the synthesized speech.
  - 13. The method of claim 4, wherein the first scaling factor is further estimated from the processed artificial signal.
  - 14. The method of claim 1, further comprising the step of providing voice activity information based on the input signal for monitoring the active-speech periods and the non-active speech periods.
  - 15. The method of claim 1, wherein the speech related parameters include linear predictive coding coefficients characteristic of the lower frequency band of the input signal.
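
How the scaling factor might be chosen across active speech, hangover, and comfort noise periods (claims 5 to 8 and 12) can be sketched as below. The claims give no formulas, so everything numeric here is an assumption: the normalized lag-1 autocorrelation is one common spectral tilt measure, and the tilt-to-gain mapping, the clamp limits, the equal-weight hangover blend, and the period labels are hypothetical.

```python
def spectral_tilt(low_band):
    """Normalized lag-1 autocorrelation r(1)/r(0); one common tilt measure (assumed)."""
    r0 = sum(x * x for x in low_band)
    if r0 == 0.0:
        return 0.0
    r1 = sum(low_band[n] * low_band[n - 1] for n in range(1, len(low_band)))
    return r1 / r0

def tilt_based_gain(low_band, floor=0.1, ceiling=1.0):
    """Hypothetical mapping: flatter or rising spectra get more high-band energy."""
    return min(ceiling, max(floor, 1.0 - spectral_tilt(low_band)))

def select_scaling_factor(period, transmitted_gain, low_band):
    """period is 'active', 'hangover', or 'comfort_noise' (labels are assumptions)."""
    if period == "active":
        return transmitted_gain  # first scaling factor, from the higher band
    estimated = tilt_based_gain(low_band)
    if period == "hangover":
        # Claims 5-6: hangover uses both the filtered input and the low band.
        return 0.5 * (transmitted_gain + estimated)  # equal-weight blend is assumed
    if period == "comfort_noise":
        return estimated  # claims 7-8: low band of the synthesized speech only
    raise ValueError(f"unknown period: {period}")
```

In this sketch only the comfort noise branch is fully independent of transmitted high-band information, which matches the idea that the second scaling factor can be derived at the receiving end.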

16. A speech signal transmitter and receiver system for encoding and decoding an input signal having active speech periods and non-active speech periods and for providing a synthesized speech signal having higher frequency components and lower frequency components, wherein the input signal is divided into a higher frequency band and a lower frequency band in the encoding and speech synthesizing processes, wherein speech related parameters characteristic of the lower frequency band of the input signal are used to process an artificial signal in the receiver for providing the higher frequency components of the synthesized speech, said system comprising:
- a decoder in the receiver for receiving an encoded bit stream from the transmitter, wherein the encoded bit stream contains the speech related parameters;
- a first means in the transmitter, responsive to the input signal, for providing a first scaling factor for scaling the processed artificial signal during the active periods, and a second means in the receiver, responsive to the encoded bit stream, for providing a second scaling factor for scaling the processed artificial signal during the non-active periods, wherein the first scaling factor is characteristic of the higher frequency band of the input signal and the second scaling factor is characteristic of the lower frequency band of the input signal.
  - 17. The system of claim 16, wherein the first means comprises a filtering means for high pass filtering the input signal and providing a filtered input signal having a frequency range corresponding to the higher frequency components of the synthesized speech, and wherein the first scaling factor is estimated from the filtered input signal.
  - 18. The system of claim 17, wherein the frequency range is in the 6.0-7.0 kHz range.
  - 19. The system of claim 17, further comprising a third means in the transmitter for providing a high-pass filtered random noise in the frequency range corresponding to the higher frequency components of the synthesized signal and for modifying the first scaling factor based on the high-pass filtered random noise.
  - 20. The system of claim 19, further comprising means, responsive to the first scaling factor, for providing an encoded first scaling factor and for including data indicative of the encoded first scaling factor into the encoded bit stream for transmitting.
  - 21. The system of claim 16, further comprising means, responsive to the input signal, for monitoring the active and non-active speech periods.
  - 22. The system of claim 16, further comprising means, responsive to the first scaling factor, for providing an encoded first scaling factor and for including data indicative of the encoded first scaling factor into the encoded bit stream for transmitting.
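
One way the transmitter could estimate the first scaling factor (claims 17 to 19) is to compare the energy of the input in the 6.0-7.0 kHz range with the energy of the random noise in the same range. The sketch below makes several assumptions that the claims do not state: a 16 kHz sampling rate, a direct DFT over the band in place of a high-pass filter, and a square-root energy ratio as the gain. The function names are hypothetical.

```python
import cmath
import math

def band_energy(frame, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for bins whose frequency lies in [f_lo, f_hi]."""
    n = len(frame)
    energy = 0.0
    for k in range(n // 2 + 1):
        if f_lo <= k * fs / n <= f_hi:
            bin_value = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                            for i, x in enumerate(frame))
            energy += abs(bin_value) ** 2
    return energy

def first_scaling_factor(input_frame, noise_frame, fs=16000, f_lo=6000.0, f_hi=7000.0):
    """Gain that would match the noise's band energy to the input's band energy."""
    e_in = band_energy(input_frame, fs, f_lo, f_hi)
    e_noise = band_energy(noise_frame, fs, f_lo, f_hi)
    return math.sqrt(e_in / e_noise) if e_noise > 0.0 else 0.0
```

Normalizing by the noise energy means the decoder can apply the factor directly to its own artificial signal, provided that signal has comparable statistics to the noise used at the transmitter.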

23. An encoder for encoding an input signal having active speech periods and non-active speech periods and the input signal is divided into a higher frequency band and a lower frequency band, and for providing an encoded bit stream containing speech related parameters characteristic of the lower frequency band of the input signal so as to allow a decoder to use the speech related parameters to process an artificial signal for providing the high frequency components of the synthesized speech, and wherein a scaling factor based on the lower frequency band of the input signal is used to scale the processed artificial signal during the non-active speech periods, said encoder comprising:
- means, responsive to the input signal, for high-pass filtering the input signal in a frequency range corresponding to the higher frequency components of the synthesized speech, and for providing a further scaling factor based on the high-pass filtered input signal; and
- means, responsive to the further scaling factor, for providing an encoded signal indicative of the first scaling factor into the encoded bit stream, so as to allow the decoder to receive the encoded signal and use the further scaling factor to scale the processed artificial signal during the active-speech periods.

24. A mobile station, which is arranged to transmit an encoded bit stream to a decoder for providing synthesized speech having higher frequency components and lower frequency components, wherein the encoded bit stream includes speech data indicative of an input signal having active speech periods and non-active periods, and the input signal is divided into a higher frequency band and lower frequency band, wherein the speech data includes speech related parameters characteristic of the lower frequency band of the input signal so as to allow the decoder to provide the lower frequency components of the synthesized speech based on the speech related parameters, and to color an artificial signal based on the speech related parameters and to scale the colored artificial signal with a scaling factor, based on the lower frequency components of the synthesized speech, for providing the high frequency components of the synthesized speech during the non-active speech periods, said mobile station comprising:
- a filter, responsive to the input signal, for high-pass filtering the input signal in a frequency range corresponding to the higher frequency components of the synthesized speech, and for providing a further scaling factor based on the high-pass filtered input signal; and
- a quantization module, responsive to the scaling factor and the further scaling factor, for providing an encoded signal indicative of the further scaling factor in the encoded bit stream, so as to allow the decoder to scale the colored artificial signal during the active-speech period based on the further scaling factor.
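
The quantization module of claim 24 could, for illustration, be a uniform scalar quantizer operating on the gain in decibels. The claim does not specify a bit allocation, range, or quantizer type, so the 4-bit index, the -30 dB to 15 dB range, and the function names below are all assumptions.

```python
import math

def encode_gain(gain, bits=4, min_db=-30.0, max_db=15.0):
    """Map a linear gain to an integer index on a uniform dB grid (assumed design)."""
    levels = (1 << bits) - 1
    gain_db = 20.0 * math.log10(max(gain, 1e-6))  # guard against log of zero
    clamped = min(max_db, max(min_db, gain_db))
    return round((clamped - min_db) / (max_db - min_db) * levels)

def decode_gain(index, bits=4, min_db=-30.0, max_db=15.0):
    """Recover the linear gain represented by a quantizer index."""
    levels = (1 << bits) - 1
    gain_db = min_db + index / levels * (max_db - min_db)
    return 10.0 ** (gain_db / 20.0)
```

With these assumed settings the grid step is 3 dB, so the round-trip error for gains inside the range stays within 1.5 dB.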

25. An element of a telecommunication network, which is arranged to receive an encoded bit stream containing speech data indicative of an input signal from a mobile station for providing synthesized speech having higher frequency components and lower frequency components, wherein the input signal has active speech periods and non-active periods, and the input signal is divided into a higher frequency band and lower frequency band, wherein the speech data includes speech related parameters characteristic of the lower frequency band of the input signal, said element comprising:
- a first mechanism, responsive to the speech data, for providing the lower frequency components of the synthesized speech based on the speech related parameters, and for providing a first signal indicative of the lower frequency components of the synthesized speech;
- a second mechanism, responsive to the speech data, for synthesis and high-pass filtering an artificial signal for providing a second signal indicative of the synthesis and high-pass filtered artificial signal;
- a third mechanism, responsive to the first signal, for providing a first scaling factor based on the lower frequency components of the synthesized speech;
- a fourth mechanism, responsive to the encoded bit stream, for providing a second scaling factor based on gain parameters characteristic of the higher frequency band of the input signal, wherein the gain parameters are included in the encoded bit stream; and
- a fifth mechanism, responsive to the second signal, for scaling the synthesis and high-pass filtered artificial signal with the first and second scaling factors during non-active speech periods and active speech periods, respectively.

Current Assignee: Nokia Technologies Oy (Nokia Corporation)
Original Assignee: Nokia Corporation
Inventors: Ojala, Pasi; Vainio, Janne; Mikkola, Hannu; Rotola-Pukkila, Jani
Primary Examiner(s): Knepper, David D.
Application Number: US09/691,440
Time in Patent Office: 1,049 Days
Field of Search: 704/200, 704/200.1, 704/205-210, 704/219-230
US Class Current: 704/205
CPC Class Codes:
G10L 19/012   Comfort noise or silence co...
G10L 21/0364   for improving intelligibility
G10L 25/78   Detection of presence or ab...
