Method and apparatus for artificial bandwidth expansion in speech processing
Abstract
A method and device for improving the quality of speech signals transmitted using an audio bandwidth between 300 Hz and 3.4 kHz. After the received speech signal is divided into frames, zeros are inserted between samples to double the sampling frequency, which creates aliased (mirror-image) frequency components above the original band. The level of these aliased frequency components is adjusted using an adaptive algorithm based on the classification of the speech frame. Sound can be classified into sibilants and non-sibilants, and a non-sibilant sound can be further classified into a voiced sound and a stop consonant. The adjustment is based on parameters, such as the number of zero-crossings and energy distribution, computed from the spectrum of the up-sampled speech signal between 300 Hz and 3.4 kHz. A new sound with a bandwidth between 300 Hz and 7.7 kHz is obtained by inverse Fourier transforming the spectrum of the adjusted, up-sampled sound.
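The zero-insertion step can be illustrated with a short sketch. It is not taken from the patent: the 8 kHz input rate, the 80-sample frame, the 1 kHz test tone, and the helper names are all assumptions chosen for illustration. It shows that inserting a zero after every sample leaves the original tone in place and adds a mirror image in the new upper band, which is the component the abstract says is then adjusted.

```python
import cmath
import math

def upsample_by_zero_insertion(frame):
    """Insert one zero after every sample, doubling the sampling rate."""
    out = []
    for sample in frame:
        out.extend([sample, 0.0])
    return out

def dft_magnitudes(x):
    """Naive O(N^2) DFT magnitudes; adequate for one short frame."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

FS_IN = 8000    # assumed narrowband sampling rate in Hz
N = 80          # assumed frame length (10 ms at 8 kHz)
TONE_HZ = 1000  # assumed test tone inside the 300 Hz to 3.4 kHz band

frame = [math.sin(2 * math.pi * TONE_HZ * t / FS_IN) for t in range(N)]
up = upsample_by_zero_insertion(frame)   # 160 samples at 16 kHz
mags = dft_magnitudes(up)

bin_hz = 2 * FS_IN / len(up)             # 100 Hz per bin
half = mags[: len(up) // 2 + 1]          # bins covering 0 Hz to 8 kHz
strongest = sorted(range(len(half)), key=lambda k: half[k], reverse=True)[:2]
peak_freqs = [k * bin_hz for k in sorted(strongest)]
print(peak_freqs)  # [1000.0, 7000.0]
```

The 1 kHz tone reappears at 7 kHz, mirrored about 4 kHz. By the same mirroring, a 300 Hz to 3.4 kHz band produces images between 4.6 kHz and 7.7 kHz, consistent with the 7.7 kHz upper limit quoted in the abstract.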
74 Citations
32 Claims
1. A method of improving speech in a plurality of signal segments having speech signals in a time domain, said method characterized by
upsampling the signal segments for providing upsampled segments in the time domain;
converting the upsampled segments into a plurality of transformed segments having speech spectra in a frequency domain;
classifying the speech signals into a plurality of classes based on at least one signal characteristic of the speech signals;
modifying the speech spectra in the frequency domain based on the classes for providing modified transformed segments; and
converting the modified transformed segments into speech data in the time domain.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A network device in a telecommunications network, wherein the network device is capable of
receiving data indicative of speech; and
partitioning the received data into a plurality of signal segments having speech signals in a time domain, said network device characterized by
an upsampling module for upsampling the signal segments for providing upsampled segments in the time domain;
a transform module for converting the upsampled segments into a plurality of transformed segments having speech spectra in a frequency domain;
a classification algorithm for classifying the speech signals into a plurality of classes based on at least one signal characteristic of the speech signals; and
an adjustment algorithm for modifying the speech spectra in the frequency domain based on the classes for providing modified transformed segments.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
25. A sound classification algorithm for use in a speech decoder, wherein speech data in the speech decoder is partitioned into a plurality of signal segments having speech signals in a time domain and each signal segment includes a number of signal samples, and wherein the speech signals include a time waveform having a plurality of crossing points on a time axis, said classification algorithm characterized by
classifying the speech signals into a plurality of classes based on a ratio of the number of crossing points and the number of signal samples in at least one signal segment.
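The ratio named in claim 25 can be sketched as follows. This is an illustration under stated assumptions, not the patented classifier: the 0.3 threshold, the two class labels, and the function names are hypothetical, and the claim itself does not say which classes the ratio separates. The sketch relies only on the general observation that noise-like, high-frequency sounds such as sibilants cross the time axis far more often per sample than low-pitched voiced sounds.

```python
import math

def zero_crossing_ratio(segment):
    """Number of sign changes on the time axis divided by the number of samples."""
    crossings = sum(
        1 for a, b in zip(segment, segment[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / len(segment)

def classify(segment, threshold=0.3):
    """Two-way split on the crossing ratio.

    threshold is a hypothetical value chosen for this sketch,
    not a figure from the patent.
    """
    if zero_crossing_ratio(segment) > threshold:
        return "sibilant"
    return "non-sibilant"

FS = 8000  # assumed sampling rate in Hz
N = 80     # assumed segment length in samples

low_tone = [math.sin(2 * math.pi * 200 * t / FS) for t in range(N)]
high_tone = [math.sin(2 * math.pi * 3000 * t / FS) for t in range(N)]

print(classify(low_tone), classify(high_tone))
```

A pure tone of frequency f sampled at fs crosses zero about 2f/fs times per sample, so the 200 Hz tone gives a ratio near 0.05 and the 3 kHz tone a ratio near 0.75, which fall on opposite sides of the assumed threshold.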
30. A spectral adjustment algorithm for use in a speech decoder capable of
receiving speech data, partitioning speech data into a plurality of signal segments having speech signals in the time domain, upsampling the signal segments for providing upsampled segments, and converting the upsampled segments into a plurality of transformed segments, each having a first speech spectral portion in a first frequency range and a second speech spectral portion in a second frequency range higher than the first frequency range, said adjustment algorithm characterized by
enhancing the second speech spectral portion, if the speech signals are classified as a sibilant class, and
attenuating the second speech spectral portion, if the speech signals are classified as a non-sibilant class.
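The enhance-or-attenuate rule of claim 30 can be sketched on a single transformed segment. Everything numeric here is an assumption made for illustration: the split between the first and second frequency ranges is placed at one quarter of the output sampling rate, the gains 2.0 and 0.1 are hypothetical, and a naive DFT stands in for whatever transform a real decoder would use.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT or inverse DFT; adequate for a short demo segment."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def adjust_upper_band(spectrum, sound_class, boost=2.0, cut=0.1):
    """Scale the bins above fs/4 (the assumed second frequency range).

    boost and cut are hypothetical gains. The scaled range is symmetric in
    k and n - k, so a real input still gives a real signal after inversion.
    """
    n = len(spectrum)
    gain = boost if sound_class == "sibilant" else cut
    lo, hi = n // 4, n - n // 4
    return [v * gain if lo < k < hi else v for k, v in enumerate(spectrum)]

# One cycle of a sine in 8 samples, upsampled to 16 samples by zero insertion.
narrow = [math.sin(2 * math.pi * t / 8) for t in range(8)]
up = [v for s in narrow for v in (s, 0.0)]

spec = dft(up)
attenuated = adjust_upper_band(spec, "non-sibilant")
enhanced = adjust_upper_band(spec, "sibilant")

wideband = [v.real for v in dft(enhanced, inverse=True)]  # back to the time domain
print(round(abs(spec[7]), 3), round(abs(attenuated[7]), 3), round(abs(enhanced[7]), 3))
```

Bin 7 holds the mirror image created by the zero insertion; bin 1, in the lower range, is left unchanged in both cases. Scaling a range that is symmetric about the Nyquist bin keeps the spectrum conjugate-symmetric, which is why the inverse transform returns real samples.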
Specification