Method and apparatus in a telecommunications system
Abstract
Audio artifacts caused by overrun or underrun in a playout buffer, which arise when the sampling rates at the sending and receiving sides differ, are reduced. An LPC-residual is modified on a sample-by-sample basis. The LPC-residual block, which includes N samples, is converted to a block comprising N+1 or N−1 samples. A sample rate controller decides whether samples should be added to or removed from the LPC-residual. The exact position at which to add or remove samples is either chosen arbitrarily or found by searching for low-energy segments in the LPC-residual. A speech synthesiser module then reproduces the speech. By using the proposed sample rate conversion method, the playout buffer can be continuously controlled. Furthermore, since the method works on a sample-by-sample basis, the buffer can be kept to a minimum and hence no extra delay is introduced.
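The low-energy search described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the window length of 4 samples, and the choice to insert a zero-valued sample are all assumptions made for illustration.

```python
from typing import List


def lowest_energy_position(residual: List[float], window: int = 4) -> int:
    """Return the centre index of the window with the smallest energy.

    The window length is a hypothetical tuning parameter.
    """
    if len(residual) < window:
        raise ValueError("frame is shorter than the search window")
    best_start, best_energy = 0, float("inf")
    for start in range(len(residual) - window + 1):
        energy = sum(x * x for x in residual[start:start + window])
        if energy < best_energy:
            best_start, best_energy = start, energy
    return best_start + window // 2


def resize_residual(residual: List[float], delta: int, window: int = 4) -> List[float]:
    """Return a copy of the residual with one sample added or removed.

    delta = +1 gives N+1 samples, delta = -1 gives N-1 samples,
    delta = 0 returns an unchanged copy.
    """
    out = list(residual)
    if delta == 0:
        return out
    pos = lowest_energy_position(residual, window)
    if delta > 0:
        out.insert(pos, 0.0)  # assumption: the added sample is zero-valued
    else:
        del out[pos]
    return out
```

Placing the change in a low-energy stretch of the residual is intended to keep it away from the pitch pulses, where an extra or missing sample would be more audible.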
43 Claims
1. A method of improving speech quality in a communication system comprising a first terminal unit (TRX1), which transmits speech signals having a first sampling frequency (F1) and a second terminal (TRX2), which receives the speech signals, the method performed at the second terminal and comprising:
receiving said speech signals;
decoding the received speech frame;
buffering said decoded speech frame in a playout buffer of said second terminal (TRX2);
performing a dynamic sample rate conversion of said decoded speech frame comprising N samples on a sample by sample basis, said dynamic sample rate conversion comprising:
creating a first LPC-residual excitation frame comprising N samples derived from said decoded speech frame;
calculating whether a sample should be either added or removed from said first LPC-residual excitation frame;
selecting, in response to a determination that said calculating so demands, the position where in said first LPC-residual excitation frame to add or remove a sample;
generating a second modified LPC-residual excitation frame comprising at least one of N−1 and N+1 samples, in response to a determination that said calculating so demands; and
synthesizing, in response to a determination that said calculating so demands, a second speech frame from said second modified LPC-residual excitation frame; and
playing out, in response to a determination that said calculating so demands, said second speech frame from said play out buffer.
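The conversion steps of claim 1 (residual creation, modification, synthesis) can be sketched end to end as follows. This is a hedged illustration only: the claim does not say how the LPC coefficients are obtained, so the autocorrelation method with the Levinson-Durbin recursion, the predictor order of 8, the zero-valued inserted sample, and every function name below are assumptions.

```python
from typing import List


def lpc_coefficients(frame: List[float], order: int) -> List[float]:
    """Predictor coefficients a[1..order] via Levinson-Durbin (assumed method)."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    if err == 0.0:
        return a[1:]  # silent frame: no prediction
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:
            break  # frame is already perfectly predicted
    return a[1:]


def lpc_residual(frame: List[float], coeffs: List[float]) -> List[float]:
    """Analysis filter: e[n] = s[n] - sum_k a[k] * s[n-k], zero history."""
    out = []
    for n, s in enumerate(frame):
        pred = sum(c * frame[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(s - pred)
    return out


def lpc_synthesize(residual: List[float], coeffs: List[float]) -> List[float]:
    """Synthesis filter: s[n] = e[n] + sum_k a[k] * s[n-k], zero history."""
    out: List[float] = []
    for n, e in enumerate(residual):
        pred = sum(c * out[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(e + pred)
    return out


def convert_frame(frame: List[float], delta: int, position: int,
                  order: int = 8) -> List[float]:
    """Turn an N-sample frame into N+1, N-1 or N samples via its residual."""
    coeffs = lpc_coefficients(frame, order)
    residual = lpc_residual(frame, coeffs)
    if delta > 0:
        residual.insert(position, 0.0)  # assumption: zero-valued new sample
    elif delta < 0:
        del residual[position]
    return lpc_synthesize(residual, coeffs)
```

With `delta = 0` the sketch reproduces the input frame up to floating-point rounding, because the synthesis filter is the exact inverse of the analysis filter. A real decoder would also carry filter state across frame boundaries, which this sketch omits.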
23. An apparatus for improving speech quality in a communication system comprising a first terminal unit (TRX1) which transmits speech signals having a first sampling frequency (F1) and a second terminal unit (TRX2), which receives said speech signals, said apparatus comprising:
means for receiving said speech signals;
means for decoding the received speech frame;
means for buffering said decoded speech frame in a playout buffer of said second terminal (TRX2);
means for performing a dynamic sample rate conversion of said decoded speech frame comprising N samples on a sample by sample basis, wherein said means for performing said dynamic sample rate conversion comprises:
means for creating a first LPC-residual excitation frame comprising N samples derived from said decoded speech frame;
means for calculating whether a sample should be added or removed from said first LPC-residual excitation frame;
means for selecting, in response to a determination that said calculating so demands, the position where in said first LPC-residual excitation frame to add or remove a sample;
means for generating a second modified LPC-residual excitation frame comprising at least one of N−1 and N+1 samples in response to a determination that said calculating so demands; and
means for synthesizing a second speech frame from said second modified LPC-residual excitation frame in response to a determination that said calculating so demands; and
means for playing out said second speech frame from said play out buffer in response to a determination that said calculating so demands.
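The claims require a calculation of whether a sample should be added or removed, but the text above does not state how that calculation is made. One plausible rule, shown here purely as an assumption, compares the playout buffer fill level with a target level. The function name, the parameters and the threshold values in the usage lines are hypothetical.

```python
def sample_adjustment(buffered_samples: int, target: int, tolerance: int) -> int:
    """Return -1 to remove a sample, +1 to add one, 0 to leave the frame alone.

    Assumed rule: a buffer that is filling up (risk of overrun) calls for
    shorter frames, and a buffer that is draining (risk of underrun) calls
    for longer frames.
    """
    if buffered_samples > target + tolerance:
        return -1
    if buffered_samples < target - tolerance:
        return 1
    return 0


# Hypothetical usage: target of 320 buffered samples, tolerance of 40.
decisions = [sample_adjustment(level, 320, 40) for level in (500, 330, 200)]
```

Because each decision changes the frame length by at most one sample, the correction is gradual, in line with the abstract's statement that the playout buffer can be continuously controlled.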
Specification