Packet Loss Concealment for Speech Coding
Abstract
A speech coding method reduces error propagation due to voice packet loss by limiting or reducing a pitch gain only for the first subframe or the first two subframes within a speech frame. The method is used for a voiced speech class. A pitch cycle length is compared to a subframe size to decide whether to reduce the pitch gain for the first subframe or the first two subframes within the frame. A frame is classified as strongly voiced by checking whether the pitch lags are stable and the pitch gains are high enough within the frame; for a strongly voiced frame, the pitch lags and the pitch gains can be encoded more efficiently than for other speech classes.
17 Claims
1. A method of improving packet loss concealment for speech coding while still profiting from a pitch prediction or Long-Term Prediction (LTP), the method comprising:
classifying a plurality of speech frames into a plurality of classes, and wherein at least for one of the classes, the following steps are included;
comparing a pitch cycle length with a subframe size within a speech frame when the subframe size is fixed or deciding a first subframe size based on a pitch cycle length within a speech frame when the first subframe size is variable;
having an LTP excitation component;
having a second excitation component;
determining an initial energy of the LTP excitation component for every subframe within a frame of speech signal by using a regular method of minimizing a coding error or a weighted coding error at an encoder;
reducing or limiting the energy of the LTP excitation component to be smaller than the initial energy of the LTP excitation component for the first subframe or the first two subframes within the frame based at least in part on the pitch cycle length compared to the subframe size;
keeping the energy of the LTP excitation component to be equal to the initial energy of the LTP excitation component for any other subframe rather than the first subframe or the first two subframes within the frame;
encoding the energy of the LTP excitation component for every subframe of the frame at the encoder; and
forming an excitation by including the LTP excitation component and the second excitation component.
Dependent claims: 2, 3, 4, 5, 6, 7
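The gain-limiting steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the cap value `gain_cap`, and the rule "limit one subframe when the pitch lag fits inside a subframe, otherwise two" are assumptions made for illustration, since the claim only requires that the decision depend at least in part on the pitch cycle length compared with the subframe size. The LTP energy is represented here by the pitch gain applied to the LTP excitation component.

```python
from typing import List


def limit_first_subframe_gains(
    pitch_gains: List[float],
    pitch_lag: int,
    subframe_size: int,
    gain_cap: float = 0.5,  # hypothetical cap, not taken from the source
) -> List[float]:
    """Cap the pitch gain of the leading subframe(s); keep the rest unchanged."""
    if not pitch_gains:
        return []
    # Assumed rule: a pitch cycle longer than one subframe means the second
    # subframe still reads LTP memory from the previous frame, so cap both.
    n_limited = 1 if pitch_lag <= subframe_size else 2
    n_limited = min(n_limited, len(pitch_gains))
    limited = list(pitch_gains)
    for i in range(n_limited):
        limited[i] = min(limited[i], gain_cap)
    return limited


def form_excitation(
    ltp: List[float],
    second: List[float],
    pitch_gain: float,
    second_gain: float,
) -> List[float]:
    """Combine the LTP component and the second component sample by sample."""
    return [pitch_gain * a + second_gain * b for a, b in zip(ltp, second)]


if __name__ == "__main__":
    initial = [0.9, 0.8, 0.7, 0.95]  # gains from the usual error minimization
    print(limit_first_subframe_gains(initial, pitch_lag=30, subframe_size=64))
    print(limit_first_subframe_gains(initial, pitch_lag=100, subframe_size=64))
```

With a lag of 30 samples and 64-sample subframes only the first gain is capped; with a lag of 100 samples the first two are capped. The remaining subframes keep their initial gains, so the frame still benefits from pitch prediction once the excitation no longer depends directly on the previous frame.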
8. A method of efficiently encoding a voiced frame, the method comprising:
classifying a plurality of speech frames into a plurality of classes, and wherein at least for one of the classes, the following steps are included;
having a Long-Term Prediction (LTP) excitation component;
having a second excitation component;
encoding an energy of the LTP excitation component by encoding a pitch gain;
checking whether a pitch track or pitch lags within the voiced frame are stable from one subframe to a next subframe;
checking whether the voiced frame is strongly voiced by checking whether pitch gains within the voiced frame are high;
encoding the pitch lags or the pitch gains efficiently by a differential coding from one subframe to a next subframe when the voiced frame is strongly voiced and the pitch lags are stable; and
forming an excitation by including the LTP excitation component and the second excitation component.
Dependent claims: 9, 10
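The stability check and differential coding of claim 8 can be sketched in the same way. The thresholds `max_lag_step` and `min_gain`, the function names, and the `("differential", ...)` / `("absolute", ...)` tagging are hypothetical choices for illustration; the claim does not specify how "stable" or "high" are measured, and a real codec would quantize the values into bit fields rather than return Python lists.

```python
from typing import List, Tuple


def is_stable_strongly_voiced(
    lags: List[int],
    gains: List[float],
    max_lag_step: int = 2,   # hypothetical stability threshold
    min_gain: float = 0.7,   # hypothetical "high gain" threshold
) -> bool:
    """True when lags change little between subframes and all gains are high."""
    lags_stable = all(abs(b - a) <= max_lag_step for a, b in zip(lags, lags[1:]))
    gains_high = all(g >= min_gain for g in gains)
    return lags_stable and gains_high


def encode_lags(lags: List[int], gains: List[float]) -> Tuple[str, List[int]]:
    """Send the first lag in full and later lags as small deltas when allowed."""
    if lags and is_stable_strongly_voiced(lags, gains):
        deltas = [b - a for a, b in zip(lags, lags[1:])]
        return ("differential", [lags[0]] + deltas)
    return ("absolute", list(lags))


def decode_lags(mode: str, values: List[int]) -> List[int]:
    """Invert encode_lags by accumulating deltas in differential mode."""
    if mode == "absolute" or not values:
        return list(values)
    lags = [values[0]]
    for delta in values[1:]:
        lags.append(lags[-1] + delta)
    return lags


if __name__ == "__main__":
    mode, payload = encode_lags([57, 58, 58, 59], [0.9, 0.85, 0.92, 0.88])
    print(mode, payload, decode_lags(mode, payload))
```

The saving comes from the deltas spanning a much smaller range than the lags themselves, so they need fewer bits per subframe. A frame that fails either check falls back to coding each lag independently.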
11. A non-transitory computer-readable medium having computer implementable instructions stored thereon for execution by a processor, wherein the instructions are executed to implement a method of improving packet loss concealment for speech coding while still profiting from a pitch prediction or Long-Term Prediction (LTP), the method comprising:
classifying a plurality of speech frames into a plurality of classes, and wherein at least for one of the classes, the following steps are included;
comparing a pitch cycle length with a subframe size within a speech frame when the subframe size is fixed or deciding a first subframe size based on a pitch cycle length within a speech frame when the first subframe size is variable;
having an LTP excitation component;
having a second excitation component;
determining an initial energy of the LTP excitation component for every subframe within a frame of speech signal by using a regular method of minimizing a coding error or a weighted coding error at an encoder;
reducing or limiting the energy of the LTP excitation component to be smaller than the initial energy of the LTP excitation component for the first subframe or the first two subframes within the frame based at least in part on the pitch cycle length compared to the subframe size;
keeping the energy of the LTP excitation component to be equal to the initial energy of the LTP excitation component for any other subframe rather than the first subframe or the first two subframes within the frame;
encoding the energy of the LTP excitation component for every subframe of the frame at the encoder; and
forming an excitation by including the LTP excitation component and the second excitation component.
Dependent claims: 12, 13, 14, 15, 16, 17
Specification