Method and device for speech signal pitch period estimation and classification in digital speech coders
First Claim
Patent Images
1. A method of speech signal coding, comprising the steps of:
- (a) dividing a speech signal to be coded into digital sample frames each containing the same number of samples;
(b) subjecting the samples of each frame to a predictive analysis for extracting from said signal parameters representative of long-term and short-term spectral characteristics and comprising at least a long-term analysis delay d, corresponding to a pitch period, and a long-term prediction coefficient b and gain G, and to a classification which indicates whether a respective frame corresponds to an active or inactive speech signal segment and for an active signal segment, whether the segment corresponds to a voiced or an unvoiced sound, a segment being considered as voiced if a respective prediction coefficient and gain are both greater than or equal to respective thresholds;
(c) providing information on said parameters to coding units for insertion into a coded signal, together with signals indicative of the classification for selecting in said coding units different coding methods according to characteristics of respective speech segments; and
(d) during said long-term analysis, estimating said delay is as a maximum of covariance function, weighted with a weighting function which reduces a probability that the period computed is a multiple of an actual period, inside a window with a length not less than a maximum value admitted for the delay, said thresholds for prediction coefficient and gain being thresholds which are adapted at each frame, in order to follow a background noise but not of the speech signal, adaptation of said thresholds being enabled only in active speech signal segments.
3 Assignments
0 Petitions
Accused Products
Abstract
A method and a device for speech signal digital coding are provided where at each frame there is carried out a long-term analysis for estimating pitch period d and a long- term prediction coefficient b and gain G, and an a-priori classification of the signal as active/inactive and, for active signal, as voiced/unvoiced. Period estimation circuits (LT1) compute such period on the basis of a suitably weighted covariance function, and classification circuits (RV) distinguish voiced signals from unvoiced signals by comparing long-term prediction coefficient and gain with frame-by-frame variable thresholds.
-
Citations
13 Claims
-
1. A method of speech signal coding, comprising the steps of:
-
(a) dividing a speech signal to be coded into digital sample frames each containing the same number of samples; (b) subjecting the samples of each frame to a predictive analysis for extracting from said signal parameters representative of long-term and short-term spectral characteristics and comprising at least a long-term analysis delay d, corresponding to a pitch period, and a long-term prediction coefficient b and gain G, and to a classification which indicates whether a respective frame corresponds to an active or inactive speech signal segment and for an active signal segment, whether the segment corresponds to a voiced or an unvoiced sound, a segment being considered as voiced if a respective prediction coefficient and gain are both greater than or equal to respective thresholds; (c) providing information on said parameters to coding units for insertion into a coded signal, together with signals indicative of the classification for selecting in said coding units different coding methods according to characteristics of respective speech segments; and (d) during said long-term analysis, estimating said delay is as a maximum of covariance function, weighted with a weighting function which reduces a probability that the period computed is a multiple of an actual period, inside a window with a length not less than a maximum value admitted for the delay, said thresholds for prediction coefficient and gain being thresholds which are adapted at each frame, in order to follow a background noise but not of the speech signal, adaptation of said thresholds being enabled only in active speech signal segments. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A device for speech signal digital coding, comprising:
-
means (TR) for dividing a sequence of speech signal digital samples into frames made up of a preset number of samples; means for speech signal predictive analysis (AS), comprising circuits (ST) for generating at each frame, parameters representative of short-term spectral characteristics and a residual signal of short-term prediction, and circuits (LT1, LT2) which obtain from the residual signal parameters representative of long-term spectral characteristics comprising a long-term analysis delay or pitch period d, and a long-term prediction coefficient b and a gain G; means for a-priori classification (CL) for recognizing whether a frame corresponds to an active speech period or to a silence period and whether an active speech period corresponds to a voiced or an unvoiced sound, the classification means (CL) comprising circuits (RA, RV) which generate a first and a second flag (A, V) for respectively signalling an active speech period and a voiced sound, and the circuits generating the second flag comprising means (CM1, CM2) for comparing the prediction coefficient and gain values with respective thresholds and emitting this flag when said values are both greater than the thresholds; and speech coding units (CV), which generate a coded signal by using at least some of the parameters generated by the predictive analysis means (AS), and are driven by said flags (A, V) in order to insert into the coded signal different information according to the nature of the speech signal in the frame, the circuits (LT1) for delay estimation computing said delay by maximizing a covariance function of a residual signal, computed inside a sample window with a length not lower than a maximum admissible value for the delay itself and weighted with a weighting function such as to reduce the probability that the maximum value computed is a multiple of the actual delay, and said comparison means (CM1, CM2) in the circuits (RV) generating the second flag (V) carrying out the comparison frame by frame with variable thresholds and being provided with means (CS1, CS2) for threshold generation, the comparison and threshold generation means being enabled only in the presence of the first flag. - View Dependent Claims (10, 11, 12, 13)
-
Specification