Method of recognizing speech pauses
First Claim
1. A method of detecting speech pauses from the short-time spectrum of a speech signal which may be disturbed by noise signals superposed on it, characterized in that at each clock instant τ(n) of a central clock
(a) a set W(n) consisting of M Fourier coefficients Y1(n), Y2(n) . . . YM(n) of the short-time spectrum of the disturbed speech signal is determined from digital samples of such signal,
(b) from the M Fourier coefficients of the set W(n), and the NM Fourier coefficients of all of the sets W(n-1), W(n-2) . . . W(n-N) of such coefficients at N prior clock instants, the short-time mean value G(n) of all such Fourier coefficients is determined,
(c) the noise signal power P(n) is estimated as a function of an estimate P(n-1) thereof at the preceding clock instant and of the short-time mean value G(n),
(d) a smoothed short-time value GG(n) is determined as a function of the short-time mean value G(n) at clock instant τ(n) and the short-time mean values at a plurality of preceding clock instants,
(e) if the smoothed short-time mean value GG(n) several times in succession falls below a first threshold (S) proportional to the estimated noise signal power P(n), a signal is produced indicating the presence of a speech pause.
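Steps (b) through (e) of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim leaves the exact functions open, so the following are assumptions made only for illustration: G(n) is taken as the mean squared magnitude of the coefficients, P(n) is a slow-rise, fast-fall floor tracker `min(G(n), rise * P(n-1))`, GG(n) is a plain moving average, and the names `PauseDetector`, `n_prior`, `smooth_len`, `rise`, `k` and `run_length` are hypothetical. Step (a), computing the coefficient sets W(n), is assumed to happen upstream.

```python
from collections import deque


class PauseDetector:
    """Sketch of claim steps (b)-(e); every default value is an assumption."""

    def __init__(self, n_prior=2, smooth_len=3, rise=1.02, k=2.0, run_length=3):
        self.sets = deque(maxlen=n_prior + 1)       # powers of W(n) ... W(n-N)
        self.g_history = deque(maxlen=smooth_len)   # recent G values for step (d)
        self.rise = rise              # assumed upward drift of the noise estimate
        self.k = k                    # assumed proportionality factor, S = k * P(n)
        self.run_length = run_length  # assumed meaning of "several times in succession"
        self.noise = None             # P(n-1); unset before the first frame
        self.below = 0                # consecutive frames with GG(n) < S

    def step(self, coeffs):
        """Consume one coefficient set W(n); return True if a pause is signalled."""
        # (b) G(n): mean over all coefficients of W(n) and the N prior sets.
        # Squared magnitude is an assumption; the claim only says "mean value".
        self.sets.append([abs(c) ** 2 for c in coeffs])
        total = sum(sum(s) for s in self.sets)
        count = sum(len(s) for s in self.sets)
        g = total / count

        # (c) P(n) as a function of P(n-1) and G(n): follows G(n) downward
        # immediately, rises by at most a factor `rise` per frame (assumed rule).
        if self.noise is None:
            self.noise = g
        else:
            self.noise = min(g, self.rise * self.noise)

        # (d) GG(n): moving average of G(n) and the preceding values.
        self.g_history.append(g)
        gg = sum(self.g_history) / len(self.g_history)

        # (e) pause if GG(n) stays below S = k * P(n) for run_length frames.
        threshold = self.k * self.noise
        self.below = self.below + 1 if gg < threshold else 0
        return self.below >= self.run_length


# Synthetic input: 5 quiet frames, 10 loud frames, 10 quiet frames (M = 4).
quiet = [1 + 0j] * 4
loud = [10 + 0j] * 4
frames = [quiet] * 5 + [loud] * 10 + [quiet] * 10
detector = PauseDetector()
flags = [detector.step(w) for w in frames]
```

In this sketch the pause flag lags the true start of silence by several frames, because both the averaging windows and the run-length counter must fill with low-power values before the condition in step (e) is met.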
Abstract
A method of recognizing speech pauses in a speech signal even when the signal is disturbed by a slowly varying noise signal superposed thereon. Mean values which are an approximate measure of the average power of successive sections of the disturbed signal are determined from the short-time Fourier coefficients of the disturbed speech signal. The sequential short-time mean values are then smoothed by a linear digital filter or a median filter. An estimate of the noise signal power averaged over a few seconds is also recovered from the sequence of short-time mean values. A speech pause is signified when the smoothed short-time mean value (output of GL) more than once falls to a threshold which is proportional to the estimated noise power (output of PA).
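The two smoothing options named in the abstract, a linear digital filter and a median filter, behave differently when a single short-time mean value is an outlier. The sketch below compares them on made-up data; the moving average stands in for "a linear digital filter" as an assumption, and the helper name `smooth` and the window length are hypothetical.

```python
from collections import deque
from statistics import fmean, median


def smooth(values, window, reducer):
    """Apply `reducer` to a sliding window over the short-time mean values."""
    history = deque(maxlen=window)
    out = []
    for g in values:
        history.append(g)
        out.append(reducer(history))
    return out


# Made-up sequence of short-time mean values with one isolated spike.
g_values = [1.0, 1.0, 50.0, 1.0, 1.0, 1.0]

linear = smooth(g_values, 3, fmean)      # moving average: spike leaks into 3 outputs
robust = smooth(g_values, 3, median)     # median: isolated spike is suppressed
```

With a three-value window the median output never leaves the background level, while the moving average is raised for as long as the spike remains in the window. Which behaviour is preferable depends on how the threshold comparison should react to brief disturbances.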
8 Claims
Specification