Method and apparatus to detect and delimit foreground speech
Abstract
The present invention provides improved foreground-speech signal endpointing by computing a spectral stationarity statistic. A finite state machine uses this statistic to endpoint speech. Endpointing using the spectral stationarity statistic is less susceptible to background noise than endpointing using conventional measures. The present invention uses frame-synchronous quantile estimation to generate a mask signal for signal-to-noise ratio normalization.
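The chain described in the abstract (quantile-based mask, masking, then a stationarity statistic) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the function names `track_quantile`, `mask_channel`, and `windowed_stdev`, the stochastic-approximation update used as a stand-in for frame-synchronous quantile estimation, the choice to mask by clamping each frame to the mask level, and all parameter values.

```python
import statistics
from collections import deque


def track_quantile(values, q=0.2, step=0.05, init=None):
    """Frame-by-frame running estimate of the q-th quantile of `values`.

    Assumed estimator (stochastic approximation), not the patent's method.
    """
    est = values[0] if init is None else init
    out = []
    for v in values:
        # Rise by step*q when the frame is above the estimate and fall by
        # step*(1-q) when it is below; the two balance near the q-quantile.
        if v > est:
            est += step * q
        elif v < est:
            est -= step * (1.0 - q)
        out.append(est)
    return out


def mask_channel(values, mask):
    """Assumed masking rule: clamp each frame up to the mask level."""
    return [max(v, m) for v, m in zip(values, mask)]


def windowed_stdev(values, window=5):
    """Sample standard deviation (n-1 denominator) over a sliding window."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        # A sample standard deviation needs at least two frames.
        out.append(statistics.stdev(buf) if len(buf) >= 2 else 0.0)
    return out


# Hypothetical log-energy track for one channel: noise floor, burst, noise floor.
energy = [1.0] * 10 + [5.0, 6.0, 4.0, 7.0, 5.0] + [1.0] * 10
mask = track_quantile(energy)
stat = windowed_stdev(mask_channel(energy, mask))
```

In this toy input the statistic stays at zero over the stationary noise floor and rises sharply once the burst enters the window, which is the property an endpointer would rely on.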
24 Claims
1. A method for processing data in a voice recognition system capable of receiving foreground speech in the presence of background noise, comprising the steps, performed by a processor, of
extracting a channel signal;
generating a mask signal from the channel signal;
masking the extracted channel signal with the mask signal; and
taking a sample standard deviation of the masked channel signal over a temporal window; and
generating foreground speech endpoints using the sample standard deviation determined during said taking step.
Dependent claims: 2-14
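The final step of claim 1, generating endpoints from the sample standard deviation, can be sketched as a small two-state machine. This is only an illustrative sketch: the states, the hysteresis thresholds `on_thresh` and `off_thresh`, the `min_frames` and `hang_frames` parameters, and the function name `find_endpoints` are all hypothetical and are not drawn from the claims.

```python
def find_endpoints(stat, on_thresh=1.0, off_thresh=0.5,
                   min_frames=3, hang_frames=4):
    """Return (start, end) frame-index pairs, end exclusive.

    `stat` is a per-frame stationarity statistic, for example a windowed
    sample standard deviation. All thresholds are assumed values.
    """
    segments = []
    state = "SILENCE"
    start = 0
    quiet = 0  # consecutive low-statistic frames seen while in SPEECH
    for i, s in enumerate(stat):
        if state == "SILENCE":
            if s >= on_thresh:
                state, start, quiet = "SPEECH", i, 0
        else:  # state == "SPEECH"
            if s < off_thresh:
                quiet += 1
                if quiet >= hang_frames:
                    # Place the endpoint where the quiet run began.
                    end = i - quiet + 1
                    if end - start >= min_frames:
                        segments.append((start, end))
                    state = "SILENCE"
            else:
                quiet = 0
    if state == "SPEECH":
        # Input ended while still in speech; close the open segment.
        end = len(stat) - quiet
        if end - start >= min_frames:
            segments.append((start, end))
    return segments


print(find_endpoints([0.0] * 5 + [2.0] * 6 + [0.0] * 6))  # [(5, 11)]
```

The separate on and off thresholds and the hang time keep a brief dip in the statistic from splitting one utterance, and the minimum length discards isolated blips.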
15. An apparatus in a voice recognition system capable of receiving foreground speech in the presence of background noise, comprising:
means for extracting a channel signal;
means for generating a mask signal from the channel signal;
means for masking the extracted channel signal using the generated mask signal; and
means for taking a sample standard deviation of the masked channel signal over a temporal window, and
means for generating foreground speech endpoints using the sample standard deviation determined by said means for taking.
Dependent claims: 16-21
22. A computer program product comprising:
a computer usable medium having computer readable code embodied therein for processing data in a voice recognition system, the computer usable medium comprising
an extracting module configured to extract a channel energy signal;
a mask generating module configured to generate a mask signal from the channel energy signal;
a masking module configured to mask the extracted channel energy signal with the generated mask signal; and
a standard deviation module configured to take a sample standard deviation of the masked extracted channel energy signal over a temporal window, and
an end point generating module configured to generate foreground speech endpoints using the sample standard deviation determined by said standard deviation module.
Dependent claims: 23, 24
Specification