Method for utilizing validity constraints in a speech endpoint detector
Abstract
A method for utilizing validity constraints in a speech endpoint detector comprises a validity manager that may utilize a pulse width module to validate utterances that include a plurality of energy pulses during a certain time period. The validity manager may also utilize a minimum power module to ensure that speech energy below a pre-determined level is not classified as a valid utterance. In addition, the validity manager may use a duration module to ensure that valid utterances fall within a specified duration. Finally, the validity manager may utilize a short-utterance minimum power module to specifically distinguish an utterance of short duration from background noise based on the energy level of the short utterance.
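As a rough illustration of how the four checks named in the abstract might be combined, the following Python sketch applies a duration check, a minimum power check, a short-utterance minimum power check, and a pulse width check to a list of per-frame energy values. Every name, threshold value, and the frame length are hypothetical choices made for this example, and the pulse width rule used here (the total width of all pulses must reach a minimum) is only one assumed reading of the abstract, not the method disclosed in the specification.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ValidityConfig:
    # Every value below is an illustrative assumption, not taken from the patent.
    frame_ms: float = 10.0              # duration of one energy frame
    pulse_level: float = 2.0            # energy above this counts as a pulse
    min_total_pulse_ms: float = 40.0    # pulses together must be this wide
    min_peak_power: float = 3.0         # minimum power check
    min_duration_ms: float = 50.0       # duration check, lower bound
    max_duration_ms: float = 5000.0     # duration check, upper bound
    short_utterance_ms: float = 200.0   # below this an utterance is "short"
    short_min_peak_power: float = 6.0   # stricter power floor for short ones


def pulse_widths(energies: Sequence[float], level: float) -> List[int]:
    """Return the length, in frames, of each run of frames above `level`."""
    widths: List[int] = []
    run = 0
    for e in energies:
        if e > level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def is_valid_utterance(energies: Sequence[float],
                       cfg: ValidityConfig = ValidityConfig()) -> bool:
    """Apply the four validity checks to the frames between two endpoints."""
    if not energies:
        return False
    duration_ms = len(energies) * cfg.frame_ms
    peak = max(energies)

    # Duration check: reject utterances that are too short or too long.
    if not (cfg.min_duration_ms <= duration_ms <= cfg.max_duration_ms):
        return False
    # Minimum power check: reject low-energy segments.
    if peak < cfg.min_peak_power:
        return False
    # Short-utterance minimum power check: short segments need more energy.
    if duration_ms < cfg.short_utterance_ms and peak < cfg.short_min_peak_power:
        return False
    # Pulse width check (assumed rule): several short pulses may add up.
    total_pulse_ms = sum(pulse_widths(energies, cfg.pulse_level)) * cfg.frame_ms
    return total_pulse_ms >= cfg.min_total_pulse_ms
```

Under these assumed settings, a 60 ms burst with peak energy 5 is rejected by the short-utterance check, while the same burst with peak energy 8 is accepted.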
8 Claims
1. A system for detecting endpoints of an utterance, comprising:
a processor configured to manipulate speech energy corresponding to said utterance;
a filter bank which band-passes said speech energy before providing said speech energy to an endpoint detector that is responsive to said processor,
said endpoint detector analyzing said speech energy in real time by progressively examining frames of said speech energy in sequence to determine threshold values and energy parameters,
said energy parameters being short-term energy parameters corresponding to said frames of said speech energy,
said short-term energy parameters being calculated using a following equation;
where w_i(m) is a respective weighting value, y_i(m) is channel signal energy of a channel m at a frame i, and M is a total number of channels of said filter bank,
said endpoint detector smoothing said short-term energy parameters by using a multiple-point median filter,
said endpoint detector using a starting threshold and said short-term energy parameters to determine a starting point for a reliable island,
said speech energy including at least one reliable island in which said short-term energy parameters are greater than said starting threshold and an ending threshold,
said endpoint detector calculating a background noise value, said background noise value being derived from said short-term energy parameters during a background noise period,
said background noise period ending at least 250 milliseconds ahead of said reliable island and having a normalized deviation that is less than a predetermined value,
said endpoint detector comparing said threshold values with said energy parameters to identify a beginning point and an ending point of said utterance; and
a validity manager, responsive to said processor, for analyzing said speech energy according to selectable criteria to thereby verify said utterance.
Dependent claims: 2, 3, 4, 5
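The equation referred to in claim 1 is not reproduced in this text. From the definitions that follow it (a weighting value, a per-channel signal energy, and M channels), one plausible reading is a weighted sum of the channel energies for each frame. The Python sketch below illustrates the chain of steps under that assumption: per-frame energy as a weighted sum, median smoothing, locating the start of a reliable island, and deriving a background noise value from a stable window that ends at least 250 milliseconds before the island. The function names, the 10 ms frame length, the window size, the use of the window mean as the noise value, and the reading of "normalized deviation" as standard deviation divided by mean are all assumptions made for illustration.

```python
import math
from statistics import mean, median, pstdev
from typing import List, Optional, Sequence


def short_term_energy(weights: Sequence[float],
                      channel_energy: Sequence[float]) -> float:
    """Assumed form: weighted sum of channel energies for one frame."""
    if len(weights) != len(channel_energy):
        raise ValueError("one weight per filter-bank channel is required")
    return sum(w * y for w, y in zip(weights, channel_energy))


def median_smooth(values: Sequence[float], points: int = 5) -> List[float]:
    """Sliding median; the window is truncated at both ends of the sequence."""
    half = points // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def island_start(energy: Sequence[float],
                 start_threshold: float) -> Optional[int]:
    """Index of the first frame whose energy exceeds the starting threshold."""
    for i, e in enumerate(energy):
        if e > start_threshold:
            return i
    return None


def background_noise(energy: Sequence[float], island: int,
                     frame_ms: float = 10.0, gap_ms: float = 250.0,
                     window: int = 10,
                     max_norm_dev: float = 0.2) -> Optional[float]:
    """Mean energy of a window that ends at least `gap_ms` before the island.

    Returns None when there are too few frames before the island or when
    the window's normalized deviation (std / mean) is not below the limit.
    """
    end = island - math.ceil(gap_ms / frame_ms)  # exclusive end index
    start = end - window
    if start < 0:
        return None
    segment = energy[start:end]
    m = mean(segment)
    if m <= 0:
        return None
    if pstdev(segment) / m >= max_norm_dev:
        return None
    return m
```

With 10 ms frames, a reliable island starting at frame 40 leads to a noise window covering frames 5 through 14, which ends more than 250 ms ahead of the island.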
6. A method for detecting endpoints of a spoken utterance, comprising:
analyzing speech energy corresponding to said spoken utterance;
calculating energy parameters in real time, said energy parameters corresponding to frames of said speech energy;
determining a starting threshold corresponding to a reliable island in said speech energy;
locating a starting point of said reliable island by comparing said energy parameters to said starting threshold;
performing a refinement procedure to identify a beginning point for said spoken utterance by calculating a beginning threshold corresponding to said spoken utterance, and comparing said energy parameters to said beginning threshold to locate said beginning point of said spoken utterance, said beginning threshold Tsr being calculated according to a following equation;
Dependent claims: 7, 8
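The steps of claim 6 can be pictured with a short Python sketch. Because the equation for the beginning threshold Tsr is not reproduced in this text, the sketch takes that threshold as an input rather than computing it. The function names are hypothetical, and the backward search from the start of the reliable island is an assumed way of comparing the energy parameters to the beginning threshold, not a procedure stated in the claim.

```python
from typing import Optional, Sequence


def locate_island_start(energy: Sequence[float],
                        start_threshold: float) -> Optional[int]:
    """First frame whose energy parameter exceeds the starting threshold."""
    for i, e in enumerate(energy):
        if e > start_threshold:
            return i
    return None


def refine_beginning(energy: Sequence[float], island_start: int,
                     beginning_threshold: float) -> int:
    """Walk backward from the island start while energy stays above Tsr.

    `beginning_threshold` stands in for Tsr, whose formula is not given here.
    """
    begin = island_start
    while begin > 0 and energy[begin - 1] > beginning_threshold:
        begin -= 1
    return begin


# Usage: the island starts where energy first exceeds 5; the refined
# beginning point is the earliest contiguous frame still above 1.5.
frames = [1, 1, 2, 3, 6, 9, 9, 4, 1]
start = locate_island_start(frames, 5)
beginning = refine_beginning(frames, start, 1.5)
```

The design choice here is that the reliable island anchors the search in a high-energy region, and the lower beginning threshold then recovers the weaker frames that lead into it.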
Specification