Frame comparison method for word recognition in high noise environments
First Claim
1. A method for comparing stored speech recognition templates which are framed into time segments and channelized into at least two channels which are frequency band-limited to an input signal which has been contaminated by high levels of noise and which is framed into a time segment and channelized into at least two channels which are frequency band-limited, comprising the steps of:
determining, from the input signal, a first noise level associated with a first of the at least two channels and a second noise level associated with a second one of the at least two channels;
adding a buffering level to each of said first and second noise levels to create respective first and second buffered noise levels;
determining, from the input signal, a first signal level associated with a first of the at least two channels and a second signal level associated with a second of the at least two channels;
normalizing the level of each said first and second signal levels to create, respectively, normalized first and second signal levels;
normalizing a first channel stored speech recognition template and normalizing a second channel stored speech recognition template to create, respectively, first and second normalized template signal levels,
subtracting said normalized first signal level from said normalized first template signal level to determine a first difference and subtracting said normalized second signal level from said normalized second template signal level to determine a second difference; and
generating a distance measure by at least adding together:
(a) the absolute value of said first difference if said first signal level is greater than said first buffered noise level, or said first difference if said first signal level is less than said first buffered noise level and said first difference is a positive value, or a predetermined nominal differential value if said first signal level is less than said first buffered noise level and said first difference is a negative value; and
(b) the absolute value of said second difference if said second signal level is greater than said second buffered noise level, or said second difference if said second signal level is less than said second buffered noise level and said second difference is a positive value, or a predetermined nominal differential value if said second signal level is less than said second buffered noise level and said second difference is a negative value.
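The steps of claim 1 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the parameter names, and in particular the `normalize` helper (which simply subtracts the mean level across channels) are hypothetical, since the claim does not say how normalization is performed. The claim also leaves two cases unspecified (a signal level exactly equal to the buffered noise level, and a difference of exactly zero); the sketch folds both into the nominal-value branch as an arbitrary choice.

```python
from typing import Sequence


def normalize(levels: Sequence[float]) -> list[float]:
    """Hypothetical normalization: subtract the mean level across channels.

    The claim only requires that levels be normalized; this rule is an assumption.
    """
    mean = sum(levels) / len(levels)
    return [level - mean for level in levels]


def channel_addend(signal_level: float, buffered_noise: float,
                   diff: float, nominal: float) -> float:
    """Select one channel's contribution to the distance measure.

    diff is (normalized template level) - (normalized signal level).
    """
    if signal_level > buffered_noise:
        # Signal stands clear of the noise: use the absolute difference.
        return abs(diff)
    if diff > 0:
        # Signal is masked by noise and the template is higher: use the difference.
        return diff
    # Signal is masked by noise and the difference is negative
    # (zero and equality cases also land here by assumption).
    return nominal


def frame_distance(signal_levels: Sequence[float], noise_levels: Sequence[float],
                   template_levels: Sequence[float],
                   buffer_level: float, nominal: float) -> float:
    """Sum the per-channel addends for one input frame and one template frame."""
    norm_signal = normalize(signal_levels)
    norm_template = normalize(template_levels)
    total = 0.0
    for sig, noise, n_sig, n_tpl in zip(signal_levels, noise_levels,
                                        norm_signal, norm_template):
        total += channel_addend(sig, noise + buffer_level, n_tpl - n_sig, nominal)
    return total
```

For example, with signal levels `[20.0, 5.0]`, noise levels `[8.0, 6.0]`, a buffering level of `3.0`, and a nominal value of `1.0`, the second channel falls below its buffered noise level, so its addend is either the positive difference or the nominal value, depending on the template.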
Abstract
A method and arrangement for a speech recognition system employ channel bank information to represent speech. The method considers background noise included with the speech. The method includes determining three energy levels for each channel: the first representative of background noise energy, the second representative of the input frame energy, and the third representative of the word template frame energy. Values representing energy level differentials are assigned at each channel. If the second energy level is less than the first energy level, then a predetermined constant value is assigned at the particular channel. These values are combined to generate a distance measure depicting the similarity between the two frames.
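The per-channel assignment summarized above can be written compactly, following the more detailed wording of claim 1. The symbols are notation introduced here for illustration and do not appear in the patent: for channel k, x_k is the input signal level, n_k the noise level, b the buffering level, \hat{x}_k and \hat{t}_k the normalized input and template levels, and c the predetermined nominal value. The claim recites "at least" these addends, so the sum should be read as a lower bound on what is combined.

```latex
\[
d_k = \hat{t}_k - \hat{x}_k, \qquad
a_k =
\begin{cases}
|d_k| & \text{if } x_k > n_k + b,\\
d_k   & \text{if } x_k < n_k + b \text{ and } d_k > 0,\\
c     & \text{if } x_k < n_k + b \text{ and } d_k < 0,
\end{cases}
\qquad
D = \sum_{k} a_k .
\]
```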
49 Citations
6 Claims
4. A word recognition detector which compares stored speech recognition templates which are framed into time segments and channelized into at least two channels which are frequency band-limited to an input signal which has been contaminated by high levels of noise and which is framed into a time segment and channelized into at least two channels which are frequency band-limited, the word recognition detector comprising:
means for determining, from the input signal, a first noise level associated with a first of the at least two channels and a second noise level associated with a second one of the at least two channels;
means for adding a buffering level to each of said first and second noise levels to create respectively first and second buffered noise levels;
means for determining, from the input signal, a first signal level associated with a first of the at least two channels and a second signal level associated with a second of the at least two channels;
means for normalizing the level of each said first and second signal levels to create, respectively, normalized first and second signal levels;
means for normalizing a first channel stored recognition template and means for normalizing a second channel stored speech recognition template to create, respectively, first and second normalized template signal levels;
means for subtracting said normalized first signal level from said normalized first template signal level to determine a first difference and means for subtracting said normalized second signal level from said normalized second template signal level to determine a second difference; and
means for generating a distance measure by adding together at least a first and a second addend, further comprising:
(a) means for selecting a first addend as the absolute value of said first difference if said first signal level is greater than said first buffered noise level, or as said first difference if said first signal level is less than said first buffered noise level and if said first difference is a positive value, or as a predetermined nominal differential value if said first signal level is less than said first buffered noise level and if said first difference is a negative value; and
(b) means for selecting a second addend as the absolute value of said second difference if said second signal level is greater than said second buffered noise level, or as said second difference if said second signal level is less than said second buffered noise level and if said second difference is a positive value, or as a predetermined nominal differential value if said second signal level is less than said second buffered noise level and if said second difference is a negative value.
Specification