Method in the recognition of speech and a wireless communication device to be controlled by speech
Abstract
A method for recognizing speech commands includes defining a time window for recognizing a speech command. A first confidence value is determined for a first recognition result. If the first confidence value is greater than or equal to a first threshold value, the first recognition result is selected as the speech command; otherwise, a second recognition stage is performed in which the time window is extended. A second confidence value is determined for the second recognition result. If the second confidence value is greater than or equal to the first threshold value, the command word selected at the second recognition stage is selected as the speech command; otherwise, the first and second recognition results are compared to each other to determine a probability that they are substantially the same. If the probability exceeds a predetermined value, the command word selected at the second stage is selected as the speech command.
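The decision flow summarized in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the names `StageResult`, `recognize_command`, `run_stage`, `same_probability`, `extension`, and `match_probability` are hypothetical, the amount by which the window is extended is an assumed parameter, and returning `None` when no result is accepted is an assumption, since the abstract does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class StageResult:
    """Outcome of one recognition stage (hypothetical container)."""
    word: str                            # command word selected at this stage
    confidence: float                    # confidence value for that selection
    features: Sequence[Sequence[float]]  # feature vectors of the utterance


def recognize_command(
    run_stage: Callable[[float], StageResult],
    same_probability: Callable[[Sequence, Sequence], float],
    threshold: float,
    base_window: float,
    extension: float,
    match_probability: float,
) -> Optional[str]:
    """Two recognition stages followed by a comparison stage."""
    # First stage: recognize within the initial time window.
    first = run_stage(base_window)
    if first.confidence >= threshold:
        return first.word

    # Second stage: the time window is extended (amount is an assumption).
    second = run_stage(base_window + extension)
    if second.confidence >= threshold:
        return second.word

    # Comparison stage: accept the second result only if the two
    # utterances are probably the same.
    if same_probability(first.features, second.features) > match_probability:
        return second.word

    return None  # assumption: nothing is accepted
```

Passing the recognizer and the similarity measure in as callables keeps the sketch independent of any particular acoustic model.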
57 Citations
14 Claims
-
1. A method for recognizing speech commands, in which a group of command words selectable by speech commands are defined, a time window is defined, within which the recognition of the speech command is performed, and a first recognition stage is performed, in which the recognition result of the first recognition stage is selected, characterized in that further in the method:
-
a first confidence value is determined for the recognition result of the first recognition stage,
a first threshold value (Y) is determined,
said first confidence value is compared with said first threshold value (Y),
if said first confidence value is greater than or equal to said first threshold value (Y), the recognition result of the first recognition stage is selected as the recognition result of the speech command,
if said first confidence value is smaller than said first threshold value (Y), a second recognition stage is performed for the speech command, wherein said time window is extended, and a second confidence value is determined for the recognition result of the second recognition stage,
said second confidence value is compared with said threshold value (Y),
if said second confidence value is greater than or equal to said first threshold value (Y), the command word selected at the second stage is selected as the recognition result for the speech command,
if said second confidence value is smaller than said first threshold value (Y), a comparison stage is performed, wherein feature vectors obtained from the first and second recognition stages are compared to each other to determine a probability that they are substantially the same, wherein if the probability exceeds a predetermined value, the command word selected at the second stage is selected as the recognition result for the speech command.
Dependent claims: 2, 3, 4, 5, 6
-
-
7. A speech recognition device, in which a vocabulary of selectable command words is defined, the device comprising means (5) for measuring the time used for recognition and comparing it with a predetermined time window, and means (3, 4, 5) for selecting a first recognition result, characterized in that the speech recognition device further comprises:
-
means (3, 5) for calculating a first confidence value for said first recognition result,
means (5) for comparing said first confidence value with a predetermined first threshold value (Y), wherein the first recognition result is arranged to be selected as a final recognition result, if said first confidence value is greater than or equal to said first threshold value (Y), and
means (5) for performing the recognition stage of a second speech command, if said first confidence value is smaller than said first threshold value (Y), the means for performing the recognition stage of the second speech command comprising;
means (5) for extending said time window,
means (3, 4, 5) for selecting a second recognition result of the recognition stage of the second speech command,
means (5) for calculating a second confidence value for said second recognition result,
means (5) for comparing said second confidence value with the predetermined first threshold value (Y), wherein the second recognition result is arranged to be selected as the final recognition result, if said second confidence value is greater than or equal to said first threshold value (Y), and
means (3, 4, 5) for performing a comparison stage, the comparison stage being arranged to be performed if said second confidence value is smaller than said first threshold value (Y), wherein the means for performing a comparison stage includes a device for comparing feature vectors obtained at the first and second recognition stages to each other to determine a probability that they are substantially the same, wherein if the probability exceeds a predetermined value, the second recognition result is selected as the final recognition result.
Dependent claims: 8
-
-
9. A wireless communication device comprising means for recognizing speech commands, in which a vocabulary of selectable command words is defined, the means for recognizing speech commands comprising means (5) for measuring the time used for recognition and comparing it with a predetermined time window, and means (3, 4, 5) for selecting a first recognition result, characterized in that the means for recognizing speech commands further comprises:
-
means (3, 5) for calculating a first confidence value for said first recognition result,
means (5) for comparing said first confidence value with a predetermined first threshold value (Y), wherein the first recognition result is arranged to be selected as a final recognition result, if said first confidence value is greater than or equal to said first threshold value (Y), and
means (5) for performing the recognition stage of a second speech command, if said first confidence value is smaller than said first threshold value (Y), the means for performing the recognition stage of the second speech command comprising;
means (5) for extending said time window,
means (3, 4, 5) for selecting a second recognition result of the recognition stage of the second speech command,
means (5) for calculating a second confidence value for said second recognition result,
means (5) for comparing said second confidence value with the predetermined first threshold value (Y), wherein the second recognition result is arranged to be selected as the final recognition result, if said second confidence value is greater than or equal to said first threshold value (Y), and
means (3, 4, 5) for performing a comparison stage, the comparison stage being arranged to be performed if said second confidence value is smaller than said first threshold value (Y), wherein the means for performing a comparison stage includes a device for comparing feature vectors obtained from the first and second recognition stages to each other to determine a probability that they are substantially the same, wherein if the probability exceeds a predetermined value, the second recognition is selected as the final recognition result.
-
-
10. A method for recognizing speech commands comprising:
-
calculating a first confidence value for a first utterance;
accepting the first utterance as a command word if the first confidence value equals or exceeds a threshold;
calculating a second confidence value for a second utterance if the first confidence value is less than the threshold;
accepting the second utterance as a command word if the second confidence value equals or exceeds the threshold; and
if the second confidence value is less than the threshold, accepting the second utterance as a command word if the first and second utterances are compared to each other and determined to be substantially the same.
Dependent claims: 11, 12
calculating a probability for each of a plurality of command words that the individual command word is the utterance; and
comparing the probability of the individual command word having the greatest probability with a probability produced by a background noise model.
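The two steps above (a probability per command word, then a comparison of the best one against a background noise model) can be illustrated as follows. This is a sketch under stated assumptions: the claim text does not say how the comparison is expressed, so the log-probability ratio used here is an assumed choice, and `confidence_value`, `word_probs`, and `noise_prob` are hypothetical names.

```python
import math
from typing import Mapping, Tuple


def confidence_value(word_probs: Mapping[str, float],
                     noise_prob: float) -> Tuple[str, float]:
    """Return the most probable command word and a confidence value.

    word_probs maps each command word to the probability that it is the
    utterance; noise_prob is the probability from the background noise
    model. All probabilities must be strictly positive.
    """
    # Pick the command word with the greatest probability.
    best_word = max(word_probs, key=lambda w: word_probs[w])
    # Assumed comparison: log ratio of best word vs. background noise.
    confidence = math.log(word_probs[best_word]) - math.log(noise_prob)
    return best_word, confidence
```

With this choice, a positive confidence means the best command word explains the utterance better than background noise does; the result could then be compared with a threshold as in claim 10.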
-
-
12. The method of claim 10, wherein determining if the first and second utterances are substantially the same comprises comparing feature vectors of the first and second utterances utilizing time warping.
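A comparison of feature vectors with time warping could look like the sketch below. It uses the textbook dynamic time warping recurrence with Euclidean frame distances; the claim does not specify the distance measure, the normalization, or the decision rule, so `dtw_distance`, `substantially_same`, and the `max_avg_distance` parameter are all hypothetical, illustrative choices.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Accumulated cost of the best time-warped alignment of a and b."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = math.inf
    # prev[j] holds the best cost of aligning the frames of a seen so far
    # with the first j frames of b; only the empty alignment costs 0.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = math.dist(x, y)  # Euclidean distance between frames
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def substantially_same(a: Sequence[Vector], b: Sequence[Vector],
                       max_avg_distance: float) -> bool:
    """Assumed decision rule: length-normalized DTW distance under a limit."""
    return dtw_distance(a, b) / (len(a) + len(b)) <= max_avg_distance
```

Time warping lets the same word spoken at different speeds align at low cost, which is why a repeated frame in one utterance does not by itself make two utterances count as different.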
-
13. A method for recognizing one or more utterances, in which a group of command words selectable by utterances are defined, a time window is defined, within which the recognition of an utterance is performed, and a first recognition stage is performed, the method further comprising:
-
determining a first confidence value for a first utterance;
determining a first threshold value;
comparing the first confidence value to the first threshold value;
selecting the first utterance as one of the group of command words if the first confidence value is greater than or equal to the first threshold value;
performing a second recognition stage if the first confidence value is smaller than the first threshold value, the second recognition stage comprising;
extending the time window;
determining a second confidence value for a second utterance;
comparing the second confidence value to the first threshold value;
accepting the second utterance as one of the group of command words if the second confidence value is greater than or equal to the first threshold value;
performing a comparison stage if the second confidence value is smaller than the first threshold value, the comparison stage comprising;
comparing the first and second utterances to each other to determine a probability that they are substantially the same, and selecting the second utterance as one of the group of command words if the probability exceeds a predetermined value.
-
-
14. A speech recognition device, in which a vocabulary of selectable command words is defined, the device comprising means for measuring the time used for recognition and comparing it with a predetermined time window, the speech recognition device further comprising:
-
means for calculating a first confidence value for a first utterance;
means for comparing the first confidence value with a predetermined first threshold value and selecting the first utterance as a command word if the first confidence value is greater than or equal to the first threshold value; and
means for recognizing a second utterance if the first confidence value is smaller than the first threshold value, the means for recognizing the second utterance comprising;
means for extending the time window for recognizing the second utterance;
means for calculating a second confidence value for the second utterance;
means for comparing the second confidence value with the predetermined first threshold value and selecting the second utterance as a command word if the second confidence value is greater than or equal to the first threshold value, and
means for performing a comparison stage if the second confidence value is smaller than the first threshold value, wherein the means for performing a comparison stage includes a device for comparing the first and second utterances to each other to determine a probability that they are substantially the same, wherein if the probability exceeds a predetermined value, the second utterance is selected as a command word.
-
Specification