Learning intended user actions
First Claim
1. A method, comprising:
receiving, by a microphone and camera, user utterances indicative of user commands and associated user gestures for the user utterances;
parsing, by a hardware-based recognizer, sample utterances and the user utterances into verb parts and noun parts;
recognizing, by a hardware-based recognizer, the user utterances and the associated user gestures based on the sample utterances and descriptions of associated supporting gestures for the sample utterances, said recognizing step comprising sequentially comparing each of the verb parts and each of the noun parts from the user utterances both individually and as pairs to the verb parts and the noun parts of the sample utterances;
tracking words and word pairs used in conjunction with one or more recognized gestures, and determining a frequency of accepted and rejected system actions based on the tracking; and
selectively performing a given one of the user commands responsive to a recognition result.
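The comparison step of the claim can be illustrated with a minimal, hypothetical sketch (not the patented implementation): each utterance is split into a verb part and a noun part, and the parts are compared to a set of sample utterances first as a pair and then individually. The `SAMPLE_UTTERANCES` table, the naive first-word parse, and the action names are all assumptions made for illustration.

```python
# Hypothetical sketch of the claimed comparison step: parse an utterance
# into a verb part and a noun part, then compare the parts to sample
# utterances both as a pair and individually.

SAMPLE_UTTERANCES = {
    ("open", "door"): "open_door",       # assumed sample utterances
    ("close", "window"): "close_window",
}

def parse(utterance):
    """Naive parse: treat the first word as the verb part, the rest as the noun part."""
    words = utterance.lower().split()
    return words[0], " ".join(words[1:])

def recognize(utterance):
    """Sequentially compare verb and noun parts, as a pair and then individually."""
    verb, noun = parse(utterance)
    # An exact verb+noun pair match is the strongest recognition result.
    if (verb, noun) in SAMPLE_UTTERANCES:
        return SAMPLE_UTTERANCES[(verb, noun)], "pair"
    # Otherwise fall back to matching the verb part or noun part alone.
    for (sv, sn), action in SAMPLE_UTTERANCES.items():
        if verb == sv or noun == sn:
            return action, "partial"
    return None, "none"

print(recognize("open door"))    # exact pair match
print(recognize("open window"))  # partial match on the verb part
```

In a real system the parse would come from a speech recognizer and part-of-speech tagger, and the match would also be conditioned on the associated gesture; the sketch shows only the sequential pair-then-individual comparison order recited in the claim.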
Abstract
A method and system are provided. The method includes receiving, by a microphone and camera, user utterances indicative of user commands and associated user gestures for the user utterances. The method further includes parsing, by a hardware-based recognizer, sample utterances and the user utterances into verb parts and noun parts. The method also includes recognizing, by a hardware-based recognizer, the user utterances and the associated user gestures based on the sample utterances and descriptions of associated supporting gestures for the sample utterances. The recognizing step includes comparing the verb parts and the noun parts from the user utterances individually and as pairs to the verb parts and the noun parts of the sample utterances. The method additionally includes selectively performing a given one of the user commands responsive to a recognition result.
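The tracking step described alongside the method (counting how often word pairs used with a recognized gesture lead to accepted or rejected system actions) can be sketched as follows. This is an illustrative assumption, not the patented implementation; the function names and the `Counter`-based tallies are invented for the example.

```python
# Hypothetical sketch: track word pairs used with a recognized gesture and
# the frequency with which the resulting system actions were accepted or
# rejected by the user.
from collections import Counter

accepted = Counter()
rejected = Counter()

def record(verb, noun, gesture, was_accepted):
    """Tally one observation of a (verb, noun, gesture) triple and its outcome."""
    key = (verb, noun, gesture)
    (accepted if was_accepted else rejected)[key] += 1

def acceptance_rate(verb, noun, gesture):
    """Fraction of system actions for this triple that the user accepted."""
    key = (verb, noun, gesture)
    total = accepted[key] + rejected[key]
    return accepted[key] / total if total else 0.0

record("open", "door", "point", True)
record("open", "door", "point", True)
record("open", "door", "point", False)
print(acceptance_rate("open", "door", "point"))
```

A frequency like this could feed the "selectively performing" step: actions with a low acceptance rate might require confirmation rather than immediate execution.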
17 Claims
Claim 1 (reproduced above) is independent; claims 2–17 are dependent claims.
Specification