Compound gesture-speech commands
First Claim
1. A method for controlling a computing system, comprising:
displaying one or more objects on a display monitor;
receiving body position data from a sensor;
recognizing a gesture in relation to the one or more objects based on the received body position data;
choosing a subset of a set of sound commands based on the recognized gesture, the set of sound commands includes multiple subsets, each subset is associated with one or more gestures and sound command recognition data for the respective subset;
loading sound command recognition data for the chosen subset of sound commands;
receiving sound input from a microphone;
recognizing a sound command from the sound input;
correlating the recognized gesture with the recognized sound command based on a weighted confidence value associated with the recognized gesture and a weighted confidence value associated with the recognized sound command, said correlating the recognized gesture with the recognized sound command includes:
selecting a subset of the set of sound commands associated with the recognized gesture to verify the recognized sound command if the weighted confidence value associated with the recognized gesture is higher than the weighted confidence value associated with the recognized sound command;
selecting a subset of gestures associated with the recognized sound command to verify the recognized gesture if the weighted confidence value associated with the recognized sound command is higher than the weighted confidence value associated with the recognized gesture; and
performing an action in response to the recognized sound command.
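The correlating step of claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the patent's implementation: the `Recognition` type, the `COMMANDS_FOR_GESTURE` table, the gesture and command names, the weights, and the choice to treat a tie as favoring the gesture (the claim only covers the "higher than" cases). "Verify" is modeled as a simple membership check against the associated subset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    label: str         # name of the recognized gesture or sound command
    confidence: float  # recognizer confidence, assumed to lie in [0, 1]


# Hypothetical table: each gesture maps to its subset of sound commands.
COMMANDS_FOR_GESTURE = {
    "point_at_object": {"play", "delete", "info"},
    "swipe_left": {"next", "back"},
}


def gestures_for_command(command):
    """Return the subset of gestures associated with a sound command."""
    return {g for g, cmds in COMMANDS_FOR_GESTURE.items() if command in cmds}


def correlate(gesture, sound, gesture_weight=1.0, sound_weight=1.0):
    """Verify the lower-confidence input against the higher-confidence one.

    Returns (trusted, verified): which input was trusted, and whether the
    other input is consistent with the subset associated with it.
    """
    g_score = gesture_weight * gesture.confidence
    s_score = sound_weight * sound.confidence
    if g_score >= s_score:  # tie handling is an assumption of this sketch
        allowed = COMMANDS_FOR_GESTURE.get(gesture.label, set())
        return "gesture", sound.label in allowed
    return "sound", gesture.label in gestures_for_command(sound.label)
```

For example, a confident pointing gesture paired with a less confident "play" yields `("gesture", True)`, while the same gesture paired with a more confident "next" yields `("sound", False)`, because "next" is not associated with pointing in this toy table.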
Abstract
A multimedia entertainment system combines both gestures and voice commands to provide an enhanced control scheme. A user's body position or motion may be recognized as a gesture, and may be used to provide context to recognize user generated sounds, such as speech input. Likewise, speech input may be recognized as a voice command, and may be used to provide context to recognize a body position or motion as a gesture. Weights may be assigned to the inputs to facilitate processing. When a gesture is recognized, a limited set of voice commands associated with the recognized gesture are loaded for use. Further, additional sets of voice commands may be structured in a hierarchical manner such that speaking a voice command from one set of voice commands leads to the system loading a next set of voice commands.
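The hierarchical arrangement of voice command sets described in the abstract could be sketched as follows. The set names, the commands, and the `VoiceCommandContext` class are hypothetical and chosen only for illustration; the sketch assumes that each command either leads to a next set or is a leaf.

```python
# Hypothetical hierarchy: each command set maps a spoken command to the
# name of the next set to load, or to None when no further set follows.
COMMAND_SETS = {
    "root": {"movies": "movies_menu", "music": "music_menu"},
    "movies_menu": {"play": None, "info": None, "back": "root"},
    "music_menu": {"shuffle": None, "back": "root"},
}


class VoiceCommandContext:
    """Tracks which limited set of voice commands is currently loaded."""

    def __init__(self, start="root"):
        self.active = start

    def available(self):
        """Return the commands in the currently loaded set, sorted."""
        return sorted(COMMAND_SETS[self.active])

    def speak(self, command):
        """Accept a command only if it is in the loaded set.

        When the command leads to another set, that set becomes the
        loaded one. Returns True if the command was accepted.
        """
        options = COMMAND_SETS[self.active]
        if command not in options:
            return False
        next_set = options[command]
        if next_set is not None:
            self.active = next_set
        return True
```

Starting from the root set, "play" is rejected because it is not yet loaded; after "movies" is spoken, the loaded set becomes the movies menu and "play" is accepted.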
20 Claims
11. An interface system for controlling a multimedia system, comprising:
a monitor for displaying multimedia content;
a sensor for capturing user gestures;
a microphone for capturing user sounds; and
a computer connected to the sensor, the microphone and the monitor,
the computer driving the monitor to display a group of objects,
the computer receives image data representing a gesture from the sensor,
the computer recognizes the gesture as selecting a first object from the group of objects,
the computer updates the monitor to display a first contextual menu that shows a subset of sound commands that may be used with regard to the first object,
the computer receives sound data from the microphone,
the computer recognizes a sound command as being from the subset of sound commands based on the received sound data, the sound command indicates a desired action with regard to the first object,
the computer executes the desired action,
the computer correlates the recognized gesture with the recognized sound command based on a weighted confidence value associated with the recognized gesture and a weighted confidence value associated with the recognized sound command,
said computer correlating the recognized gesture with the recognized sound command includes selecting a subset of sound commands associated with the recognized gesture to facilitate the recognition of the sound command if the weighted confidence value associated with the recognized gesture is higher than the weighted confidence value associated with the recognized sound command,
said computer correlating the recognized gesture with the recognized sound command includes selecting a subset of gestures associated with the recognized sound command to facilitate the recognition of the gesture if the weighted confidence value associated with the recognized sound command is higher than the weighted confidence value associated with the recognized gesture.
17. A processor readable storage device having instructions encoded thereon, the instructions for programming one or more processors to perform a method for controlling a multimedia system, comprising:
displaying a group of one or more objects on a monitor;
receiving body position data from a sensor;
recognizing a gesture from the received body position data;
updating the monitor display to list a set of sound commands available in response to the recognized gesture;
receiving sound data from a microphone;
recognizing a sound command from the set of sound commands based on the received sound data;
selecting a set of sound commands associated with the recognized gesture to confirm that the sound command is properly recognized if a weighted confidence value associated with the recognized gesture is higher than a weighted confidence value associated with the recognized sound command; and
executing an action associated with the recognized sound command.
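One way to picture "recognizing a sound command from the set of sound commands" in claim 17 is to restrict a speech recognizer's candidate list to the set listed for the recognized gesture. The sketch below assumes a recognizer that returns scored text hypotheses; that interface, the function name `recognize_from_set`, and the example words and scores are all hypothetical.

```python
def recognize_from_set(hypotheses, allowed):
    """Pick the best-scoring hypothesis that belongs to the allowed set.

    hypotheses: iterable of (text, score) pairs from a speech recognizer
                (a hypothetical interface assumed for this sketch).
    allowed:    the set of sound commands listed for the recognized gesture.
    Returns the chosen command text, or None if no hypothesis is allowed.
    """
    candidates = [(score, text) for text, score in hypotheses if text in allowed]
    if not candidates:
        return None
    # max() compares scores first; equal scores fall back to text order.
    return max(candidates)[1]
```

With hypotheses "pay" (0.55), "play" (0.50) and "delay" (0.20) and an allowed set of "play", "delete" and "info", the function returns "play" even though "pay" scored higher, since "pay" is not among the listed commands.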
Specification