COMPOUND GESTURE-SPEECH COMMANDS
Abstract
A multimedia entertainment system combines both gestures and voice commands to provide an enhanced control scheme. A user's body position or motion may be recognized as a gesture, and may be used to provide context to recognize user generated sounds, such as speech input. Likewise, speech input may be recognized as a voice command, and may be used to provide context to recognize a body position or motion as a gesture. Weights may be assigned to the inputs to facilitate processing. When a gesture is recognized, a limited set of voice commands associated with the recognized gesture are loaded for use. Further, additional sets of voice commands may be structured in a hierarchical manner such that speaking a voice command from one set of voice commands leads to the system loading a next set of voice commands.
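The hierarchical arrangement of voice command sets described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the class names (`CommandSet`, `CommandNavigator`), the phrases, and the rule that a leaf command returns to the root set are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CommandSet:
    """A named set of voice commands (hypothetical structure)."""
    name: str
    # Maps a spoken phrase to the next CommandSet, or None if the phrase is a leaf.
    commands: dict = field(default_factory=dict)


class CommandNavigator:
    """Tracks which command set is currently loaded."""

    def __init__(self, root: CommandSet):
        self.root = root
        self.active = root

    def speak(self, phrase: str) -> bool:
        """Accept a phrase only if it belongs to the currently loaded set."""
        if phrase not in self.active.commands:
            return False  # outside the loaded set, so it is ignored
        next_set = self.active.commands[phrase]
        # Assumption: a leaf command ends the sequence and reloads the root set.
        self.active = next_set if next_set is not None else self.root
        return True


# Hypothetical command tree: saying "play" loads the playback set.
playback = CommandSet("playback", {"pause": None, "stop": None})
media = CommandSet("media", {"play": playback, "delete": None})
```

With `nav = CommandNavigator(media)`, the phrase "pause" is rejected until "play" has been spoken, because only the active set is consulted at each step.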
20 Claims
1. A method for controlling a computing system using a set of voice commands, comprising:
displaying one or more objects on a display monitor;
receiving body position data from a sensor;
recognizing a gesture in relation to the one or more objects based on the received body position data;
choosing a subset of the set of sound commands based on the recognized gesture, the set of sound commands includes multiple subsets, each subset is associated with one or more gestures and sound command recognition data for the respective subset;
loading sound command recognition data for the chosen subset of sound commands;
receiving sound input from a microphone;
recognizing a sound command from the sound input using the loaded sound command recognition data; and
performing an action in response to the recognized sound command.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
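The sequence of steps in claim 1 can be sketched as a short pipeline. This is a toy illustration under stated assumptions, not the claimed method itself: the gesture names, the `GESTURE_TO_SUBSET` and `RECOGNITION_DATA` tables, and the use of plain text in place of audio are all hypothetical, and real recognition data would be acoustic or grammar models rather than a phrase lookup.

```python
from typing import Optional

# Hypothetical mapping: recognized gesture -> subset of sound commands.
GESTURE_TO_SUBSET = {
    "point_at_object": "object_commands",
    "swipe": "navigation_commands",
}

# Hypothetical stand-in for per-subset recognition data (phrase -> action name).
RECOGNITION_DATA = {
    "object_commands": {"play": "start_playback", "delete": "remove_object"},
    "navigation_commands": {"next": "next_page", "back": "previous_page"},
}


def recognize_gesture(body_position: dict) -> Optional[str]:
    """Toy gesture recognizer working on a dict of body position features."""
    if body_position.get("arm_extended"):
        return "point_at_object"
    if abs(body_position.get("hand_dx", 0.0)) > 0.5:
        return "swipe"
    return None


def load_recognition_data(gesture: str) -> dict:
    """Choose the subset for the gesture and load only that subset's data."""
    return RECOGNITION_DATA[GESTURE_TO_SUBSET[gesture]]


def handle_input(body_position: dict, sound_input: str) -> Optional[str]:
    """Return the action for a sound command, or None if nothing is recognized."""
    gesture = recognize_gesture(body_position)
    if gesture is None:
        return None
    loaded = load_recognition_data(gesture)
    # Text stands in for audio here; only the loaded subset is searched.
    return loaded.get(sound_input.strip().lower())
```

Because only the subset tied to the gesture is loaded, a phrase such as "next" is not recognized while the user is pointing at an object, even though it is valid after a swipe.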
11. An interface system for controlling a multimedia system, comprising:
a monitor for displaying multimedia content;
a sensor for capturing user gestures;
a microphone for capturing user sounds; and
a computer connected to the sensor, the microphone and the monitor, the computer driving the monitor to display a group of objects, the computer receives image data representing a gesture from the sensor, the computer recognizes the gesture as selecting a first object from the group of objects, the computer updates the monitor to display a first contextual menu that shows a subset of sound commands that may be used with regard to the first object, the computer receives sound data representing a sound command from the microphone, the computer recognizes the sound command as being from the subset of sound commands, the sound command indicates a desired action with regard to the first object, the computer executes the desired action.
Dependent claims: 12, 13, 14, 15, 16
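The interaction in claim 11 (a gesture selects an object, a contextual menu lists the sound commands valid for it, and a spoken command from that menu is executed) might look roughly like the following. Everything here is an assumption for illustration: the `DisplayObject` type, the nearest-position selection rule, the `COMMANDS_BY_KIND` table, and the string returned by `execute` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DisplayObject:
    name: str
    kind: str
    x: float  # horizontal screen position, from 0.0 (left) to 1.0 (right)


# Hypothetical subsets of sound commands available per kind of object.
COMMANDS_BY_KIND = {
    "movie": ["play", "info"],
    "game": ["launch", "info"],
}


def select_object(objects, pointer_x):
    """Select the object nearest to where the pointing gesture lands."""
    return min(objects, key=lambda o: abs(o.x - pointer_x))


def contextual_menu(obj):
    """List the sound commands that may be used with the selected object."""
    return list(COMMANDS_BY_KIND.get(obj.kind, []))


def execute(obj, menu, spoken):
    """Run the command only if it is in the displayed menu; otherwise None."""
    if spoken not in menu:
        return None
    return f"{spoken}:{obj.name}"
```

Showing the menu and restricting recognition to it are the same list in this sketch, so the user only ever sees commands the system is prepared to recognize for the selected object.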
17. A processor readable storage device having instructions encoded thereon, the instructions for programming one or more processors to perform a method for controlling a multimedia system, comprising:
displaying a group of one or more objects on a monitor;
receiving body position data from a sensor;
recognizing a gesture from the received body position data;
updating the monitor display to list a set of sound commands available in response to the recognized gesture;
receiving sound data from a microphone;
recognizing a sound command from the set of sound commands based on the received sound data; and
executing an action associated with the recognized sound command.
Dependent claims: 18, 19, 20
Specification