Predictive determination
Abstract
Systems, methods and computer readable media are disclosed for a gesture recognizer system architecture. A recognizer engine is provided, which receives user motion data and provides that data to a plurality of filters. A filter corresponds to a gesture and may then be tuned by an application receiving information from the gesture recognizer, so that the specific parameters of the gesture, such as an arm acceleration for a throwing gesture, may be set on a per-application level, or multiple times within a single application. Each filter may output, to an application using it, a confidence level that the corresponding gesture occurred, as well as further details about the user motion data.
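The architecture in the abstract can be sketched roughly as follows. All class names, parameter names, and the confidence rule are illustrative assumptions, not details from the patent itself:

```python
# Sketch of the disclosed architecture: an engine feeds motion data to
# per-gesture filters, an application tunes a filter's parameters (e.g.
# arm acceleration for a throwing gesture), and each filter reports a
# confidence level back. All names and values here are hypothetical.

class GestureFilter:
    def __init__(self, name, params):
        self.name = name
        self.params = dict(params)   # tunable on a per-application basis

    def set_param(self, key, value):
        self.params[key] = value

    def evaluate(self, motion_data):
        """Return a confidence in [0, 1] that this gesture occurred."""
        # Toy rule: confidence scales with how far the observed arm
        # acceleration exceeds the tuned threshold.
        accel = motion_data.get("arm_acceleration", 0.0)
        threshold = self.params.get("min_arm_acceleration", 1.0)
        return max(0.0, min(1.0, accel / (2 * threshold)))

class RecognizerEngine:
    def __init__(self):
        self.filters = []

    def register(self, gesture_filter):
        self.filters.append(gesture_filter)

    def process(self, motion_data):
        # Provide the motion data to every filter and report each
        # gesture's confidence level to the application.
        return {f.name: f.evaluate(motion_data) for f in self.filters}

engine = RecognizerEngine()
throw = GestureFilter("throw", {"min_arm_acceleration": 2.0})
engine.register(throw)

# The application tunes the throwing gesture for its own needs.
throw.set_param("min_arm_acceleration", 1.5)

print(engine.process({"arm_acceleration": 3.0}))
```

The per-application tuning happens through `set_param`, matching the abstract's point that the same gesture may be parameterized differently by each application, or multiple times within one application.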
Claims (20)
1. A method for predicting a gesture made by a user to a first application, comprising:

receiving image data captured by a camera and sound data captured by a microphone, wherein the image data is representative of a gesture performed by the user and the sound data is representative of a sound made by the user;

applying a filter to the image data to interpret the gesture, wherein the sound data at least one of augments, distinguishes, or clarifies the gesture, and wherein the filter comprises a first parameter about the gesture and a second parameter about the gesture, the first parameter corresponding to an earlier part of the gesture than the second parameter;

determining, from the applied filter, an output corresponding to the gesture being performed, wherein determining the output includes determining that the output corresponds to a high confidence level when the first parameter corresponds to a high confidence level and the second parameter does not correspond to a high confidence level; and

sending the first application the output.

Dependent claims: 2–12.
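The determining step of claim 1 is what makes the method predictive: as soon as the earlier parameter matches with high confidence, the output is treated as high-confidence without waiting on the later parameter. A minimal sketch, assuming a hypothetical cutoff value and fallback rule:

```python
# Sketch of claim 1's determining step. The filter holds two parameters,
# the first covering an earlier part of the gesture than the second.
# If the early parameter already matches with high confidence, the
# output is high-confidence even though the later part of the gesture
# has not yet been confirmed. Cutoff and fallback are assumptions.

HIGH = 0.8  # hypothetical cutoff for a "high confidence level"

def determine_output(first_param_conf, second_param_conf):
    """Return the confidence level reported to the application."""
    if first_param_conf >= HIGH and second_param_conf < HIGH:
        # The early part of the gesture looks right: predict the
        # gesture now rather than waiting for it to finish.
        return first_param_conf
    # Otherwise fall back to the weaker of the two observations.
    return min(first_param_conf, second_param_conf)

# Early windup matched strongly; the later release has barely started.
print(determine_output(0.9, 0.2))
```

For a throwing gesture, this corresponds to reporting the throw as soon as the windup is confidently matched, before the release part of the motion is observed.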
13. A system for predicting a gesture made by a user to a first application, comprising:

a processor;

a gesture library comprising at least one filter to interpret the gesture; and

a gesture recognizer engine that:

receives image data captured by a camera and sound data captured by a microphone, wherein the image data is representative of a gesture performed by the user and the sound data is representative of a sound made by the user;

determines an output from the filter based on the image data, wherein the sound data at least one of augments, distinguishes, or clarifies the gesture;

sends the application the output before receiving data corresponding to the gesture being completed;

applies a second filter to the image data to interpret the gesture, the second filter representing a second gesture and comprising base information about the second gesture;

determines, from the applied second filter, a second output corresponding to the second gesture being performed and a context, the second output being indicative of a greater confidence level than the output; and

sends the application the second output.

Dependent claims: 14–15.
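The engine behavior recited in claim 13 — send a provisional output before the gesture completes, then apply a second filter and send its higher-confidence output — might look like the following sketch. Filter internals and the confidence numbers are assumptions for illustration:

```python
# Sketch of the claim 13 flow: the engine sends an output to the
# application before the gesture completes, then applies a second
# filter representing a different gesture and sends its output when
# that filter is more confident. All details here are hypothetical.

class Filter:
    def __init__(self, gesture, confidence):
        self.gesture = gesture
        self._confidence = confidence

    def evaluate(self, image_data):
        # A real filter would derive confidence from the image data;
        # this stub just returns a fixed value for illustration.
        return self._confidence

def run_engine(image_data, first_filter, second_filter, send):
    # Determine an output from the first filter and send it to the
    # application before the gesture has finished.
    first = first_filter.evaluate(image_data)
    send(first_filter.gesture, first)

    # Apply a second filter representing a second gesture; if it is
    # indicative of a greater confidence level, send that output too.
    second = second_filter.evaluate(image_data)
    if second > first:
        send(second_filter.gesture, second)

sent = []
run_engine({"frames": "partial"},
           Filter("wave", 0.4),
           Filter("throw", 0.7),
           lambda gesture, conf: sent.append((gesture, conf)))
print(sent)
```

The application receives both outputs in order, with the second carrying the greater confidence level, matching the claim's final two elements.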
16. A computer readable storage medium comprising computer readable instructions that, when executed on a processor, cause the processor to perform the operations of:

receiving, from a first application of a plurality of applications, a value for at least one parameter;

receiving image data captured by a depth camera and sound data captured by a microphone, wherein the image data is representative of a gesture performed by the user and the sound data is representative of a sound made by the user;

applying the filter to the image data to interpret the gesture, wherein the sound data at least one of augments, distinguishes, or clarifies the gesture, and wherein the filter comprises a first parameter and a second parameter about the gesture, the first parameter corresponding to an earlier part of the gesture than the second parameter;

determining a confidence level that the image and sound data is indicative of the at least one gesture, wherein determining the confidence level includes determining that the confidence level corresponds to a high confidence level when the first parameter corresponds to a high confidence level and the second parameter does not correspond to a high confidence level; and

sending the first application an indication of at least one gesture with its associated confidence level.

Dependent claims: 17–20.
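The first operation of claim 16 — each of a plurality of applications supplying its own value for a filter parameter — could be sketched as a simple per-application parameter store. The application names and values below are hypothetical:

```python
# Sketch of claim 16's first step: each application supplies its own
# value for a filter parameter, so a single gesture filter can behave
# differently per application. Names and values are hypothetical.

filter_params = {}  # (application, parameter) -> value

def receive_parameter_value(application, parameter, value):
    """Record a parameter value received from one application."""
    filter_params[(application, parameter)] = value

def param_for(application, parameter, default):
    """Look up the value this application set, or a default."""
    return filter_params.get((application, parameter), default)

# A bowling game wants a gentler throw threshold than a baseball game.
receive_parameter_value("bowling", "min_arm_acceleration", 1.0)
receive_parameter_value("baseball", "min_arm_acceleration", 3.0)

print(param_for("bowling", "min_arm_acceleration", 2.0))
print(param_for("tennis", "min_arm_acceleration", 2.0))
```

An application that never supplies a value simply gets the filter's default, which is consistent with the abstract's point that tuning is optional and per-application.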
Specification