Predictive determination
Abstract
Systems, methods and computer readable media are disclosed for a gesture recognizer system architecture. A recognizer engine is provided, which receives user motion data and provides that data to a plurality of filters. A filter corresponds to a gesture and may then be tuned by an application receiving information from the gesture recognizer, so that the specific parameters of the gesture, such as an arm acceleration for a throwing gesture, may be set on a per-application level, or multiple times within a single application. Each filter may output, to an application using it, a confidence level that the corresponding gesture occurred, as well as further details about the user motion data.
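The architecture in the abstract can be sketched roughly as follows. All class names, parameter names, and the confidence rule are illustrative assumptions, not details from the patent itself:

```python
# Sketch of the disclosed architecture: an engine feeds motion data to
# per-gesture filters, an application tunes a filter's parameters (e.g.
# arm acceleration for a throwing gesture), and each filter reports a
# confidence level back. All names and values here are hypothetical.

class GestureFilter:
    def __init__(self, name, params):
        self.name = name
        self.params = dict(params)   # tunable on a per-application basis

    def set_param(self, key, value):
        self.params[key] = value

    def evaluate(self, motion_data):
        """Return a confidence in [0, 1] that this gesture occurred."""
        # Toy rule: confidence scales with how far the observed arm
        # acceleration exceeds the tuned threshold.
        accel = motion_data.get("arm_acceleration", 0.0)
        threshold = self.params.get("min_arm_acceleration", 1.0)
        return max(0.0, min(1.0, accel / (2 * threshold)))

class RecognizerEngine:
    def __init__(self):
        self.filters = []

    def register(self, gesture_filter):
        self.filters.append(gesture_filter)

    def process(self, motion_data):
        # Provide the motion data to every filter and report each
        # gesture's confidence level to the application.
        return {f.name: f.evaluate(motion_data) for f in self.filters}

engine = RecognizerEngine()
throw = GestureFilter("throw", {"min_arm_acceleration": 2.0})
engine.register(throw)

# The application tunes the throwing gesture for its own needs.
throw.set_param("min_arm_acceleration", 1.5)

print(engine.process({"arm_acceleration": 3.0}))
```

The per-application tuning happens through `set_param`, matching the abstract's point that the same gesture may be parameterized differently by each application, or multiple times within one application.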
Claims (20)
1. A method for predicting a gesture made by a user to a first application, comprising:

receiving image data captured by a camera and sound data captured by a microphone, wherein the image data is representative of a gesture performed by the user and the sound data is representative of a sound made by the user;

applying a filter to the image data to interpret the gesture, wherein the sound data at least one of augments, distinguishes, or clarifies the gesture, and wherein the filter comprises a first parameter about the gesture and a second parameter about the gesture, the first parameter corresponding to an earlier part of the gesture than the second parameter;

determining, from the applied filter, an output corresponding to the gesture being performed, wherein determining the output includes determining that the output corresponds to a high confidence level when the first parameter corresponds to a high confidence level and the second parameter does not correspond to a high confidence level; and

sending the first application the output.

Dependent claims: 2–12.
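The determining step of claim 1 is what makes the method predictive: as soon as the earlier parameter matches with high confidence, the output is treated as high-confidence without waiting on the later parameter. A minimal sketch, assuming a hypothetical cutoff value and fallback rule:

```python
# Sketch of claim 1's determining step. The filter holds two parameters,
# the first covering an earlier part of the gesture than the second.
# If the early parameter already matches with high confidence, the
# output is high-confidence even though the later part of the gesture
# has not yet been confirmed. Cutoff and fallback are assumptions.

HIGH = 0.8  # hypothetical cutoff for a "high confidence level"

def determine_output(first_param_conf, second_param_conf):
    """Return the confidence level reported to the application."""
    if first_param_conf >= HIGH and second_param_conf < HIGH:
        # The early part of the gesture looks right: predict the
        # gesture now rather than waiting for it to finish.
        return first_param_conf
    # Otherwise fall back to the weaker of the two observations.
    return min(first_param_conf, second_param_conf)

# Early windup matched strongly; the later release has barely started.
print(determine_output(0.9, 0.2))
```

For a throwing gesture, this corresponds to reporting the throw as soon as the windup is confidently matched, before the release part of the motion is observed.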
13. A system for predicting a gesture made by a user to a first application, comprising:

a processor;

a gesture library comprising at least one filter to interpret the gesture; and

a gesture recognizer engine that:

receives image data captured by a camera and sound data captured by a microphone, wherein the image data is representative of a gesture performed by the user and the sound data is representative of a sound made by the user;

determines an output from the filter based on the image data, wherein the sound data at least one of augments, distinguishes, or clarifies the gesture;

sends the application the output before receiving data corresponding to the gesture being completed;

applies a second filter to the image data to interpret the gesture, the second filter representing a second gesture and comprising base information about the second gesture;

determines, from the applied second filter, a second output corresponding to the second gesture being performed and a context, the second output being indicative of a greater confidence level than the output; and

sends the application the second output.

Dependent claims: 14–15.
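The engine behavior recited in claim 13 — send a provisional output before the gesture completes, then apply a second filter and send its higher-confidence output — might look like the following sketch. Filter internals and the confidence numbers are assumptions for illustration:

```python
# Sketch of the claim 13 flow: the engine sends an output to the
# application before the gesture completes, then applies a second
# filter representing a different gesture and sends its output when
# that filter is more confident. All details here are hypothetical.

class Filter:
    def __init__(self, gesture, confidence):
        self.gesture = gesture
        self._confidence = confidence

    def evaluate(self, image_data):
        # A real filter would derive confidence from the image data;
        # this stub just returns a fixed value for illustration.
        return self._confidence

def run_engine(image_data, first_filter, second_filter, send):
    # Determine an output from the first filter and send it to the
    # application before the gesture has finished.
    first = first_filter.evaluate(image_data)
    send(first_filter.gesture, first)

    # Apply a second filter representing a second gesture; if it is
    # indicative of a greater confidence level, send that output too.
    second = second_filter.evaluate(image_data)
    if second > first:
        send(second_filter.gesture, second)

sent = []
run_engine({"frames": "partial"},
           Filter("wave", 0.4),
           Filter("throw", 0.7),
           lambda gesture, conf: sent.append((gesture, conf)))
print(sent)
```

The application receives both outputs in order, with the second carrying the greater confidence level, matching the claim's final two elements.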
16. A computer readable storage medium comprising computer readable instructions that, when executed on a processor, cause the processor to perform the operations of:

receiving, from a first application of a plurality of applications, a value for at least one parameter;

receiving image data captured by a depth camera and sound data captured by a microphone, wherein the image data is representative of a gesture performed by the user and the sound data is representative of a sound made by the user;

applying the filter to the image data to interpret the gesture, wherein the sound data at least one of augments, distinguishes, or clarifies the gesture, and wherein the filter comprises a first parameter and a second parameter about the gesture, the first parameter corresponding to an earlier part of the gesture than the second parameter;

determining a confidence level that the image and sound data is indicative of the at least one gesture, wherein determining the confidence level includes determining that the confidence level corresponds to a high confidence level when the first parameter corresponds to a high confidence level and the second parameter does not correspond to a high confidence level; and

sending the first application an indication of at least one gesture with its associated confidence level.

Dependent claims: 17–20.
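The first operation of claim 16 — each of a plurality of applications supplying its own value for a filter parameter — could be sketched as a simple per-application parameter store. The application names and values below are hypothetical:

```python
# Sketch of claim 16's first step: each application supplies its own
# value for a filter parameter, so a single gesture filter can behave
# differently per application. Names and values are hypothetical.

filter_params = {}  # (application, parameter) -> value

def receive_parameter_value(application, parameter, value):
    """Record a parameter value received from one application."""
    filter_params[(application, parameter)] = value

def param_for(application, parameter, default):
    """Look up the value this application set, or a default."""
    return filter_params.get((application, parameter), default)

# A bowling game wants a gentler throw threshold than a baseball game.
receive_parameter_value("bowling", "min_arm_acceleration", 1.0)
receive_parameter_value("baseball", "min_arm_acceleration", 3.0)

print(param_for("bowling", "min_arm_acceleration", 2.0))
print(param_for("tennis", "min_arm_acceleration", 2.0))
```

An application that never supplies a value simply gets the filter's default, which is consistent with the abstract's point that tuning is optional and per-application.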
Specification