Gesture recognition using depth images
Abstract
Methods, apparatuses, and articles associated with gesture recognition using depth images are disclosed herein. In various embodiments, an apparatus may include a face detection engine configured to determine whether a face is present in one or more gray images of respective image frames generated by a depth camera, and a hand tracking engine configured to track a hand in one or more depth images generated by the depth camera. The apparatus may further include a feature extraction and gesture inference engine configured to extract features based on results of the tracking by the hand tracking engine, and infer a hand gesture based at least in part on the extracted features. Other embodiments may also be disclosed and claimed.
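The staged architecture the abstract describes (face detection gating hand tracking, followed by feature extraction and gesture inference) can be sketched as below. The class and callable names are illustrative assumptions, not the patent's implementation.

```python
class GesturePipeline:
    """Illustrative sketch of the described engines, not the patented design.

    detect_face:   face detection engine, operates on a gray image
    track_hand:    hand tracking engine, operates on a depth image
    infer_gesture: feature extraction and gesture inference engine
    """

    def __init__(self, detect_face, track_hand, infer_gesture):
        self.detect_face = detect_face
        self.track_hand = track_hand
        self.infer_gesture = infer_gesture

    def process_frame(self, gray_image, depth_image):
        # Hand tracking runs only on determination that a face is present
        # in the gray image of the frame.
        if not self.detect_face(gray_image):
            return None
        track = self.track_hand(depth_image)
        if track is None:
            return None
        # Features are extracted from the tracking result and a gesture
        # is inferred from them.
        return self.infer_gesture(track)
```

The gating step mirrors the claim language: tracking and inference are skipped entirely for frames in which no face is detected.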
26 Claims
1. An apparatus, comprising:
a face detection engine configured to determine whether a face is present in one or more gray images of respective image frames generated by a depth camera;
a hand tracking engine coupled to the face detection engine, and configured to track a hand in one or more depth images generated by the depth camera, on determination by the face detection engine that a face is present in the one or more gray images; and
a feature extraction and gesture inference engine coupled to the hand tracking engine, and configured to extract features based on results of the tracking by the hand tracking engine, and infer a hand gesture based at least in part on the extracted features;
wherein either the face detection engine or the hand tracking engine is further configured to determine a measure of a distance between the face and the camera, using the one or more depth images.
Dependent claims: 2, 4, 5, 6, 7, 8, 9, 10, 11
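Claim 1's face-to-camera distance measure could plausibly be realized by taking a robust statistic of the depth values inside the detected face region. This sketch, including the `face_distance` name and the millimeter / zero-means-invalid depth convention, is an assumption rather than the patent's disclosed method.

```python
import numpy as np

def face_distance(depth_image, face_box):
    """Estimate face-to-camera distance from a depth image.

    face_box is (x, y, w, h) in pixel coordinates; depth values are
    assumed to be in millimeters, with 0 marking invalid pixels.
    """
    x, y, w, h = face_box
    region = depth_image[y:y + h, x:x + w]
    valid = region[region > 0]        # drop invalid (zero) readings
    if valid.size == 0:
        return None
    return float(np.median(valid))    # median is robust to depth noise
```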
3. A method comprising:
determining, by a computing apparatus, whether a face is present in one or more gray images of respective image frames generated by a depth camera;
tracking, by the computing apparatus, a hand in selected respective regions of one or more depth images generated by the depth camera, on determination that a face is present in the one or more gray images, wherein the selected respective regions are size-wise smaller than the respective one or more depth images; and
inferring a hand gesture, by the computing apparatus, based at least in part on a result of the tracking;
wherein tracking comprises determining location measures of the hand for the depth images.
Dependent claims: 12, 13, 14, 15
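The region-restricted tracking of claim 3 (searching a window size-wise smaller than the full depth image, and reporting location measures of the hand) might be sketched as follows; the windowing scheme and the centroid-based location measure are illustrative assumptions.

```python
import numpy as np

def search_window(depth_image, prev_center, half_size):
    """Crop a search region around the previous hand location.

    The returned window is smaller than the full depth image, as the
    claim describes; returns the window and its top-left origin.
    """
    h, w = depth_image.shape
    cx, cy = prev_center
    x0, x1 = max(cx - half_size, 0), min(cx + half_size, w)
    y0, y1 = max(cy - half_size, 0), min(cy + half_size, h)
    return depth_image[y0:y1, x0:x1], (x0, y0)

def hand_location(window, origin, max_depth):
    """Location measure: centroid of valid pixels closer than max_depth."""
    ys, xs = np.nonzero((window > 0) & (window < max_depth))
    if xs.size == 0:
        return None
    x0, y0 = origin
    return (x0 + float(xs.mean()), y0 + float(ys.mean()))
```

Each frame's location measure then seeds the search window for the next frame, keeping the tracked region small.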
16. A method comprising:
determining, by a computing apparatus, whether a face is present in one or more gray images of respective image frames generated by a depth camera;
tracking, by the computing apparatus, a hand in selected respective regions of one or more depth images generated by the depth camera, on determination that a face is present in the one or more gray images, wherein the selected respective regions are size-wise smaller than the respective one or more depth images;
extracting, by the computing apparatus, one or more features from respective regions of the depth images; and
inferring a hand gesture, by the computing apparatus, based at least in part on the one or more features extracted from the depth images;
wherein extracting one or more features comprises extracting one or more of an eccentricity measure, a compactness measure, an orientation measure, a rectangularity measure, a horizontal center measure, a vertical center measure, a minimum bounding box angle measure, a minimum bounding box width-to-height ratio measure, a difference between left-and-right measure, or a difference between up-and-down measure.
Dependent claims: 17, 18
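Several of the shape descriptors enumerated in claim 16 (eccentricity, orientation, rectangularity, and horizontal/vertical center) can be computed from a binary hand mask with standard image-moment formulas. The patent's exact definitions may differ, so this is a sketch of the standard descriptors, not the claimed method.

```python
import numpy as np

def shape_features(mask):
    """Standard moment-based shape descriptors of a binary hand mask."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    cx, cy = xs.mean(), ys.mean()            # horizontal / vertical center
    # Second central moments (covariance of the pixel coordinates).
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    # Eigenvalues of the covariance matrix give the principal axis lengths.
    common = np.sqrt(((mu20 - mu02) / 2) ** 2 + mu11 ** 2)
    lam1 = (mu20 + mu02) / 2 + common
    lam2 = (mu20 + mu02) / 2 - common
    eccentricity = np.sqrt(1 - lam2 / lam1) if lam1 > 0 else 0.0
    orientation = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    # Rectangularity: mask area relative to its axis-aligned bounding box.
    bb_area = (xs.max() - xs.min() + 1) * (ys.max() - ys.min() + 1)
    rectangularity = area / bb_area
    return {
        "horizontal_center": float(cx),
        "vertical_center": float(cy),
        "eccentricity": float(eccentricity),
        "orientation": float(orientation),
        "rectangularity": float(rectangularity),
    }
```

A perfectly rectangular mask gives rectangularity 1.0, and an elongated mask gives eccentricity approaching 1.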
19. A computer-readable non-transitory storage medium, comprising:
a plurality of programming instructions stored in the storage medium, and configured to cause an apparatus, in response to execution of the programming instructions by the apparatus, to perform operations including:
determining whether a face is present in one or more gray images of respective image frames generated by a depth camera;
tracking a hand in selected respective regions of one or more depth images generated by the depth camera, on determination that a face is present in the one or more gray images, wherein the selected respective regions are size-wise smaller than the respective one or more depth images; and
inferring a hand gesture, based at least in part on a result of the tracking;
wherein tracking comprises determining location measures of the hand for the depth images.
Dependent claims: 20, 21, 22, 23, 24, 25, 26
Specification