System and method for gesture recognition in three dimensions using stereo imaging and color vision
Abstract
A system and method for recognizing gestures. The method comprises obtaining image data and determining a hand pose estimation. A frontal view of a hand is then produced. The hand is then isolated from the background. The resulting image is then classified as a type of gesture. In one embodiment, determining a hand pose estimation comprises performing background subtraction and computing a hand pose estimation based on an arm orientation determination. In another embodiment, a frontal view of a hand is produced by performing perspective unwarping and scaling. The system that implements the method may be a personal computer with a stereo camera coupled thereto.
29 Claims
1. A method for recognizing gestures comprising:
obtaining an image data;
determining a hand pose estimation based on computing a center of the hand, computing an orientation of the hand in relation to a camera reference frame, performing background subtraction, determining an arm orientation, and computing the hand pose estimation based on the arm orientation;
producing a frontal view of a hand;
isolating the hand from the background; and
classifying a gesture of the hand;
wherein computing a center of the hand includes defining a hand region as a cylinder centered along the 3D line with dimensions large enough to include a typical hand, selecting pixels from within the hand region as hand pixels, and averaging the location of all of the hand pixels.

2. The method of claim 1 wherein computing an orientation of the hand comprises:
defining a hand reference frame with an x component, a y component and a z component such that the x component is aligned with the 3D line, the y component is perpendicular to the x component, the y component is parallel to the viewing plane, and the z component is perpendicular to the x component and the y component.
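The hand-center computation in claim 1 can be sketched as follows. The function name, the cylinder radius, and the length are illustrative assumptions (sized to contain a typical hand), not values from the patent:

```python
import numpy as np

def hand_center(points, line_point, line_dir, radius=0.07, length=0.20):
    """Average the 3D locations of points falling inside a cylinder
    centered on the arm line, per claim 1. radius/length (meters) are
    illustrative choices large enough to include a typical hand."""
    d = line_dir / np.linalg.norm(line_dir)
    rel = points - line_point
    along = rel @ d                               # distance along the axis
    radial = np.linalg.norm(rel - np.outer(along, d), axis=1)
    mask = (np.abs(along) <= length / 2) & (radial <= radius)
    return points[mask].mean(axis=0)              # center = mean of hand pixels
```

Points inside the cylinder count as hand pixels; everything else (forearm points farther along the line, background points off-axis) is excluded before averaging.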
7. The method of claim 1 wherein isolating the hand comprises:
initializing a hand color probability density function; and
refining the hand color probability density function.
8. The method of claim 7 wherein initializing comprises:
using the hand pixels to initialize and evaluate the hue-saturation histogram of the hand color.
9. The method of claim 8 wherein refining comprises:
choosing a part of a color space that contains a majority of the hand pixels to define a hand color;
selecting those pixels in the image surrounding the hand that are of a color corresponding to the hand color; and
discarding the hand pixels which are not of the color corresponding to the hand color.
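The refinement of claims 7-9 can be sketched as a hue-saturation histogram seeded by the hand pixels. The bin count and the fraction of seed pixels retained (`keep`) are illustrative assumptions:

```python
import numpy as np

def refine_hand_mask(hs, seed_mask, bins=32, keep=0.8):
    """Hue-saturation histogram segmentation in the spirit of claims 7-9.
    hs: (H, W, 2) hue/saturation in [0, 1); seed_mask: initial hand pixels.
    bins and keep are illustrative settings, not from the patent."""
    idx = np.minimum((hs * bins).astype(int), bins - 1)
    hist = np.zeros((bins, bins))
    np.add.at(hist, (idx[seed_mask][:, 0], idx[seed_mask][:, 1]), 1)
    # choose the smallest set of bins holding `keep` of the seed pixels:
    # this is the part of color space containing a majority of hand pixels
    order = np.argsort(hist.ravel())[::-1]
    cum = np.cumsum(hist.ravel()[order])
    hand_bins = np.zeros(bins * bins, bool)
    hand_bins[order[: np.searchsorted(cum, keep * cum[-1]) + 1]] = True
    # reclassify every pixel by its bin: this selects surrounding pixels of
    # hand color and discards seed pixels whose color is not the hand color
    return hand_bins.reshape(bins, bins)[idx[..., 0], idx[..., 1]]
```

The returned mask can both grow (nearby skin-colored pixels) and shrink (mislabeled seed pixels) relative to the seed, matching the select/discard steps of claim 9.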
3. A method for recognizing gestures comprising:
obtaining an image data;
determining a hand pose estimation;
producing a frontal view of a hand based on performing perspective unwarping to produce an unwarped frontal view of the hand and scaling the unwarped frontal view of the hand into a template image;
isolating the hand from the background; and
classifying a gesture of the hand.

4. The method of claim 3 wherein performing perspective unwarping comprises:
mathematically moving a virtual camera to a canonical location with respect to a hand reference frame.
5. The method of claim 4 wherein mathematically moving the virtual camera comprises:
rotating the virtual camera to align a reference frame of the virtual camera with a reference frame of the hand; and
translating the virtual camera to a fixed distance from the orientation of the hand.
6. The method of claim 3 wherein scaling comprises:
choosing a fixed correspondence between the dimensions of the template image and the dimensions of a typical hand.
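The camera rotation of claims 4-6 can be sketched with the standard result that a pure camera rotation induces a planar homography on the image. Here `K` is the camera intrinsic matrix and `R_hand` gives the hand frame axes in camera coordinates; both are assumed inputs, and the template size is an illustrative value:

```python
import numpy as np

def unwarp_homography(K, R_hand):
    """Rotating the virtual camera to face the hand (claims 4-5). For a
    pure rotation, image points map by H = K @ R_hand.T @ inv(K), using
    the convention that R_hand holds the hand axes in camera coordinates;
    the translation to a fixed distance would be a separate step."""
    return K @ R_hand.T @ np.linalg.inv(K)

def scale_to_template(points_xy, hand_size_px, template_px=64):
    """Claim 6's fixed correspondence between template dimensions and a
    typical hand: a single scale factor maps hand size to template size."""
    return points_xy * (template_px / hand_size_px)
```

When the hand frame already coincides with the camera frame, `R_hand` is the identity and the homography reduces to the identity, as expected.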
10. The method of claim 3 wherein classifying a gesture comprises:
matching the hand template against a plurality of gesture templates.
22. The method of claim 3 wherein the image data comprises:
a color image and a depth data.
23. The method of claim 22 wherein the color image comprises a red value, a green value and a blue value for each pixel of a captured image, and the depth data comprises an x value in a camera reference frame, a y value in the camera reference frame, and a z value in the camera reference frame for each pixel of the captured image.
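Claim 23's per-pixel record (RGB color plus a 3D position in the camera reference frame) can be written down directly; a NumPy structured dtype is one illustrative layout, and the 640x480 resolution is an assumption:

```python
import numpy as np

# One record per pixel of the captured image, per claim 23: a red, green
# and blue value plus x, y, z in the camera reference frame.
pixel_dtype = np.dtype([
    ("r", np.uint8), ("g", np.uint8), ("b", np.uint8),      # color image
    ("x", np.float32), ("y", np.float32), ("z", np.float32) # depth data
])

frame = np.zeros((480, 640), dtype=pixel_dtype)  # one captured image
```

This makes explicit that the color image and the depth data are registered: every pixel carries both.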
24. The method of claim 3 wherein determining a hand pose estimation comprises:
performing background subtraction;
determining an arm orientation; and
computing the hand pose estimation based on the arm orientation.
29. The method of claim 24 wherein computing the hand pose estimation comprises:
computing a center of the hand; and
computing an orientation of the hand in relation to a camera reference frame.
11. A method for recognizing gestures comprising:
obtaining an image data;
determining a hand pose estimation;
producing a frontal view of a hand;
isolating the hand from the background;
classifying a gesture of the hand; and
matching the hand template against a plurality of gesture templates based on computing geometric moments of a first order, a second order and a third order, and applying a Mahalanobis distance metric.
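The matching step of claim 11 can be sketched as computing the raw geometric moments of the template image and comparing moment vectors with a Mahalanobis distance. The function names are hypothetical, and the template mean and covariance are assumed to come from training data:

```python
import numpy as np

def geometric_moments(img, max_order=3):
    """Raw geometric moments m_pq = sum x^p y^q I(x, y) for
    1 <= p + q <= max_order (claim 11 uses first through third order)."""
    h, w = img.shape
    y, x = np.mgrid[:h, :w].astype(float)
    return np.array([(img * x**p * y**q).sum()
                     for p in range(max_order + 1)
                     for q in range(max_order + 1 - p)
                     if 1 <= p + q <= max_order])

def mahalanobis(v, mean, cov_inv):
    """Claim 11's distance between a moment vector and a gesture
    template's moment distribution (cov_inv = inverse covariance)."""
    d = v - mean
    return float(np.sqrt(d @ cov_inv @ d))
```

A gesture would then be classified by evaluating `mahalanobis` against each stored template distribution and taking the smallest distance.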
12. A method for recognizing gestures comprising:
obtaining an image data;
performing background subtraction;
computing a hand pose estimation based on an arm orientation determination;
performing perspective unwarping to produce an unwarped frontal view of a hand;
scaling the unwarped frontal view of the hand into a template image;
isolating the hand from the background using color segmentation; and
classifying a gesture of the hand by matching the hand with a plurality of template hand images.

13. The method of claim 12 wherein the image data comprises:
a color image comprised of a red value, a green value and a blue value for each pixel of a captured image, and a depth data comprised of an x value in a camera reference frame, a y value in the camera reference frame, and a z value in the camera reference frame for each pixel of the captured image.
14. The method of claim 13 wherein performing background subtraction comprises:
selecting as a foreground arm image those pixels of the depth data where the difference between a mean background depth and the current depth is larger than an empirically defined threshold.
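Claim 14's depth-based background subtraction is a one-line threshold test. The 0.15 m threshold below is an illustrative stand-in for the patent's empirically defined value:

```python
import numpy as np

def foreground_arm(depth, mean_bg_depth, threshold=0.15):
    """Claim 14: a pixel belongs to the foreground arm when the mean
    background depth exceeds the current depth by more than an
    empirically defined threshold (0.15 m is an illustrative value)."""
    return (mean_bg_depth - depth) > threshold
```

Because the test uses depth rather than color, the arm is segmented even when its color resembles the background.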
15. The method of claim 14 wherein determining an arm orientation comprises:
computing a three-dimensional (3D) line that defines the arm orientation by fitting a first two dimensional (2D) line to the image data in the image plane and fitting a second 2D line to the image data in the depth dimension in the plane containing the first 2D line such that the second 2D line is perpendicular to the viewing plane.
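Claim 15's two-stage fit can be sketched as follows: a 2D line through the arm pixels in the image plane (here via the principal axis of the pixel cloud), then a second fit of depth against position along that line; together the two give a 3D direction. The use of PCA and `polyfit` is an illustrative choice, not the patent's prescribed estimator:

```python
import numpy as np

def arm_line_3d(u, v, z):
    """Claim 15: fit a 2D line in the image plane, then a second 2D line
    in the depth dimension along it, and combine into a 3D arm line."""
    uv = np.column_stack([u, v])
    mean_uv = uv.mean(axis=0)
    # first 2D line: principal axis of the arm pixels in the image plane
    _, _, vt = np.linalg.svd(uv - mean_uv)
    dir2d = vt[0]
    t = (uv - mean_uv) @ dir2d            # position along the 2D line
    # second 2D line: depth varies linearly along the arm, z = a*t + b
    a, b = np.polyfit(t, z, 1)
    point = np.array([*mean_uv, b])
    direction = np.array([*dir2d, a])
    return point, direction / np.linalg.norm(direction)
```

The returned point and unit direction define the 3D line that claims 1 and 10 reuse as the axis of the hand cylinder.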
16. The method of claim 15 wherein computing the hand pose estimation comprises:
computing a center of the hand; and
computing an orientation of the hand in relation to the camera reference frame.
17. A system comprising:
a stereo camera coupled to a computer, the computer comprising a processor and a storage device to read from a machine readable medium, the machine readable medium containing instructions which, when executed by the processor, cause the computer to perform operations comprising:
obtaining an image data;
performing background subtraction;
computing a hand pose estimation based on an arm orientation determination;
performing perspective unwarping to produce an unwarped frontal view of a hand;
scaling the unwarped frontal view of the hand into a template image;
isolating the hand from the background using color segmentation; and
classifying a gesture of the hand by matching the hand with a plurality of template hand images.

18. The system of claim 17 wherein the image data comprises:
a color image comprised of a red value, a green value and a blue value for each pixel of a captured image, and a depth data comprised of an x value in a camera reference frame, a y value in the camera reference frame, and a z value in the camera reference frame for each pixel of the captured image.
19. The system of claim 17 wherein performing background subtraction comprises:
selecting as a foreground arm image those pixels of the depth data where the difference between a mean background depth and the current depth is larger than an empirically defined threshold.
20. The system of claim 17 wherein determining an arm orientation comprises:
computing a three-dimensional (3D) line that defines the arm orientation by fitting a first two dimensional (2D) line to the image data in the image plane and fitting a second 2D line to the image data in the depth dimension in the plane containing the first 2D line such that the second 2D line is perpendicular to the viewing plane.
21. The system of claim 17 wherein computing the hand pose estimation comprises:
computing a center of the hand; and
computing an orientation of the hand in relation to a camera reference frame.
25. A method for recognizing gestures comprising:
obtaining an image data;
determining a hand pose estimation based on performing background subtraction, determining an arm orientation, and computing the hand pose estimation based on the arm orientation, wherein performing background subtraction includes selecting as a foreground arm image those pixels where the difference between a mean background depth and the current depth is larger than an empirically defined threshold;
producing a frontal view of a hand;
isolating the hand from the background; and
classifying a gesture of the hand.
26. A method for recognizing gestures comprising:
obtaining an image data;
determining a hand pose estimation based on performing background subtraction, determining an arm orientation, and computing the hand pose estimation based on the arm orientation, wherein determining an arm orientation includes fitting a first two dimensional (2D) line to the image data in the image plane;
fitting a second 2D line to the image data in the depth dimension in the plane containing the first 2D line such that the second 2D line is perpendicular to the viewing plane, and combining the first 2D line and the second 2D line into a three-dimensional (3D) line such that the 3D line defines the arm orientation;
producing a frontal view of a hand;
isolating the hand from the background; and
classifying a gesture of the hand.

27. The method of claim 26 wherein fitting a first 2D line comprises:
employing an iterative reweighted least square method; and
wherein fitting a second 2D line comprises:
employing an iterative reweighted least square method.
28. The method of claim 27 wherein employing an iterative reweighted least square method comprises:
using a Welsch M-estimator.
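The iterative reweighted least squares of claims 27-28 can be sketched for a line fit y = a*x + b. The Welsch M-estimator reweights each point by w(r) = exp(-(r/c)^2), so large residuals (outliers) are driven toward zero weight; the scale `c` and iteration count are illustrative settings:

```python
import numpy as np

def irls_line(x, y, c=2.0, iters=20):
    """Iteratively reweighted least squares with Welsch M-estimator
    weights w(r) = exp(-(r/c)^2), per claims 27-28. Fits y = a*x + b
    while suppressing outliers; c and iters are illustrative choices."""
    w = np.ones_like(x)
    a = b = 0.0
    for _ in range(iters):
        A = np.column_stack([x, np.ones_like(x)])
        # weighted normal equations: (A^T W A) [a, b]^T = A^T W y
        a, b = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * y))
        r = y - (a * x + b)
        w = np.exp(-(r / c) ** 2)       # Welsch weights for next pass
    return a, b
```

On arm pixels, this keeps the fitted line on the limb even when stray foreground pixels (e.g. parts of the torso) survive background subtraction.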
Specification