Video hand image three-dimensional computer interface
2 Assignments
0 Petitions
Abstract
A video gesture-based three-dimensional computer interface system that uses images of hand gestures to control a computer and that tracks motion of the user's hand, or of an elongated object or a portion thereof, in a three-dimensional coordinate system with five degrees of freedom. The system includes a computer with image processing capabilities and at least two cameras connected to the computer. During operation of the system, hand images from the cameras are continually converted to a digital format and input to the computer for processing. The results of the processing and attempted recognition of each image are then sent to an application or the like executed by the computer for performing various functions or operations. When the computer recognizes a hand gesture as a “point” gesture with one finger extended, it uses information derived from the images to track three-dimensional coordinates of the extended finger with five degrees of freedom. The computer utilizes the two-dimensional images obtained by each camera to derive three-dimensional position (in an x, y, z coordinate system) and orientation (azimuth and elevation angles) coordinates of the extended finger.
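The position-tracking half of this pipeline is classical stereo triangulation: each camera's 2D fingertip observation is back-projected into a 3D ray, and the fingertip position is taken as the "virtual intersection" of the two rays. A minimal sketch under standard assumptions (calibrated pinhole cameras given as 3x4 perspective projection matrices; all function names are illustrative, not from the patent):

```python
# Sketch: triangulate a 3D point from two pixel observations, assuming
# each camera is modeled by a known 3x4 perspective projection matrix P.
# Illustrative code, not the patent's implementation.
import numpy as np

def pixel_ray(P, uv):
    """Back-project pixel (u, v) through projection matrix P into a
    3D ray (origin, unit direction) in world coordinates."""
    M, p4 = P[:, :3], P[:, 3]
    center = -np.linalg.solve(M, p4)      # camera center C: P @ [C; 1] = 0
    direction = np.linalg.solve(M, np.array([uv[0], uv[1], 1.0]))
    return center, direction / np.linalg.norm(direction)

def closest_point(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two (possibly skew) rays,
    i.e. the 'virtual intersection' of the two virtual lines."""
    # Solve for t1, t2 minimizing |(o1 + t1*d1) - (o2 + t2*d2)|.
    A = np.stack([d1, -d2], axis=1)       # 3x2 system
    t = np.linalg.lstsq(A, o2 - o1, rcond=None)[0]
    return 0.5 * ((o1 + t[0] * d1) + (o2 + t[1] * d2))
```

With noisy detections the two rays are generally skew, which is why the midpoint of the common perpendicular is used rather than a literal intersection.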
258 Citations
14 Claims
1. A method of tracking, in a real world scene, three-dimensional position coordinates and orientation angles of an object having an end portion, a tip and an axis, said method being implemented in a computer system having a first video acquisition device and a second video acquisition device for monitoring, from different positions, an identification zone defined in the real world scene, said method comprising the steps of:
(a) acquiring a first image of the object from the first video acquisition device and simultaneously acquiring a second image of the object from the second video acquisition device when the object is present in the identification zone;
(b) determining a first set of pixel coordinates corresponding to the tip of the object and a first orientation angle for the end portion of the object from said first image and determining a second set of pixel coordinates corresponding to the tip of the object and a second orientation angle for the end portion of the object from said second image;
(c) determining a first virtual line in the real world scene in accordance with said first set of pixel coordinates and a first perspective projection matrix defined in accordance with a monitoring orientation of the first video acquisition device, and determining a second virtual line in the real world scene in accordance with said second set of pixel coordinates and a second perspective projection matrix defined in accordance with a monitoring orientation of the second video acquisition device;
(d) determining the three-dimensional position coordinates of the object end portion by identifying coordinates of a virtual intersection of said first and second virtual lines;
(e) determining a first parameter vector representative of a first linear projection of the object axis from said first image in accordance with said first set of pixel coordinates and said first orientation angle, and determining a second parameter vector representative of a second linear projection of the object axis from said second image in accordance with said second set of pixel coordinates and said second orientation angle;
(f) determining parameters of a first virtual plane using said first parameter vector and said first perspective projection matrix, and defining a second virtual plane using said second parameter vector and said second perspective projection matrix; and
(g) determining the three-dimensional orientation angles of the object by identifying orientation parameters of a third line defined by an intersection of said first and said second virtual planes. - View Dependent Claims (2, 3, 4, 5)
2. The method of claim 1, further comprising the step of:
(h) prior to said step (a), calibrating the first and second video acquisition devices.
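Calibrating the devices in claim 2 amounts to determining each camera's perspective projection matrix. The patent does not prescribe a method here; one standard approach is the direct linear transform (DLT), which estimates the 3x4 matrix from six or more known 3D-to-2D point correspondences. A hedged sketch (names are illustrative):

```python
# Sketch: estimate a 3x4 perspective projection matrix P from known
# 3D -> 2D point correspondences via the direct linear transform (DLT).
# One common calibration technique; not necessarily the patent's.
import numpy as np

def calibrate_dlt(world_pts, pixel_pts):
    """world_pts: (N, 3), pixel_pts: (N, 2), N >= 6 non-degenerate points.
    Returns P (3x4), defined up to an overall scale factor."""
    A = []
    for (X, Y, Z), (u, v) in zip(world_pts, pixel_pts):
        # Each correspondence contributes two linear equations in P's entries.
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    # The smallest right singular vector of A is the least-squares solution.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)
```

Because P is recovered only up to scale, calibration quality is checked by reprojecting known points rather than by comparing matrix entries directly.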
3. The method of claim 1, further comprising an additional video acquisition device connected to the computer and oriented toward the identification zone, wherein:
said step (a) further comprises acquiring a third image of the object from the third video acquisition device when the object is present in the identification zone;
said step (b) further comprises determining a third set of pixel coordinates and a third orientation angle for the end portion of the object from said third image;
said step (c) further comprises determining a third virtual line in the real world scene in accordance with said third set of pixel coordinates and said third orientation angle;
said step (d) further comprises determining the three-dimensional position coordinates of the object end portion by identifying coordinates of a virtual intersection of said first, second, and third virtual lines;
said step (e) further comprises determining a third parameter vector representative of a third linear projection of the object axis from said third image in accordance with said third set of pixel coordinates and said third orientation angle;
said step (f) further comprises determining parameters of a third virtual plane using said third parameter vector and a third perspective projection matrix defined in accordance with the orientation of said additional video acquisition device; and
said step (g) further comprises determining the three-dimensional orientation angles of the object by identifying orientation parameters of a third line defined by an intersection of said first, second, and third virtual planes.
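With three or more cameras, the back-projected rays will generally not meet exactly, so the "virtual intersection" of claim 3's step (d) is naturally computed as a least-squares point. A sketch of that computation (illustrative names; the patent does not specify the solver):

```python
# Sketch: least-squares "virtual intersection" of N >= 2 lines, each
# given as (origin, unit direction). Minimizes the summed squared
# distance from the point to all lines. Illustrative, not the patent's code.
import numpy as np

def intersect_lines(origins, directions):
    """origins, directions: (N, 3) arrays; directions must be unit-length.
    Returns the point minimizing total squared distance to the N lines."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        M = np.eye(3) - np.outer(d, d)   # projector orthogonal to d
        A += M
        b += M @ o
    return np.linalg.solve(A, b)         # normal equations: A x = b
```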
4. The method of claim 1, wherein said orientation angles comprise azimuth and elevation angles.
5. The method of claim 1, wherein said object is a user's hand and wherein said end portion is the user's extended finger.
6. A method of tracking, in a real world scene, three-dimensional position coordinates and orientation angles of a user's hand and of the user's extended finger having an axis and a tip, said method being implemented in a computer system having a first video acquisition device and a second video acquisition device for monitoring, from different positions, an identification zone defined in the real world scene, said method comprising the steps of:
(a) acquiring a first image of the user's hand from the first video acquisition device and simultaneously acquiring a second image of the user's hand from the second video acquisition device when the user's hand is disposed within the identification zone;
(b) analyzing said first and second images to determine whether both said first and second images correspond to a pointing gesture of the user's hand wherein one of the fingers of the hand is extended, and (1) when both said first and second images are determined to correspond to the pointing gesture of the user's hand, identifying an end portion of the extended finger on each of said first and second images;
(2) when only one of said first and second images is determined to correspond to the pointing gesture of the user's hand, repeating said step (a); and
(3) when both said first and second images are determined to not correspond to the pointing gesture of the user's hand, repeating said step (a);
(c) determining a first set of pixel coordinates corresponding to the tip of the extended finger and a first orientation angle for the end portion of the extended finger from said first image and determining a second set of pixel coordinates corresponding to the tip of the extended finger and a second orientation angle for the end portion of the extended finger from said second image;
(d) determining a first virtual line in the real world scene in accordance with said first set of pixel coordinates and a first projection matrix defined in accordance with a monitoring orientation of the first video acquisition device, and determining a second virtual line in the real world scene in accordance with said second set of pixel coordinates and a second projection matrix defined in accordance with a monitoring orientation of the second video acquisition device;
(e) determining the three-dimensional position coordinates of the extended finger end portion by identifying coordinates of a virtual intersection of said first and second virtual lines;
(f) determining a first parameter vector representative of a first linear projection of the finger axis from said first image in accordance with said first set of pixel coordinates and said first orientation angle, and determining a second parameter vector representative of a second linear projection of the finger axis from said second image in accordance with said second set of pixel coordinates and said second orientation angle;
(g) determining the parameters of a first virtual plane using said first parameter vector and said first projection matrix, and determining a second virtual plane using said second parameter vector and said second projection matrix; and
(h) determining the three-dimensional orientation angles of the extended finger by identifying orientation parameters of a third line defined by an intersection of said first and said second virtual planes. - View Dependent Claims (7, 8, 9)
7. The method of claim 6, further comprising the step of:
(i) prior to said step (a), calibrating the first and the second video acquisition devices.
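Steps (b)(1)-(3) of claim 6 describe a gating loop: stereo frames are reacquired until both views classify as a pointing gesture, and only then is the fingertip located. A control-flow sketch with hypothetical stand-ins for the capture, classifier, and fingertip-locating routines (none of these names come from the patent):

```python
# Sketch of the step (b) control flow: keep acquiring frame pairs until
# BOTH views are classified as a pointing gesture. All callables are
# hypothetical stand-ins for the patent's image-processing stages.
def track_pointing(capture_pair, is_pointing, find_fingertip, max_frames=100):
    """capture_pair() -> (img1, img2); is_pointing(img) -> bool;
    find_fingertip(img) -> (u, v) pixel coordinates of the fingertip."""
    for _ in range(max_frames):
        img1, img2 = capture_pair()
        if is_pointing(img1) and is_pointing(img2):
            # Step (b)(1): both views agree -> locate the fingertip in each.
            return find_fingertip(img1), find_fingertip(img2)
        # Steps (b)(2)/(b)(3): one or both views disagree -> reacquire.
    return None  # gave up after max_frames without seeing a pointing gesture
```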
8. The method of claim 6, further comprising an additional video acquisition device connected to the computer and oriented toward the identification zone, wherein:
said step (a) further comprises acquiring a third image of the user's hand from the third video acquisition device when the extended finger is present in the identification zone;
said step (c) further comprises determining a third set of pixel coordinates corresponding to the tip of the extended finger and a third orientation angle for the end portion of the extended finger from said third image;
said step (d) further comprises determining a third virtual line in the real world scene in accordance with said third set of pixel coordinates and said projection matrix;
said step (e) further comprises determining the three-dimensional position coordinates of the extended finger end portion by identifying coordinates of a virtual intersection of said first, second, and third virtual lines;
said step (f) further comprises determining a third parameter vector representative of a third linear projection of the extended finger axis from said third image in accordance with said third set of pixel coordinates and said third orientation angle;
said step (g) further comprises computing parameters of a third virtual plane using said third parameter vector and a third projection matrix defined in accordance with the orientation of the additional video acquisition device; and
said step (h) further comprises determining the three-dimensional orientation angles of the extended finger by identifying orientation parameters of a third line defined by an intersection of said first, second, and third virtual planes.
9. The method of claim 6, wherein said orientation angles comprise azimuth and elevation angles.
10. A system for tracking, in a real world scene, three-dimensional position coordinates and orientation angles of an object having an end portion, a tip and an axis, the system comprising:
a first video acquisition device and a second video acquisition device for monitoring, from different positions, an identification zone defined in the real world scene, and a computer connected to said first and second video acquisition devices and operable for:
acquiring a first image of the object from the first video acquisition device and simultaneously acquiring a second image of the object from the second video acquisition device when the object is present in the identification zone;
determining a first set of pixel coordinates corresponding to the tip of the object and a first orientation angle for the end portion of the object from said first image and determining a second set of pixel coordinates of the tip of the object and a second orientation angle for the end portion of the object from said second image;
determining a first virtual line in the real world scene in accordance with said first set of pixel coordinates and a first projection matrix defined by a monitoring orientation of said first video acquisition device, and defining a second virtual line in the real world scene in accordance with said second set of pixel coordinates and a second projection matrix defined by a monitoring orientation of said second video acquisition device;
determining the three-dimensional position coordinates of the object end portion by identifying coordinates of a virtual intersection of said first and second virtual lines;
determining a first parameter vector representative of a first linear projection of the object axis from said first image in accordance with said first set of pixel coordinates and said first orientation angle, and determining a second parameter vector representative of a second linear projection of the object axis from said second image in accordance with said second set of pixel coordinates and said second orientation angle;
computing parameters of a first virtual plane using said first parameter vector and said first projection matrix, and computing parameters of a second virtual plane using said second parameter vector and said second projection matrix; and
determining the three-dimensional orientation angles of the object by identifying orientation parameters of a third line defined by an intersection of said first and said second virtual planes. - View Dependent Claims (11, 12)
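The orientation steps rest on a standard projective-geometry fact: an image line l (homogeneous [a, b, c], satisfying l · [u, v, 1] = 0) back-projects through a camera with projection matrix P to the plane pi = P^T l, and the 3D axis direction lies along the cross product of the two planes' normals. A sketch under that model (the azimuth/elevation convention below is one common choice, not necessarily the patent's):

```python
# Sketch: recover the 3D axis orientation from two image-line observations.
# Each image line l back-projects through its camera's 3x4 matrix P to a
# plane pi = P^T @ l; the axis direction is along the cross product of the
# two plane normals. Angle convention is illustrative.
import numpy as np

def axis_orientation(P1, l1, P2, l2):
    pi1 = P1.T @ l1                        # 4-vector plane [normal; offset]
    pi2 = P2.T @ l2
    d = np.cross(pi1[:3], pi2[:3])         # direction of plane intersection
    d = d / np.linalg.norm(d)
    azimuth = np.arctan2(d[1], d[0])       # angle in the x-y plane
    elevation = np.arctan2(d[2], np.hypot(d[0], d[1]))  # angle above x-y
    return azimuth, elevation
```

Note the direction is recovered only up to sign, since a line has no intrinsic orientation; a tracker would disambiguate using the fingertip position from the earlier steps.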
13. A system for tracking, in a real world scene, three-dimensional position coordinates and orientation angles of a user's hand and of the user's extended finger having an axis, the system comprising:
a first video acquisition device and a second video acquisition device for monitoring, from different positions, an identification zone defined in the real world scene, and a computer connected to said first and second video acquisition devices operable for:
acquiring a first image of the user's hand from the first video acquisition device and simultaneously acquiring a second image of the user's hand from the second video acquisition device when the user's hand is disposed within the identification zone;
analyzing said first and second images to determine whether both said first and second images correspond to a pointing gesture of the user's hand wherein one of the fingers of the hand is extended, and (1) when both said first and second images are determined to correspond to the pointing gesture of the user's hand, identifying an end portion of the extended finger on each of said first and second images;
(2) when only one of said first and second images is determined to correspond to the pointing gesture of the user's hand, acquiring a next first image of the user's hand from the first video acquisition device and simultaneously acquiring a next second image of the user's hand from the second video acquisition device; and
(3) when both said first and second images are determined to not correspond to the pointing gesture of the user's hand, acquiring a next first image of the user's hand from the first video acquisition device and simultaneously acquiring a next second image of the user's hand from the second video acquisition device;
determining a first set of pixel coordinates and a first orientation angle for the end portion of the extended finger from said first image, and determining a second set of pixel coordinates and a second orientation angle for the end portion of the extended finger from said second image;
determining a first virtual line in the real world scene in accordance with said first set of pixel coordinates and a first projection matrix defined in accordance with a monitoring orientation of said first video acquisition device, and determining a second virtual line in the real world scene in accordance with said second set of pixel coordinates and a second projection matrix defined in accordance with a monitoring orientation of said second video acquisition device;
determining the three-dimensional position coordinates of the extended finger end portion by identifying coordinates of a virtual intersection of said first and second virtual lines;
determining a first parameter vector representative of a first linear projection of the finger axis from said first image in accordance with said first set of pixel coordinates and said first orientation angle, and determining a second parameter vector representative of a second linear projection of the finger axis from said second image in accordance with said second set of pixel coordinates and said second orientation angle;
computing parameters of a first virtual plane using said first parameter vector and said first projection matrix, and computing parameters of a second virtual plane using said second parameter vector and said second projection matrix; and
determining the three-dimensional orientation angles of the extended finger by identifying orientation parameters of a third line defined by an intersection of said first and said second virtual planes. - View Dependent Claims (14)
Specification