SYSTEMS AND METHODS FOR INITIALIZING MOTION TRACKING OF HUMAN HANDS USING TEMPLATE MATCHING WITHIN BOUNDED REGIONS
Abstract
Systems and methods for initializing motion tracking of human hands within bounded regions are disclosed. One embodiment includes: a processor; reference and alternate view cameras; and memory containing a plurality of templates that are rotated and scaled versions of a base template. In addition, a hand tracking application configures the processor to: obtain reference and alternate view frames of video data; generate a depth map; identify at least one bounded region within the reference frame of video data containing pixels having distances from the reference camera that are within a specific range of distances; determine whether any of the pixels within the at least one bounded region are part of a human hand; track the motion of the part of the human hand in a sequence of frames of video data obtained from the reference camera; and confirm that the tracked motion corresponds to a predetermined initialization gesture.
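As an illustration only, the depth-map step in the abstract can be sketched with the standard rectified-stereo relation Z = f * B / d. This section does not state that formula, so it is an assumption here, as are the function name `depth_from_disparity` and the parameters `focal_length_px` and `baseline_m`.

```python
# Minimal sketch, assuming a rectified stereo pair and the pinhole relation
# depth = focal_length * baseline / disparity. All names are hypothetical.

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return per-pixel depth in metres; None where disparity is not positive."""
    return [
        [focal_length_px * baseline_m / d if d > 0 else None for d in row]
        for row in disparity_px
    ]


# Disparity (in pixels) between corresponding reference and alternate view pixels.
disparity = [
    [0.0, 10.0],
    [20.0, 40.0],
]
depth_map = depth_from_disparity(disparity, focal_length_px=500.0, baseline_m=0.1)
```

Larger disparities map to smaller distances, so a hand held toward the cameras appears as a cluster of small depth values.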
113 Claims
1-83. (canceled)
84. A real-time gesture based interactive system, comprising:
a processor;
a reference camera configured to capture sequences of frames of video data, where each frame of video data comprises intensity information for a plurality of pixels;
an alternate view camera configured to capture sequences of frames of video data, where each frame of video data comprises intensity information for a plurality of pixels; and
memory containing a hand tracking application; and
wherein the hand tracking application configures the processor to:
obtain a reference frame of video data from the reference camera;
obtain an alternate view frame of video data from the alternate view camera;
generate a depth map containing distances from the reference camera for pixels in the reference frame of video data using information including the disparity between corresponding pixels within the reference and alternate view frames of video data; and
identify at least one bounded region within the reference frame of video data containing pixels having distances from the reference camera that are within a specific range of distances from the reference camera;
determine whether any of the pixels within the at least one bounded region within the reference frame are part of a human hand;
obtain a sequence of frames of video data from the reference camera;
track the motion of the part of the human hand visible in the sequence of frames of video data;
confirm that the tracked motion of the part of the human hand visible in the sequence of frames of video data corresponds to a predetermined initialization gesture; and
commence tracking the human hand as part of a gesture based interactive session.
Dependent claims: 85-111.
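As an illustration only, the bounded-region step of claim 84 might be sketched as below. The claim recites "at least one bounded region"; for brevity this sketch returns a single axis-aligned box around every in-range pixel, which is a simplification. The function name `bounded_region` and the parameters `near` and `far` are hypothetical.

```python
# Minimal sketch, assuming the depth map is a list of rows holding distances
# in metres (None where no depth is available). All names are hypothetical.

def bounded_region(depth_map, near, far):
    """Return (top, left, bottom, right) enclosing in-range pixels, or None."""
    hits = [
        (r, c)
        for r, row in enumerate(depth_map)
        for c, z in enumerate(row)
        if z is not None and near <= z <= far
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


depth_map = [
    [3.0, 3.0, 3.0, 3.0],
    [3.0, 0.6, 0.7, 3.0],
    [3.0, 0.5, None, 3.0],
    [3.0, 3.0, 3.0, 3.0],
]
region = bounded_region(depth_map, near=0.4, far=0.8)
```

Restricting the later hand search to such a region reduces the number of pixels that have to be examined.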
112. A real-time gesture based interactive system, comprising:
a processor;
a reference camera configured to capture sequences of frames of video data, where each frame of video data comprises color information for a plurality of pixels;
an alternate view camera configured to capture sequences of frames of video data, where each frame of video data comprises color information for a plurality of pixels; and
memory containing:
a hand tracking application; and
a set of edge feature templates comprising a plurality of edge feature templates that are rotated and scaled versions of a base template;
wherein the hand tracking application configures the processor to:
obtain a reference frame of video data from the reference camera;
obtain an alternate view frame of video data from the alternate view camera;
generate a depth map containing distances from the reference camera for pixels in the reference frame of video data using information including the disparity between corresponding pixels within the reference and alternate view frames of video data;
identify at least one bounded region within the reference frame of video data containing pixels having distances from the reference camera that are within a specific range of distances from the reference camera;
determine whether any of the pixels within the at least one bounded region within the reference frame are part of a human hand visible in the sequence of frames of video data, where a part of a human hand is identified by searching the frame of video data for a grouping of pixels that have image gradient orientations that match the edge features of one of the plurality of edge feature templates;
obtain a sequence of frames of video data from the reference camera;
track the motion of the part of the human hand visible in the sequence of frames of video data;
confirm that the tracked motion of the part of the human hand visible in the sequence of frames of video data corresponds to a predetermined initialization gesture, where the predetermined initialization gesture comprises a finger oscillating from side to side within a predetermined gesture range;
initialize the image capture settings of the reference camera used during the gesture based interactive session by adjusting the exposure and gain of the reference camera as additional frames of video data are captured by the reference camera so that the brightness of at least one pixel that is part of a human hand visible in the additional frames of video data satisfies a predetermined criterion; and
commence tracking the human hand as part of a gesture based interactive session.
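As an illustration only, one way to test tracked motion for the side-to-side finger oscillation recited in claim 112 is to count direction reversals of the fingertip's horizontal position. The claim does not specify a detection rule, so the reversal count, the thresholds `min_reversals` and `min_step`, and the reading of "gesture range" as a maximum horizontal extent are all assumptions.

```python
# Minimal sketch, assuming xs holds one tracked fingertip x coordinate
# (in pixels) per frame. All names and thresholds are hypothetical.

def is_side_to_side_gesture(xs, gesture_range, min_reversals=2, min_step=1.0):
    """Return True if xs reverses direction often enough within gesture_range."""
    if len(xs) < 2:
        return False
    # Reject motion whose horizontal extent exceeds the allowed range.
    if max(xs) - min(xs) > gesture_range:
        return False
    reversals = 0
    last_direction = 0
    for previous, current in zip(xs, xs[1:]):
        step = current - previous
        if abs(step) < min_step:
            continue  # ignore jitter smaller than min_step
        direction = 1 if step > 0 else -1
        if last_direction and direction != last_direction:
            reversals += 1
        last_direction = direction
    return reversals >= min_reversals


wave = [0, 5, 10, 5, 0, 5, 10]   # right, left, right
swipe = [0, 5, 10, 15]           # one direction only
confirmed = is_side_to_side_gesture(wave, gesture_range=20)
rejected = is_side_to_side_gesture(swipe, gesture_range=20)
```

Requiring reversals rather than mere displacement helps distinguish a deliberate wave from a hand that simply passes through the field of view.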
113. A method of commencing tracking of a human hand using a real-time gesture based interactive system, comprising:
capturing a reference frame of video data using a reference camera, where the reference frame of video data comprises intensity information for a plurality of pixels;
capturing an alternate view frame of video data using an alternate view camera, where the alternate view frame of video data comprises intensity information for a plurality of pixels;
generating a depth map containing distances from the reference camera for pixels in the reference frame of video data using a processor configured by a hand tracking application and information including the disparity between corresponding pixels within the reference and alternate view frames of video data;
identifying at least one bounded region within the reference frame of video data containing pixels having distances from the reference camera that are within a specific range of distances from the reference camera using the processor configured by the hand tracking application;
determining whether any of the pixels within the at least one bounded region within the reference frame are part of a human hand visible in the reference frame of video data using the processor configured using the hand tracking application, where a part of a human hand is identified by searching the reference frame of video data for a grouping of pixels that have image gradient orientations that match the edge features of one of the plurality of edge feature templates;
obtaining a sequence of frames of video data from the reference camera;
tracking the motion of the part of the human hand visible in the sequence of frames of video data using the processor configured using the hand tracking application;
confirming that the tracked motion of the part of the human hand visible in the sequence of frames of video data corresponds to a predetermined initialization gesture using the processor configured using the hand tracking application; and
commence tracking the human hand as part of a gesture based interactive session using the processor configured using the hand tracking application.
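As an illustration only, matching image gradient orientations against an edge feature template, as recited in claim 113, might be sketched as below. The claim does not define a similarity measure; the score used here (mean absolute cosine of the orientation difference), the central-difference gradient, and the template layout of `(row_offset, col_offset, orientation)` tuples are assumptions chosen for brevity.

```python
import math

# Minimal sketch, assuming a grayscale image stored as a list of rows and a
# template given as (row_offset, col_offset, orientation_radians) edge features.
# All names are hypothetical.

def gradient_orientation(img, r, c):
    """Return the gradient orientation at (r, c), or None where it is flat."""
    gx = img[r][c + 1] - img[r][c - 1]
    gy = img[r + 1][c] - img[r - 1][c]
    if gx == 0 and gy == 0:
        return None
    return math.atan2(gy, gx)


def match_score(img, template, top, left):
    """Mean |cos| of the orientation difference; 1.0 means perfectly aligned."""
    total = 0.0
    for dr, dc, theta in template:
        phi = gradient_orientation(img, top + dr, left + dc)
        if phi is not None:
            total += abs(math.cos(phi - theta))
    return total / len(template)


def best_match(img, template, height, width):
    """Slide the template over the image; return (score, top, left) of the best fit."""
    best = (-1.0, None, None)
    for top in range(1, len(img) - height):
        for left in range(1, len(img[0]) - width):
            score = match_score(img, template, top, left)
            if score > best[0]:
                best = (score, top, left)
    return best


# A 6x6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
# Two stacked edge features whose gradients point along +x (orientation 0).
vertical_edge = [(0, 0, 0.0), (1, 0, 0.0)]
score, top, left = best_match(image, vertical_edge, height=2, width=1)
```

In a fuller version the search would be repeated for each rotated and scaled template and confined to the bounded region rather than run over the whole frame.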
Specification