Controlled human pose estimation from depth image streams
First Claim
1. A computer based method for estimating a pose of a human actor, the method comprising:
receiving a depth image of the human actor;
detecting a head, neck, and trunk (H-N-T) template of the human actor based on the depth image;
generating a skeleton image of the human actor;
detecting a plurality of end points of the skeleton image;
determining whether self occlusion is present in the depth image, the determining comprising:
determining whether four of the detected end points correspond to hands and feet of the human actor;
determining whether lengths of skeletonized branches between the detected end points corresponding to the hands and feet of the human actor and corresponding entry points of the H-N-T template exceed a minimum distance;
determining that self occlusion is not present if (1) four of the detected end points correspond to the hands and feet of the human actor, and (2) the lengths of the skeletonized branches exceed the minimum distance; and
determining that self occlusion is present if (1) no more than three of the detected end points correspond to the hands and feet of the human actor, or (2) the length of at least one of the skeletonized branches does not exceed the minimum distance;
responsive to self occlusion being determined present in the depth image, detecting a plurality of limb regions in the depth image, calculating a probability for each pixel of each detected limb region for a likelihood of the pixel belonging to a particular limb in the depth image, and assigning a label for each limb region based on the calculated probabilities of its pixels;
detecting a plurality of features of the human actor based on the H-N-T template and the labels assigned to the limb regions; and
estimating a pose of the human actor in a human model based on the features and kinematic constraints of the human model.
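The self-occlusion test recited in claim 1 reduces to two checks on the skeleton end points, which can be sketched as follows. This is a minimal illustration and not the patented implementation: the names `Branch`, `LIMB_ENDS`, and `self_occlusion_present`, the dictionary representation of matched end points, and the numeric threshold are all hypothetical. The claim does not specify how end points are matched to hands and feet or how branch length is measured.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical names for the four limb end points named in the claim.
LIMB_ENDS = ("left_hand", "right_hand", "left_foot", "right_foot")


@dataclass
class Branch:
    """Skeletonized branch from a limb end point to its H-N-T entry point."""
    length: float  # branch length, in whatever unit the skeleton image uses


def self_occlusion_present(branches: Dict[str, Optional[Branch]],
                           min_length: float) -> bool:
    """Apply the two claimed conditions for self occlusion.

    `branches` maps each limb end name to its branch, or to None when no
    detected end point was matched to that hand or foot (an assumed encoding).
    """
    matched = [branches.get(name) for name in LIMB_ENDS]
    # Condition (1): no more than three end points correspond to hands and feet.
    if any(branch is None for branch in matched):
        return True
    # Condition (2): at least one branch does not exceed the minimum distance.
    return any(branch.length <= min_length for branch in matched)


# Example: all four limbs found, but one branch is too short.
example = {name: Branch(50.0) for name in LIMB_ENDS}
example["right_foot"] = Branch(10.0)
occluded = self_occlusion_present(example, min_length=30.0)
```

Only when this test reports occlusion does the claim proceed to limb-region detection and per-pixel labeling; otherwise the end points themselves identify the hands and feet.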
Abstract
A system, method, and computer program product for estimating human body pose are described. According to one aspect, anatomical features are detected in a depth image of a human actor. The method detects a head, neck, and trunk (H-N-T) template in the depth image, and detects limbs in the depth image based on the H-N-T template. The anatomical features are detected based on the H-N-T template and the limbs. A pose of a human model is then estimated based on the detected features and kinematic constraints of the human model.
34 Citations
22 Claims
11. A system for estimating a pose of a human actor, the system comprising:
a computer processor for executing executable computer program code;
a computer-readable storage medium containing the executable computer program code for performing a method comprising:
receiving a depth image of the human actor;
detecting a head, neck, and trunk (H-N-T) template of the human actor based on the depth image;
generating a skeleton image of the human actor;
detecting a plurality of end points of the skeleton image;
determining whether self occlusion is present in the depth image, the determining comprising:
determining whether four of the detected end points correspond to hands and feet of the human actor;
determining whether lengths of skeletonized branches between the detected end points corresponding to the hands and feet of the human actor and corresponding entry points of the H-N-T template exceed a minimum distance;
determining that self occlusion is not present if (1) four of the detected end points correspond to the hands and feet of the human actor, and (2) the lengths of the skeletonized branches exceed the minimum distance; and
determining that self occlusion is present if (1) no more than three of the detected end points correspond to the hands and feet of the human actor, or (2) the length of at least one of the skeletonized branches does not exceed the minimum distance;
responsive to self occlusion being determined present in the depth image, detecting a plurality of limb regions in the depth image, calculating a probability for each pixel of each detected limb region for a likelihood of the pixel belonging to a particular limb in the depth image, the probability calculated based on (1) previously generated predicted limb poses from an earlier-in-time depth image and (2) previous occlusion states of the limbs from the earlier-in-time depth image, and assigning a label for each limb region based on the calculated probabilities of its pixels;
detecting a plurality of features of the human actor based on the H-N-T template and the labels assigned to the limb regions; and
estimating a pose of the human actor in a human model based on the features and kinematic constraints of the human model.
12. The method of claim 1, wherein the probability for each pixel of each detected limb region for the likelihood of the pixel belonging to the particular limb in the depth image is calculated based on (1) previously generated predicted limb poses from an earlier-in-time depth image and (2) previous occlusion states of the limbs from the earlier-in-time depth image.
22. The system of claim 11, wherein the probability for each pixel of each detected limb region for the likelihood of the pixel belonging to the particular limb in the depth image is calculated based on (1) previously generated predicted limb poses from an earlier-in-time depth image and (2) previous occlusion states of the limbs from the earlier-in-time depth image.
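One way to picture the per-pixel probability and the region labeling recited in these claims is sketched below. The claims state only that the probability depends on predicted limb poses and occlusion states from an earlier depth image; they do not give a formula. Everything concrete here is therefore an assumption made for illustration: reducing each predicted limb pose to a single 3-D point, the Gaussian distance score, the `sigma` and `occluded_weight` parameters, the down-weighting of previously occluded limbs, and labeling a region by the largest summed probability.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, depth); an assumed pixel encoding


def pixel_limb_probabilities(pixel: Point,
                             predicted: Dict[str, Point],
                             was_occluded: Dict[str, bool],
                             sigma: float = 20.0,
                             occluded_weight: float = 0.5) -> Dict[str, float]:
    """Normalized probability that one pixel belongs to each limb.

    Assumption: a limb's score falls off with squared distance to its
    predicted position, and limbs occluded in the earlier image count less.
    """
    scores = {}
    for limb, center in predicted.items():
        dist_sq = sum((p - c) ** 2 for p, c in zip(pixel, center))
        weight = occluded_weight if was_occluded.get(limb, False) else 1.0
        scores[limb] = weight * math.exp(-dist_sq / (2.0 * sigma ** 2))
    total = sum(scores.values())
    if total == 0.0:
        # Pixel is far from every prediction: fall back to a uniform guess.
        return {limb: 1.0 / len(scores) for limb in scores}
    return {limb: score / total for limb, score in scores.items()}


def label_region(pixels: Sequence[Point],
                 predicted: Dict[str, Point],
                 was_occluded: Dict[str, bool]) -> str:
    """Label a limb region with the limb whose summed pixel probability is largest."""
    totals = {limb: 0.0 for limb in predicted}
    for pixel in pixels:
        probs = pixel_limb_probabilities(pixel, predicted, was_occluded)
        for limb, prob in probs.items():
            totals[limb] += prob
    return max(totals, key=totals.get)


# Example with two hypothetical limbs predicted from the earlier frame.
predicted = {"left_arm": (0.0, 0.0, 100.0), "right_arm": (100.0, 0.0, 100.0)}
was_occluded = {"left_arm": False, "right_arm": False}
label = label_region([(5.0, 2.0, 101.0), (8.0, -3.0, 99.0)], predicted, was_occluded)
```

The sketch expects at least one predicted limb. A fuller treatment would represent each limb pose as a segment or volume rather than a point.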
Specification