Method for estimating a pose of an articulated object model
First Claim
1. A computer-implemented method for estimating a pose of an articulated object model, wherein the articulated object model is a computer based 3D model that is executed on a digital computer or a computer system comprising a computer memory and a processing unit coupled to the computer memory, wherein the computer based 3D model is of a real world object observed by one or more source cameras, and the articulated object model represents a plurality of joints and of links that link the joints, and wherein the pose of the articulated object model is defined by the spatial location of the joints, the method comprising the steps of:
obtaining at least one source image from a video stream comprising a view of the real world object recorded by a source camera;
processing by the processing unit the at least one source image to extract a corresponding source image segment comprising the view of the real world object separated from the image background;
maintaining, in a database in computer readable form, a set of reference silhouettes, each reference silhouette being associated with an articulated object model and with a particular reference pose of this articulated object model;
comparing by the processing unit the at least one source image segment to the reference silhouettes and selecting a predetermined number of reference silhouettes by taking into account, for each reference silhouette, a matching error and/or a coherence error, wherein the processing unit determines the matching error, which indicates how closely the reference silhouette matches the source image segment, and/or determines the coherence error, which indicates how consistent the reference pose is with the pose of the same real world object as estimated from at least one of the preceding and following source images of the video stream;
retrieving by the processing unit the reference poses of the articulated object models associated with the selected reference silhouettes; and
computing by the processing unit an estimate of the pose of the articulated object model from the reference poses of the selected reference silhouettes.
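The selection and averaging steps above can be sketched in a few lines. This is a hypothetical reading, not the patent's implementation: the silhouette representation (flat binary tuples), the weighted combination of the two errors, and all function names (`match_error`, `coherence_error`, `estimate_pose`) are illustrative assumptions.

```python
# Hedged sketch of claim 1's selection step: silhouettes are flat binary
# tuples, poses are lists of 3D joint positions. All names and the error
# weighting are assumptions for illustration.

def match_error(silhouette, segment):
    # Fraction of pixels on which the reference silhouette and the
    # extracted source image segment disagree.
    return sum(a != b for a, b in zip(silhouette, segment)) / len(segment)

def coherence_error(ref_pose, prev_pose):
    # Mean distance between the reference pose's joints and the pose
    # estimated from a preceding (or following) source image.
    return sum(
        sum((a - b) ** 2 for a, b in zip(j1, j2)) ** 0.5
        for j1, j2 in zip(ref_pose, prev_pose)
    ) / len(ref_pose)

def estimate_pose(segment, database, prev_pose, k=2, w=0.5):
    # Select the k best reference silhouettes by a combined error and
    # average their associated reference poses joint by joint.
    best = sorted(
        database,
        key=lambda e: match_error(e["silhouette"], segment)
        + w * coherence_error(e["pose"], prev_pose),
    )[:k]
    return [
        tuple(sum(c) / len(c) for c in zip(*joints))
        for joints in zip(*(e["pose"] for e in best))
    ]
```

Here `w` trades the matching error off against the coherence error; the claim leaves the exact combination open ("and/or"), so a pure matching-error ranking would fit it equally well.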
Abstract
A computer-implemented method for estimating a pose of an articulated object model that is a computer based 3D model of a real world object observed by one or more source cameras, including the steps of: obtaining a source image from a video stream; processing the source image to extract a source image segment; maintaining, in a database, a set of reference silhouettes, each being associated with an articulated object model and a corresponding reference pose; comparing the source image segment to the reference silhouettes and selecting reference silhouettes by taking into account, for each reference silhouette, a matching error that indicates how closely the reference silhouette matches the source image segment; retrieving the corresponding reference poses of the articulated object models; and computing an estimate of the pose of the articulated object model from the reference poses of the selected reference silhouettes.
15 Claims
1. A computer-implemented method for estimating a pose of an articulated object model, wherein the articulated object model is a computer based 3D model that is executed on a digital computer or a computer system comprising a computer memory and a processing unit coupled to the computer memory, wherein the computer based 3D model is of a real world object observed by one or more source cameras, and the articulated object model represents a plurality of joints and of links that link the joints, and wherein the pose of the articulated object model is defined by the spatial location of the joints, the method comprising the steps of:
obtaining at least one source image from a video stream comprising a view of the real world object recorded by a source camera;
processing by the processing unit the at least one source image to extract a corresponding source image segment comprising the view of the real world object separated from the image background;
maintaining, in a database in computer readable form, a set of reference silhouettes, each reference silhouette being associated with an articulated object model and with a particular reference pose of this articulated object model;
comparing by the processing unit the at least one source image segment to the reference silhouettes and selecting a predetermined number of reference silhouettes by taking into account, for each reference silhouette, a matching error and/or a coherence error, wherein the processing unit determines the matching error, which indicates how closely the reference silhouette matches the source image segment, and/or determines the coherence error, which indicates how consistent the reference pose is with the pose of the same real world object as estimated from at least one of the preceding and following source images of the video stream;
retrieving by the processing unit the reference poses of the articulated object models associated with the selected reference silhouettes; and
computing by the processing unit an estimate of the pose of the articulated object model from the reference poses of the selected reference silhouettes.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. A computer-implemented method for estimating a pose of an articulated object model, wherein the articulated object model is a computer based 3D model that is executed on a digital computer or a computer system comprising a computer memory and a processing unit coupled to the computer memory, wherein the computer based 3D model is of a real world object observed by two or more source cameras, and the articulated object model represents a plurality of joints and of links that link the joints, and wherein the pose of the articulated object model is defined by the spatial location of the joints, called 3D joint positions, the method comprising the steps of:
determining by the processing unit an initial estimate of the 3D pose, that is, the 3D joint positions of the articulated object model;
associating by the processing unit each link with one or more projection surfaces, wherein the projection surfaces are surfaces defined in the 3D model, and the position and orientation of each projection surface is determined by the position and orientation of the associated link;
iteratively adapting the 3D joint positions by the processing unit, for each joint, wherein the processing unit determines a position score assigned to its 3D joint position, the position score being a measure of the degree to which image segments from the different source cameras, when projected onto the projection surfaces of links adjacent to the joint, are consistent with each other, and wherein the processing unit varies the 3D joint position of the joint until an optimal position score is achieved; and
repeating the step of iteratively adapting the 3D joint positions for all joints for a predetermined number of times or until the position scores converge.
- View Dependent Claims (10, 11, 12, 13)
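One way to picture the iterative adaptation in claim 9 is as a coordinate-descent loop over the joints. In this hedged sketch, `score(i, pos, joints)` is a stand-in for the patent's multi-camera projection-consistency measure (higher is better); the axis-aligned perturbations, the step size, and the function name `adapt_joints` are assumptions, not the claimed implementation.

```python
# Illustrative coordinate-descent reading of claim 9's iteration. The
# `score` callback abstracts away the projection of image segments onto
# the projection surfaces of links adjacent to the joint.

def adapt_joints(joints, score, step=0.25, rounds=10, tol=1e-6):
    joints = list(joints)
    prev_total = None
    for _ in range(rounds):  # "for a predetermined number of times"
        for i, pos in enumerate(joints):
            best, best_s = pos, score(i, pos, joints)
            for axis in range(3):
                for d in (-step, step):
                    cand = tuple(
                        c + (d if a == axis else 0.0)
                        for a, c in enumerate(pos)
                    )
                    s = score(i, cand, joints)
                    if s > best_s:
                        best, best_s = cand, s
            joints[i] = best  # vary the 3D joint position towards an optimum
        total = sum(score(i, p, joints) for i, p in enumerate(joints))
        if prev_total is not None and abs(total - prev_total) < tol:
            break  # "... or until the position scores converge"
        prev_total = total
    return joints
```

Any local search (gradient-free or otherwise) over the joint positions would satisfy the claim language equally; the convergence test on the summed scores mirrors the claim's alternative stopping criterion.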
14. A computer-implemented method for rendering a virtual image as seen from a virtual camera, given an articulated object model, wherein the articulated object model is a computer based 3D model that is executed on a digital computer or a computer system comprising a computer memory and a processing unit coupled to the computer memory, wherein the computer based 3D model is of a real world object observed by two or more source cameras, and the articulated object model represents a plurality of joints and of links that link the joints, and wherein the pose of the articulated object model is defined by the spatial location of the joints, the method comprising the steps of:
determining by the processing unit an estimate of the 3D pose, that is, the 3D joint positions of the articulated object model;
associating by the processing unit each link with one or more projection surfaces, wherein the projection surfaces are surfaces defined in the 3D model, and the position and orientation of each projection surface is determined by the position and orientation of the associated link, and wherein the projection surfaces, for each link, comprise a fan of billboards, each billboard being associated with a source camera, and each billboard being a planar surface spanned by its associated link and a vector that is normal both to this link and to a line connecting a point of the link to the source camera;
for each source camera, projecting segments of the associated source image onto the associated billboard, creating billboard images; and
for each link, projecting the billboard images into the virtual image and blending the billboard images to form a corresponding part of the virtual image.
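For one link and one source camera, the billboard construction in claim 14 reduces to a cross product: the plane is spanned by the link direction and a vector perpendicular both to the link and to the link-to-camera line. The helper names below (`sub`, `cross`, `norm`, `billboard_axes`) are illustrative, not from the patent.

```python
# Hedged sketch of claim 14's billboard geometry for a single link/camera
# pair, using plain tuples as 3D vectors.

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    n = sum(x * x for x in v) ** 0.5
    return tuple(x / n for x in v)

def billboard_axes(joint_a, joint_b, camera):
    # Unit vectors spanning the billboard of link joint_a -> joint_b
    # as seen from `camera`.
    link = sub(joint_b, joint_a)    # first spanning vector: the link itself
    to_cam = sub(camera, joint_a)   # line from a point of the link to the camera
    return norm(link), norm(cross(link, to_cam))
```

By construction the plane always contains the link and rotates about it to face the camera, which is why each camera in the fan gets its own billboard.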
15. A computer-implemented method for determining a segmentation of a source image segment, the method comprising the steps of:
obtaining at least one source image from a video stream comprising a view of a real world object recorded by a source camera;
processing by a processing unit the at least one source image to extract a corresponding source image segment comprising the view of the real world object separated from the image background;
maintaining, in a database in computer readable form, a set of reference silhouettes, each reference silhouette being associated with a reference segmentation, the reference segmentation defining sub-segments of the reference silhouette, each sub-segment being assigned a unique label;
determining by the processing unit a matching reference silhouette which most closely resembles the source image segment and retrieving the reference segmentation of this reference silhouette;
for each sub-segment, overlaying both a thickened and a thinned version of the sub-segment over the source image segment and labelling the source image pixels which lie within both the thickened and the thinned version with the label of the sub-segment;
labelling all remaining pixels of the source image segment as unconfident pixels;
for each sub-segment, determining by the processing unit a colour model that is representative of the colour of the pixels labelled with the sub-segment's label; and
labelling the unconfident pixels according to the colour models, by assigning each unconfident pixel to the sub-segment whose colour model most closely fits the colour of that pixel.
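The labelling scheme of claim 15 can be sketched on a toy pixel grid. This is a deliberately minimal stand-in: one-pixel dilation/erosion plays the role of the thickened and thinned versions, a mean grey value plays the role of the colour model, and all names are illustrative. It assumes every sub-segment retains at least one confident pixel.

```python
# Hedged toy version of claim 15 on sets of (x, y) pixels with scalar
# "colours". Note erode(p) is always inside dilate(p), so the claimed
# "within both thickened and thinned" intersection equals the thinned set.

def dilate(pixels):
    return {(x + dx, y + dy) for x, y in pixels
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)}

def erode(pixels):
    return {p for p in pixels if dilate({p}) <= pixels}

def label_segment(segment_colours, sub_segments):
    # segment_colours: {pixel: colour} of the source image segment;
    # sub_segments: {label: set of pixels} from the reference segmentation.
    labels = {}
    for label, pix in sub_segments.items():
        confident = (dilate(pix) & erode(pix)) & set(segment_colours)
        for p in confident:
            labels[p] = label
    # Colour model: mean colour of the confidently labelled pixels.
    means = {}
    for label in sub_segments:
        cols = [segment_colours[p] for p in labels if labels[p] == label]
        means[label] = sum(cols) / len(cols)
    # Assign each unconfident pixel to the closest colour model.
    for p in segment_colours:
        if p not in labels:
            labels[p] = min(means,
                            key=lambda l: abs(segment_colours[p] - means[l]))
    return labels
```

A real implementation would use proper morphological operators and per-channel colour statistics (e.g. histograms or Gaussians) rather than a scalar mean, but the confident/unconfident split is the same.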
Specification