Method and apparatus for depth modelling and providing depth information of moving objects
Abstract
A method and apparatus for indirect quantitative assessment or determination and modelling of the depth dimension of moving objects in signal streams where the depth information is not directly available, but where occlusion information can be derived. To this end, the method for estimating depth in an image sequence consisting of at least two frames comprises the steps of (1) selecting and characterizing recognizable points, (2) examining for each point in each frame whether it is visible or occluded, collecting this occlusion data in an occlusion list such that each frame corresponds to one row in the list and each point corresponds to a column in the list, such that elements in the list corresponding to visible points are given large values and elements in the list corresponding to occluded points are given small values, (3) performing a Principal Component Analysis on the occlusion list, resulting in column vectors called score vectors and row vectors called loading vectors, the collection of one score vector with a value for each frame and one loading vector with a value for each point being called a factor, and (4) outputting the numerical value of each element of the loading vector of the first factor as depth information on the corresponding point, where a large numerical value indicates a point close to the camera or observer and a small numerical value indicates a point farther away.
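Steps (2) to (4) of the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the occlusion values (1.0 for visible, 0.0 for occluded), the data, and the function name `first_factor` are hypothetical, and the first factor is extracted by plain power iteration without mean-centring, an assumption made here so that the loadings of non-negative data share one sign.

```python
import math

def first_factor(occlusion_list, iterations=200):
    """Return (scores, loadings) of the first factor by power iteration.

    Assumption of this sketch: the list is NOT mean-centred, so for
    non-negative data the first loading vector has one common sign.
    """
    n_points = len(occlusion_list[0])
    loadings = [1.0] * n_points  # positive start vector
    for _ in range(iterations):
        # Scores: one value per frame (t = X p).
        scores = [sum(x * p for x, p in zip(row, loadings))
                  for row in occlusion_list]
        # Loadings: one value per point (p = X^T t), scaled to unit length.
        loadings = [sum(row[j] * t for row, t in zip(occlusion_list, scores))
                    for j in range(n_points)]
        norm = math.sqrt(sum(p * p for p in loadings))
        loadings = [p / norm for p in loadings]
    # Final scores consistent with the normalised loadings.
    scores = [sum(x * p for x, p in zip(row, loadings))
              for row in occlusion_list]
    return scores, loadings

# Hypothetical occlusion list: 4 frames (rows) x 3 points (columns).
# 1.0 = visible (large value), 0.0 = occluded (small value).
occlusion_list = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
]

scores, loadings = first_factor(occlusion_list)
for point, depth in enumerate(loadings):
    print(f"point {point}: depth indicator {depth:.3f}")
```

In this toy data, point 0 is never occluded and point 2 is occluded most often, so the first-factor loadings rank point 0 as closest and point 2 as farthest.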
11 Claims
1. A method for estimating depth in an image sequence consisting of at least two frames, wherein recognizable points are followed in the frames, wherein:
occlusion information, based on following recognizable points, is used to produce a bilinear model of depth, wherein the bilinear model comprises a score matrix and a loading matrix, the score matrix comprising column vectors called score vectors, the loading matrix comprising row vectors called loading vectors, the collection of one score vector, with a value for each frame, and one loading vector, with a value for each point, being called a factor, such that depth for each frame can be reconstructed by adding factor contributions from all factors, each factor contribution being the product of the score value corresponding to the frame and factor and the loading vector corresponding to the factor.
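The reconstruction described in claim 1 can be written compactly. The symbols below are introduced here for illustration and do not appear in the claim: $T$ is the score matrix (one row per frame, one column per factor), $P$ is the loading matrix (one row per factor, one column per point), $A$ is the number of factors, and $d_f$ is the row of depth values for frame $f$.

```latex
% Depth for all frames as a sum of factor contributions
D = T P = \sum_{a=1}^{A} t_a \, p_a
% Depth for a single frame f: score value t_{fa} times loading vector p_a, summed over factors
d_f = \sum_{a=1}^{A} t_{fa} \, p_a
```

Here $t_a$ is the $a$-th score vector (a column of $T$) and $p_a$ is the $a$-th loading vector (a row of $P$), so each term $t_a p_a$ is the contribution of one factor.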
2. The method according to claim 1, the method comprising the steps of:
(1) selecting and characterizing the recognizable points, (2) examining for each point in each frame whether it is visible or occluded, collecting this occlusion data in an occlusion list such that each frame corresponds to one row in the list and each point corresponds to a column in the list, such that elements in the list corresponding to visible points are given large values and elements in the list corresponding to occluded points are given small values, (3) performing a bilinear modelling on the occlusion list, and (4) outputting the numerical value of each element of the loading vector of the first factor as depth information on the corresponding point, where a large numerical value indicates a point close to the camera or observer and a small numerical value indicates a point farther away.
3. The method according to claim 1, the method comprising the steps of:
(1) selecting and characterizing recognizable points, (2) examining for each point in each frame whether it is visible or occluded, collecting this occlusion data in an occlusion list such that each observation of occlusion corresponds to one row in the list and each point corresponds to a column in the list, and such that occluded points are given small values, occluding points are given large values, and the remaining points are marked as missing values, (3) performing a bilinear modelling on the occlusion list, using a method capable of handling missing values, and (4) outputting the numerical value of each element of the loading vector of the first factor as depth information on the corresponding point, where a large numerical value indicates a point close to the camera or observer, a small numerical value indicates a point farther away, and a missing value indicates a point that can have any depth.
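The missing-value variant in claim 3 can be sketched as follows. The claim only requires "a method capable of handling missing values"; the alternating least-squares fit of a single factor used here is one possible choice, not necessarily the one intended. The values 2.0 (occluding) and 1.0 (occluded), the data, and the function name are hypothetical, and `None` stands for a missing value.

```python
import math

def first_loading_with_missing(rows, iterations=100):
    """Fit one factor to a list with missing entries (None).

    Alternating least squares over observed entries only. Assumes every
    row holds at least one observed value. Columns that are never
    observed keep a missing loading (None).
    """
    n_cols = len(rows[0])
    observed = [any(r[j] is not None for r in rows) for j in range(n_cols)]
    loadings = [1.0 if observed[j] else None for j in range(n_cols)]
    scores = [0.0] * len(rows)
    for _ in range(iterations):
        # Update each score from the observed entries of its row.
        for i, r in enumerate(rows):
            idx = [j for j in range(n_cols) if r[j] is not None]
            scores[i] = (sum(r[j] * loadings[j] for j in idx)
                         / sum(loadings[j] ** 2 for j in idx))
        # Update each loading from the observed entries of its column.
        for j in range(n_cols):
            if not observed[j]:
                continue
            idx = [i for i, r in enumerate(rows) if r[j] is not None]
            loadings[j] = (sum(rows[i][j] * scores[i] for i in idx)
                           / sum(scores[i] ** 2 for i in idx))
        # Fix the scale: unit length over the non-missing loadings.
        norm = math.sqrt(sum(v * v for v in loadings if v is not None))
        loadings = [v / norm if v is not None else None for v in loadings]
    return loadings

# Hypothetical observations (rows) for points A, B, C, D (columns).
# 2.0 = occluding point, 1.0 = occluded point, None = missing value.
observations = [
    [2.0, 1.0, None, None],   # A occludes B
    [None, 2.0, 1.0, None],   # B occludes C
    [2.0, None, 1.0, None],   # A occludes C
]

loadings = first_loading_with_missing(observations)
print(loadings)
```

Point D never takes part in an occlusion, so its loading stays missing, which matches the claim's reading that such a point can have any depth.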
4. The method according to any one of claims 1 to 3,
wherein the occlusion list also comprises, for each point, one column for each of the coordinate dimensions of the image, the columns containing the coordinates for the point in each frame.
5. The method according to any one of claims 1 to 3,
wherein the image sequence consists of at least three frames, wherein a number of relevant factors is chosen, such that the number is greater than or equal to 1, but smaller than or equal to the number of frames in the sequence, and only the part of the score and loading matrices corresponding to the number of relevant factors is used for producing an estimate of depth.
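Using only the relevant factors, as claim 5 describes, amounts to truncating the sum of factor contributions. The sketch below assumes the score and loading matrices are already available; the matrices, their values, and the function name `reconstruct` are hypothetical.

```python
def reconstruct(scores, loadings, n_factors):
    """Estimate depth from the first n_factors factors only.

    scores:   one row per frame, one column per factor.
    loadings: one row per factor, one column per point.
    """
    if not 1 <= n_factors <= len(loadings):
        raise ValueError("n_factors must be between 1 and the number of factors")
    n_points = len(loadings[0])
    return [
        [sum(frame_scores[a] * loadings[a][p] for a in range(n_factors))
         for p in range(n_points)]
        for frame_scores in scores
    ]

# Hypothetical model: 3 frames, 2 factors, 3 points.
scores = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.0]]
loadings = [[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]]

depth_one_factor = reconstruct(scores, loadings, 1)
depth_two_factors = reconstruct(scores, loadings, 2)
print(depth_one_factor)
print(depth_two_factors)
```

How many factors count as relevant is left to the user of the method; the sketch only shows the effect of the choice on the reconstructed values.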
6. The method according to any one of claims 1 to 3, wherein video objects are used instead of recognizable points.
7. A method for estimating depth according to any one of claims 1 to 3, wherein video objects are in a sequence consisting of at least two frames,
wherein the depth of each video object is found by computing a representative depth value for the object based on the depth values for the points inside the video object.
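Claim 7 does not say how the representative depth value is computed. A minimal sketch, assuming the median of the point depths is an acceptable representative (the mean or another statistic would fit the claim equally well), with hypothetical point names, object names, and depth values:

```python
from statistics import median

def object_depths(point_depths, point_to_object):
    """Return one representative depth per video object.

    point_depths:    point name -> depth value (for example a loading).
    point_to_object: point name -> name of the object containing it.
    The median is an assumption of this sketch, not part of the claim.
    """
    grouped = {}
    for point, depth in point_depths.items():
        grouped.setdefault(point_to_object[point], []).append(depth)
    return {obj: median(values) for obj, values in grouped.items()}

# Hypothetical depth values for five points inside two video objects.
point_depths = {"p0": 0.9, "p1": 0.8, "p2": 0.7, "p3": 0.2, "p4": 0.3}
point_to_object = {"p0": "car", "p1": "car", "p2": "car",
                   "p3": "wall", "p4": "wall"}

depths = object_depths(point_depths, point_to_object)
print(depths)
```

A median is less sensitive than a mean to a few badly tracked points, which is why it is used in this sketch.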
8. A method for estimating depth dependencies in a sequence consisting of at least two frames, the method comprising the steps of:
(1) defining and characterizing recognizable objects, (2) estimating motion for each object in each frame, (3) aggregating occlusion data in an occlusion matrix that has one row for each object and also one column for each object, such that when a first object with number A and a second object with number B have such motion that they overlap in a frame, then object A is reconstructed for the frame, an indicator of difference between reconstruction and original is computed, and the result is accumulated in position (A,B) in the occlusion matrix, then object B is reconstructed for the frame, an indicator of difference between reconstruction and original is computed, and the result is accumulated in position (B,A) in the occlusion matrix, (4) transforming the occlusion matrix into a graph, where each object is transformed into a node, and each non-zero element of the occlusion matrix is transformed into an edge from the node associated with the row of the element to the node associated with the column of the element, with the numerical values from the occlusion matrix as edge strength, (5) detecting and resolving any loops in the graph such that the weakest edges are removed, wherein the remaining edges in the graph represent depth dependencies between the objects in the sequence.
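Steps (4) and (5) of claim 8 can be sketched as follows, assuming the occlusion matrix has already been accumulated. The claim does not fix a loop-detection strategy; this sketch uses a depth-first search and removes the weakest edge of each loop it finds until none remain. The matrix values and function names are hypothetical.

```python
def find_cycle(edges):
    """Return the edges of one directed loop, or None if there is none."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    state = {}   # node -> 1 while on the search path, 2 when finished
    path = []    # nodes on the current depth-first search path

    def visit(node):
        state[node] = 1
        path.append(node)
        for nxt in graph.get(node, []):
            if state.get(nxt) == 1:
                # Back edge: the loop runs from nxt along the path to nxt.
                loop = path[path.index(nxt):] + [nxt]
                return list(zip(loop, loop[1:]))
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = 2
        return None

    for node in list(graph):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None

def depth_dependencies(matrix):
    """Turn an occlusion matrix into a loop-free set of weighted edges."""
    n = len(matrix)
    # Step (4): every non-zero element becomes an edge row -> column.
    edges = {(a, b): matrix[a][b]
             for a in range(n) for b in range(n) if matrix[a][b] != 0}
    # Step (5): remove the weakest edge of each loop until none is left.
    while True:
        loop = find_cycle(edges)
        if loop is None:
            return edges
        weakest = min(loop, key=lambda edge: edges[edge])
        del edges[weakest]

# Hypothetical occlusion matrix for three objects (0, 1, 2).
occlusion_matrix = [
    [0.0, 5.0, 4.0],
    [1.0, 0.0, 3.0],
    [0.0, 0.5, 0.0],
]

dependencies = depth_dependencies(occlusion_matrix)
print(sorted(dependencies))
```

In this toy matrix the weak entries at positions (1,0) and (2,1) each close a loop with a stronger edge and are removed, leaving a consistent ordering of the three objects.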
(1) computing an occlusion matrix according to step (3) of claim 9 for each of the frames in the sequence except said frame, (2) predicting, by interpolation or extrapolation, the individual elements of the occlusion matrix for the wanted frame, (3) computing depth dependences based on the predicted occlusion matrix, using the steps (4)-(5) of claim 9, wherein the remaining edges in the graph represent depth dependencies between the objects in said frame.
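Step (2) above, predicting the elements of the occlusion matrix for the wanted frame, can be sketched with linear interpolation or extrapolation from the two nearest known frames. The claim does not prescribe the order of the interpolation, so linearity is an assumption of this sketch, as are the frame numbers, matrix values, and the function name `predict_matrix`.

```python
def predict_matrix(known, wanted_frame):
    """Predict each occlusion matrix element for wanted_frame.

    known: frame number -> occlusion matrix (list of rows) for the other
    frames of the sequence. The two known frames nearest to wanted_frame
    are used; the result is an interpolation when wanted_frame lies
    between them and an extrapolation otherwise.
    """
    nearest = sorted(known, key=lambda f: abs(f - wanted_frame))[:2]
    f0, f1 = sorted(nearest)
    weight = (wanted_frame - f0) / (f1 - f0)
    return [
        [(1 - weight) * a + weight * b for a, b in zip(row0, row1)]
        for row0, row1 in zip(known[f0], known[f1])
    ]

# Hypothetical per-frame occlusion matrices for two objects.
known = {
    1: [[0.0, 2.0], [1.0, 0.0]],
    3: [[0.0, 4.0], [1.0, 0.0]],
    4: [[0.0, 5.0], [1.0, 0.0]],
}

interpolated = predict_matrix(known, 2)   # frame 2 lies between frames 1 and 3
extrapolated = predict_matrix(known, 5)   # frame 5 lies beyond frame 4
print(interpolated)
print(extrapolated)
```

Linear extrapolation can produce negative element values for frames far from the known ones, so a practical version would probably need to clamp or otherwise constrain the prediction before building the graph.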
11. An apparatus for estimating depth in an image sequence consisting of at least two frames, the apparatus comprising:
(1) a means for selecting and characterizing recognizable points, (2) a means for examining for each point in each frame whether it is visible or occluded, collecting this occlusion data in an occlusion list such that each frame corresponds to one row in the list and each point corresponds to a column in the list, and such that elements in the list corresponding to visible points are given large values and elements in the list corresponding to occluded points are given small values, (3) a means for performing a bilinear modelling on the occlusion list, resulting in a score matrix comprising column vectors called score vectors and a loading matrix comprising row vectors called loading vectors, the collection of one score vector, with a value for each frame, and one loading vector, with a value for each point, being called a factor, such that one row of the occlusion list can be reconstructed or approximated by adding factor contributions from all factors, each factor contribution being the product of the score value corresponding to the frame and factor and the loading vector corresponding to the factor, and (4) a means for outputting the numerical value of each element of the loading vector of the first factor as depth information on the corresponding point, where a large numerical value indicates a point close to the camera or observer and a small numerical value indicates a point farther away.
Specification