Method and apparatus for multiresolution object-oriented motion estimation
Abstract
A method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field is given and the motion fields have one motion vector for each valid pixel or valid block of pixels in the first image.
33 Claims
-
1. A method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) successive low pass filtering and sub sampling of the first image, the first corresponding shape, the second image, the second corresponding shape and the hypothesis motion field, until a given coarsest resolution level is reached, thereby producing multi resolution representations, (2) setting a preliminary motion field on the coarsest resolution level equal to the coarsest hypothesis motion field, (3) estimating a motion field on the coarsest resolution level from the first image to the second image by taking into account the first image, the first shape, the second image, the second shape, the preliminary motion field and the hypothesis motion field, and starting the following steps with the coarsest resolution level, (4) propagating and expanding the estimated motion field of the current coarse resolution level, producing a preliminary motion field for the next finer resolution level by taking into account the estimated motion field and the first shape of the coarse resolution level, the first image, the first shape and the second shape of the finer resolution level, (5) estimating a motion field on the finer resolution level from the first image to the second image producing an estimated motion field for the finer resolution level by taking into account the first image, the first shape, the second image, the second shape, the preliminary motion field and by using the hypothesis motion field, said hypothesis motion field being used to improve the estimated motion field, all on the finer resolution level, (6) identifying the new coarse resolution level with the old finer resolution level and repeat steps (4) and (5) until the finest resolution level is reached.
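The coarse-to-fine loop of steps (1) to (6) can be illustrated with a minimal sketch. It is not the claimed method itself: the shapes are omitted, only one motion component is carried, a 2x2 box filter stands in for the low pass filter, and all function names (`downsample`, `upsample`, `build_pyramid`, `coarse_to_fine`) as well as the pluggable `estimate` callback are assumptions introduced for illustration.

```python
def downsample(field, scale=1.0):
    """2x2 box low pass filter followed by sub sampling by 2 (assumed filter)."""
    h, w = len(field) // 2, len(field[0]) // 2
    return [[scale * (field[2 * y][2 * x] + field[2 * y][2 * x + 1]
                      + field[2 * y + 1][2 * x] + field[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def upsample(field, scale=1.0):
    """Nearest-neighbour expansion by 2; vector lengths are multiplied by scale."""
    return [[scale * field[y // 2][x // 2] for x in range(2 * len(field[0]))]
            for y in range(2 * len(field))]


def build_pyramid(field, levels, scale=1.0):
    """Step (1): index 0 is the finest level, index -1 the coarsest."""
    pyramid = [field]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1], scale))
    return pyramid


def coarse_to_fine(img1, img2, hypothesis, levels, estimate):
    """estimate(img1, img2, preliminary, hypothesis) is a hypothetical callback."""
    p1 = build_pyramid(img1, levels)
    p2 = build_pyramid(img2, levels)
    # Motion vectors shrink by half each time the sampling grid is halved.
    ph = build_pyramid(hypothesis, levels, scale=0.5)
    preliminary = ph[-1]                                      # step (2)
    estimated = estimate(p1[-1], p2[-1], preliminary, ph[-1])  # step (3)
    for level in range(levels - 2, -1, -1):                   # step (6): repeat
        preliminary = upsample(estimated, scale=2.0)          # step (4)
        estimated = estimate(p1[level], p2[level], preliminary, ph[level])  # step (5)
    return estimated
```

With an `estimate` callback that simply returns the preliminary field, a constant hypothesis field is scaled down to the coarsest level and recovered unchanged at the finest level, which is a quick sanity check of the scaling between levels.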
2. The method according to claim 1, wherein step (4) of claim 1 comprises the steps:
(1) up sampling of the coarse resolution motion field, producing the fine resolution motion field taking into account the coarse resolution first shape and the fine resolution first shape, (2) calculating a degree of confidence for each motion vector of the fine resolution motion field taking into account the fine resolution first image, the fine resolution first shape and the fine resolution second shape, (3) replacing each motion vector in the fine resolution motion field with a weighted sum of motion vectors in a neighborhood around the motion vector, the weights being the degree of confidence for each motion vector, normalized by the sum of weights for the neighborhood, or replacing the values of each motion vector in the fine resolution motion field whose confidence is smaller than a given threshold with values extrapolated from the nearest neighbors with confidence larger than or equal to the threshold.
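The first alternative of step (3), a confidence-weighted sum normalized by the sum of weights, might look as follows. This is a sketch under assumptions: one motion component only, a square neighborhood of assumed size `radius`, and a fallback to the original vector when all weights in the neighborhood are zero. The function name is hypothetical.

```python
def confidence_weighted_smooth(motion, conf, radius=1):
    """Replace each value by the confidence-weighted mean of its neighborhood."""
    h, w = len(motion), len(motion[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    num += conf[ny][nx] * motion[ny][nx]
                    den += conf[ny][nx]
            # Assumed fallback: keep the vector if no neighbor has any confidence.
            out[y][x] = num / den if den > 0 else motion[y][x]
    return out
```

A vector with zero confidence contributes nothing to its neighbors and is itself replaced by the mean of the confident vectors around it.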
-
-
3. The method according to claim 2, wherein the degree of confidence depends on the gradient of the fine resolution first image taking into account the fine resolution first shape, a high gradient leading to a small degree of confidence, and/or wherein the degree of confidence is set to low values in areas where borders exist in the fine resolution first shape and not in the fine resolution second shape and vice versa.
-
4. The method according to claim 3, wherein the extension of the areas is correlated to the width of the filter used for sub sampling.
-
5. The method according to claim 2, wherein the degree of confidence is found by measuring how strongly the displaced frame difference depends on a change of the motion field, or wherein the degree of confidence depends on the gradient of the fine resolution motion field, a high gradient leading to a small degree of confidence.
-
6. The method according to claim 1, wherein steps (3) and (5) of claim 1 comprise a method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a preliminary motion field and a hypothesis motion field are given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) estimating a motion field from the first image to the second image by taking into account the first image, the first shape, the second image, the second shape and the preliminary motion field, (2) calculating an improved motion field by using individually for each pixel or block of pixels the hypothesis motion field and the estimated motion field taking into account the first image, the first shape, the second image and the second shape.
-
-
7. The method according to claim 6, further comprising the step of:
(3) filtering the improved motion field using an adaptive filtering technique, whose low pass character varies locally with the degree of confidence which can be obtained by the gradient of the first image.
-
8. The method according to claim 7, wherein in step (3) the vertical filtering depends only on the vertical component of the gradient, and the horizontal filtering depends only on the horizontal component of the gradient, and/or wherein the intensity gradient is calculated and the low pass character of the filter is weaker along the gradient and stronger perpendicular to the gradient.
-
9. The method according to claim 8, the method comprising the steps:
-
(1) calculating a gradient vector field of the first image and taking the absolute values of the components, producing a vertical and a horizontal component field, (2) applying a monotone transformation to the vertical component field in the way that the maximum value is mapped to zero and zero values are mapped to a given maximum filter range, producing a transformed vertical component field, or applying a monotone transformation to the vertical component field in the way that values above the minimum between the maximum value and a given number are mapped to zero and zero values are mapped to a given maximum filter range, producing a transformed vertical component field, (3) treating the horizontal component field analogously to step (2), producing a transformed horizontal component field, (4) applying a filter operation to each of the transformed vertical and horizontal component fields so that each value is decreased as long as the difference to one of its neighbors is bigger than one, thereby producing a vertical and a horizontal strength image for low pass filtering, (5) filtering the motion field according to the vertical and horizontal strength images for low pass filtering.
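Steps (1), (2) and (4) can be sketched for a single image row and the horizontal component. The sketch makes several assumptions not stated in the claim: a forward difference as the gradient, a linear map as the monotone transformation, and a reading of step (4) in which a value is lowered to its neighbor's value plus one whenever it exceeds that neighbor by more than one. The name `filter_strength` is hypothetical.

```python
def filter_strength(row, max_range):
    """Per-pixel low pass strength for one row: small at edges, large in flat areas."""
    n = len(row)
    # Step (1): absolute horizontal gradient (forward difference, assumed).
    grad = [abs(row[min(i + 1, n - 1)] - row[i]) for i in range(n)]
    gmax = max(grad)
    # Step (2): linear monotone map, maximum gradient -> 0, zero gradient -> max_range.
    if gmax == 0:
        strength = [float(max_range)] * n
    else:
        strength = [max_range * (1.0 - g / gmax) for g in grad]
    # Step (4): decrease values until no value exceeds a neighbor by more than one.
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in (i - 1, i + 1):
                if 0 <= j < n and strength[i] - strength[j] > 1:
                    strength[i] = strength[j] + 1
                    changed = True
    return strength
```

For a row with a single step edge, the strength is zero at the edge and ramps up by one per pixel on either side, so the subsequent low pass filtering of the motion field in step (5) would widen gradually with distance from the edge.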
-
-
10. The method according to claim 7, wherein in step (3) the degree of confidence is found by measuring how strongly the displaced frame difference depends on a change of the motion field.
-
11. The method according to claim 7, wherein the gradients of the motion field components are taken into account for calculating the degree of confidence.
-
12. The method according to claim 6, wherein step (1) of claim 6 comprises a method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a preliminary motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) forward warping of the first image and the first shape according to the preliminary motion field, producing predictions for the second image and second shape, (2) estimating motion from the predictions to the second image and the second shape, producing an offset difference motion field, (3) backward warping of the offset difference motion field and of the prediction of the second shape using the preliminary motion field, producing a difference motion field and a corresponding difference motion shape, (4) extrapolating each motion vector of the difference motion field on the first shape not common with the difference motion shape from the nearest neighbors given on the difference motion shape, (5) adding the difference motion field to the preliminary motion field, thereby producing the final motion field.
-
-
13. The method according to claim 6, wherein step (1) of claim 6 comprises a method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a preliminary motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) backward warping of the second image and of the second shape according to the preliminary motion field, producing predictions for the first image and the first shape, (2) estimating motion from the first image and the first shape to the predictions, producing a difference motion field, (3) adding the difference motion field to the preliminary motion field, thereby producing the final motion field.
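The three steps of the backward-warping variant can be sketched in one dimension with integer motion vectors. The shapes are left out, positions are clamped at the image border, and the per-pixel matcher in `estimate_difference` is a deliberately naive stand-in for whatever estimator is actually used. All names are assumptions for illustration.

```python
def backward_warp(img2, motion):
    """Step (1): predict the first image by sampling img2 at x + motion[x]."""
    n = len(img2)
    return [img2[min(max(x + motion[x], 0), n - 1)] for x in range(n)]


def estimate_difference(img1, pred, search=1):
    """Step (2), hypothetical matcher: best offset within +/- search per pixel.

    Ties are resolved in favour of the smaller absolute offset.
    """
    n = len(img1)
    diff = []
    for x in range(n):
        best = min(range(-search, search + 1),
                   key=lambda d: (abs(img1[x] - pred[min(max(x + d, 0), n - 1)]), abs(d)))
        diff.append(best)
    return diff


def refine(img1, img2, preliminary):
    """Step (3): final motion = preliminary motion + difference motion."""
    pred = backward_warp(img2, preliminary)
    diff = estimate_difference(img1, pred)
    return [p + d for p, d in zip(preliminary, diff)]
```

If the true displacement is two pixels and the preliminary field says one, the matcher only has to find the remaining one-pixel offset, which is the point of warping before estimating.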
-
-
14. The method according to claim 6, wherein step (2) of claim 6 comprises a method for estimating a motion field from a first image and a first shape to a second image and a second shape, wherein a first and a second preliminary motion field are given, the method combining the preliminary motion fields to produce an improved motion field, the method comprising the steps:
-
(1) forward warping of the first image and the first shape using the first preliminary motion field, producing first predictions of the second image and the second shape, (2) calculating a first residual as the difference, for each pixel or block of pixels, between the second image and the first prediction of the second image taking into account the second shape and the first prediction of the second shape, associating the difference with each pixel or block of pixels in the first image by warping the difference back using the first preliminary motion field, (3) forward warping of the first image and the first shape using the second preliminary motion field, producing second predictions of the second image and the second shape, (4) calculating a second residual as the difference, for each pixel or block of pixels, between the second image and the second prediction of the second image taking into account the second shape and the second prediction of the second shape, associating the difference with each pixel or block of pixels in the first image by warping the difference back using the second preliminary motion field, (5) computing a choice field having one choice value for each pixel or block of pixels in the first image by comparing the corresponding pixel or block of pixels of the first and second residual, the choice value indicating which of the two residuals is smaller, (6) composing a final motion field, taking motion vectors from the first motion field or second motion field based on the choice field.
-
-
15. The method according to claim 6, wherein step (2) of claim 6 comprises a method for estimating a motion field from a first image and a first shape to a second image and a second shape, wherein a first and a second preliminary motion field are given, the method combining the preliminary motion fields to produce an improved motion field, the method comprising the steps:
-
(1) backward warping of the second image and the second shape using the first preliminary motion field, producing first predictions of the first image and the first shape, (2) calculating a first residual as the difference, for each pixel or block of pixels, between the first image and the first prediction of the first image taking into account the first shape and the first prediction of the first shape, (3) backward warping of the second image and the second shape using the second preliminary motion field, producing second predictions of the first image and the first shape, (4) calculating a second residual as the difference, for each pixel or block of pixels, between the first image and the second prediction of the first image taking into account the first shape and the second prediction of the first shape, (5) computing a choice field having one choice value for each pixel or block of pixels in the first image by comparing the corresponding pixel or block of pixels of the first and second residual, the choice value indicating which of the two residuals is smaller, (6) composing a final motion field, taking motion vectors from the first motion field or second motion field based on the choice field.
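A one-dimensional sketch of steps (1) to (6): each candidate motion field is used to predict the first image from the second, the absolute prediction errors serve as residuals, and the choice field records which residual is smaller at each pixel. Shapes are omitted, motion vectors are integers, border positions are clamped, and ties go to the first field; these are assumptions, as is the function name.

```python
def choose_motion(img1, img2, field_a, field_b):
    """Return (choice field, composed motion field) for two candidate fields."""
    n = len(img1)

    def residual(field):
        # Backward warp img2 with the field and compare with img1.
        return [abs(img1[x] - img2[min(max(x + field[x], 0), n - 1)])
                for x in range(n)]

    res_a, res_b = residual(field_a), residual(field_b)
    # Choice value 0 selects field_a, 1 selects field_b (ties go to field_a).
    choice = [0 if a <= b else 1 for a, b in zip(res_a, res_b)]
    final = [field_a[x] if choice[x] == 0 else field_b[x] for x in range(n)]
    return choice, final
```

The composed field takes each vector from whichever candidate predicts that pixel better, so two fields that are each correct in different regions can be merged into one that is correct in both.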
-
-
16. The method according to claim 14, the method comprising the additional step:
(5b) median filtering the choice field.
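Median filtering of a choice field can be sketched in one dimension as below. The window size (`radius`) and the border handling by clamping are assumptions; the claim does not specify either.

```python
def median_filter_1d(choice, radius=1):
    """Replace each choice value by the median of its (2 * radius + 1) window."""
    n = len(choice)
    out = []
    for x in range(n):
        window = sorted(choice[min(max(i, 0), n - 1)]
                        for i in range(x - radius, x + radius + 1))
        out.append(window[len(window) // 2])
    return out
```

Isolated choice values that disagree with both neighbors are removed, while a genuine boundary between two regions of the choice field is preserved.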
-
17. The method according to claim 14, wherein more than two preliminary motion fields are given, steps (1) and (2), respectively (3) and (4), are repeated for each preliminary motion field, and step (5) is extended to more than two residuals.
-
18. The method according to claim 14, the method comprising the additional step:
(5c) replacing every value in the choice field, with a new value which minimizes a cost function.
-
19. The method according to claim 18, wherein the cost function is given by a weighted sum of the residual values and the corresponding roughness values of the choice field.
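One possible reading of such a cost function, sketched in one dimension: for each pixel, the cost of a candidate choice value is its residual plus a weight times a roughness term, here taken to be the number of immediate neighbors with a different choice value. The roughness definition, the single sequential pass, and the name `relax_choice` are assumptions; the claim only states that the cost is a weighted sum of residual and roughness values.

```python
def relax_choice(residuals, choice, weight):
    """One pass that replaces each choice value by the cost-minimizing one.

    residuals[c][x] is the residual of candidate motion field c at pixel x.
    """
    n = len(choice)
    new = list(choice)
    for x in range(n):
        def cost(c):
            # Roughness: neighbors whose current choice differs from c.
            rough = sum(1 for j in (x - 1, x + 1) if 0 <= j < n and new[j] != c)
            return residuals[c][x] + weight * rough
        new[x] = min(range(len(residuals)), key=cost)
    return new
```

With `weight` set to zero the result reduces to picking the smallest residual per pixel; a larger weight trades a slightly worse residual for a smoother choice field.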
-
20. The method according to claim 14, wherein the residuals are filtered using a low-pass filter prior to step (5) of claim 15.
-
21. The method according to claim 14, wherein the residuals are given relative to how noticeable they are for the human visual system under consideration of masking effects.
-
22. A method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) forward warping of the first image and the first shape according to the hypothesis motion field, producing predictions for the second image and second shape, (2) estimating motion from the predictions to the second image and the second shape using a method according to claim 1, producing an offset difference motion field, (3) backward warping of the offset difference motion field and of the prediction of the second shape using the hypothesis motion field, producing a difference motion field and a corresponding difference motion shape, (4) extrapolating each motion vector of the difference motion field on the first shape not common with the difference motion shape from the nearest neighbors given on the difference motion shape, (5) adding the difference motion field to the hypothesis motion field, thereby producing the final motion field.
-
-
23. A method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) backward warping of the second image and of the second shape according to the hypothesis motion field, producing predictions for the first image and first shape, (2) estimating motion from the first image and the first shape to the predictions using a method according to claim 1, producing a difference motion field, (3) adding the difference motion field to the hypothesis motion field, thereby producing the final motion field.
-
-
24. The method according to claim 1, wherein the final motion field is replaced by that one of the given motion fields which leads to the best prediction.
-
25. The method according to claim 24, wherein in step (1) the hypothesis motion field is set to the motion field of the preceding estimation, or wherein in step (1) the hypothesis motion field is set to the sum of the motion field of the preceding estimation and the preceding change of motion, or wherein in step (1) the hypothesis motion field is set to the sum of the motion field of the preceding estimation and the weighted preceding change of motion.
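The three alternatives differ only in how much of the preceding change of motion is added. A sketch for one motion component, assuming the preceding change is the difference between the last two estimated fields and a single scalar `weight` applies to all pixels (both assumptions; the function name is hypothetical):

```python
def predict_hypothesis(prev, prev_prev, weight=1.0):
    """Hypothesis = preceding field + weight * (preceding change of motion).

    weight = 0 gives the first alternative, weight = 1 the second,
    any other value the weighted third alternative.
    """
    return [m + weight * (m - mp) for m, mp in zip(prev, prev_prev)]
```

For example, with a preceding field of `[2.0, 4.0]` and the one before it `[1.0, 3.0]`, a weight of 1 extrapolates the motion linearly, while a weight of 0 simply reuses the preceding field.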
-
26. A method for estimating motion within a sequence of related images with corresponding shapes, wherein motion estimation is performed from a first image to subsequent target images, the method comprising the steps:
-
(1) calculating a hypothesis motion field from the former estimated motion fields, (2) estimating the final motion field from the first image to the current target image using a method according to claim 1, with the hypothesis motion field of step (1).
-
-
27. A method for estimating motion within a sequence of related images with corresponding shapes, wherein motion estimation is performed from a subsequent set of images to a target image, the method comprising the steps:
-
(1) calculating a scaled motion field by scaling the motion field of the preceding estimation with respect to the position of the images in the sequence, (2) calculating a temporal motion field as the difference between the motion field of the preceding estimation and the scaled motion field, (3) forward warping of the scaled motion field and the shape of the preceding image using the temporal motion field, thereby producing a hypothesis motion field and a hypothesis shape, (4) extrapolating each motion vector of the hypothesis motion field on the shape of the current image not common with the hypothesis shape from the nearest neighbors given on the hypothesis shape, (5) estimating the final motion field from the current image to the target image using a method according to claim 1, with the hypothesis motion field.
-
-
28. A method for estimating motion within a sequence of related images with corresponding shapes, wherein motion estimation is performed from a subsequent set of images to a target image, the method comprising the steps:
-
(1) backward warping of the target image and the target shape with the motion field of the preceding estimation, producing temporal predictions for the current image and shape, (2) estimating motion from the current image and shape to the temporal predictions by a method according to claim 1, producing a difference motion field, (3) backward warping of the motion field of the preceding estimation and the corresponding shape with the difference motion field, producing a temporal motion field and a temporal shape, (4) extrapolating each motion vector of the temporal motion field on the current shape not common with the temporal shape from the nearest neighbors given on the temporal shape, (5) adding the difference motion field to the temporal motion field, thereby producing the final motion field.
-
-
29. A method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field may be given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) estimating a temporal motion field by a method according to claim 1, (2) forward warping of the first image and the first shape according to the temporal motion field, producing predictions for the second image and second shape, (3) estimating motion from the predictions to the second image and the second shape, producing an offset difference motion field, (4) backward warping of the offset difference motion field and the prediction of the second shape using the temporal motion field, producing a difference motion field and a corresponding difference motion shape, (5) extrapolating each motion vector of the difference motion field on the first shape not common with the difference motion shape from the nearest neighbors given on the difference motion shape, (6) adding the difference motion field to the temporal motion field, thereby producing the final motion field.
-
-
30. A method for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field may be given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the method comprising the steps:
-
(1) estimating a temporal motion field by a method according to claim 1, (2) backward warping of the second image and of the second shape according to the temporal motion field, producing predictions for the first image and first shape, (3) estimating motion from the first image and the first shape to the predictions, producing a difference motion field, (4) adding the difference motion field to the temporal motion field, thereby producing the final motion field.
-
-
31. The method according to claim 1, wherein some methods or steps are applied in an iterative manner controlled by a control module.
-
32. An apparatus for estimating a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the apparatus comprising:
-
(1) means for successive low pass filtering and sub sampling of the first image, the first corresponding shape, the second image, the second corresponding shape and the hypothesis motion field, until a given coarsest resolution level is reached, thereby producing multi resolution representations, (2) means for setting a preliminary motion field on the coarsest resolution level equal to the coarsest hypothesis motion field, (3) means for estimating a motion field on the coarsest resolution level from the first image to the second image by taking into account the first image, the first shape, the second image, the second shape, the preliminary motion field and the hypothesis motion field, and starting the following steps with the coarsest resolution level, (4) means for propagating and expanding the estimated motion field of the current coarse resolution level, producing a preliminary motion field for the next finer resolution level by taking into account the estimated motion field and the first shape of the coarse resolution level, the first image, the first shape and the second shape of the finer resolution level, (5) means for estimating a motion field on the finer resolution level from the first image to the second image producing an estimated motion field for the finer resolution level by taking into account the first image, the first shape, the second image, the second shape, the preliminary motion field and by using the hypothesis motion field, said hypothesis motion field being used to improve the estimated motion field, all on the finer resolution level, (6) means for identifying the new coarse resolution level with the old finer resolution level and repeatedly applying said propagating means (4) and said estimating means (5) until the finest resolution level is reached.
-
-
33. A computer program product comprising:
-
a computer-usable medium having computer-readable program code means embodied therein for causing said computer to estimate a motion field from a first image with a corresponding first shape to a second image with a corresponding second shape, wherein a hypothesis motion field is given, the motion fields having one motion vector for each valid pixel or valid block of pixels in the first image, the computer program product comprising:
(1) computer-readable program code means for causing a computer to successively low pass filter and sub sample the first image, the first corresponding shape, the second image, the second corresponding shape and the hypothesis motion field, until a given coarsest resolution level is reached, thereby producing multi resolution representations, (2) computer-readable program code means for causing a computer to set a preliminary motion field on the coarsest resolution level equal to the coarsest hypothesis motion field, (3) computer-readable program code means for causing a computer to estimate a motion field on the coarsest resolution level from the first image to the second image by taking into account the first image, the first shape, the second image, the second shape, the preliminary motion field and the hypothesis motion field, and starting the following steps with the coarsest resolution level, (4) computer-readable program code means for causing a computer to propagate and expand the estimated motion field of the current coarse resolution level, producing a preliminary motion field for the next finer resolution level by taking into account the estimated motion field and the first shape of the coarse resolution level, the first image, the first shape and the second shape of the finer resolution level, (5) computer-readable program code means for causing a computer to estimate a motion field on the finer resolution level from the first image to the second image producing an estimated motion field for the finer resolution level by taking into account the first image, the first shape, the second image, the second shape, the preliminary motion field and by using the hypothesis motion field, said hypothesis motion field being used to improve the estimated motion field, all on the finer resolution level, (6) computer-readable program code means for causing a computer to identify the new coarse resolution level with the old finer resolution level and repeat steps (4) and (5) until the finest resolution level is reached.
-
Specification