Visual tracking framework
Abstract
A computer program product tangibly embodied in a computer-readable storage medium includes instructions that when executed by a processor perform a method. The method includes identifying a frame of a video sequence, transforming a model into an initial guess for how a region appears in the frame, performing an exhaustive search of the frame, and performing a plurality of optimization procedures, wherein at least one additional model parameter is taken into account as each subsequent optimization procedure is initiated. A system includes a computer readable storage medium, a graphical user interface, an input device, a model for texture and shape of the region, the model generated using the video sequence and stored in the computer readable storage medium, and a solver component.
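The search order described in the abstract (an exhaustive search for an initial guess, then optimization passes that each take one more model parameter into account) can be sketched as follows. This is only an illustration under stated assumptions: the parameter layout [x, y, rotation, scale], the coordinate-search optimizer, and the names exhaustive_search, refine, and multi_pass_search are all hypothetical and are not taken from the patent, which does not name an optimizer here.

```python
from typing import Callable, List, Sequence, Tuple

Cost = Callable[[Sequence[float]], float]


def exhaustive_search(cost: Cost, candidates: Sequence[Sequence[float]]) -> List[float]:
    """Evaluate every candidate parameter vector and keep the cheapest one."""
    return list(min(candidates, key=cost))


def refine(cost: Cost, params: Sequence[float], active: Sequence[int],
           step: float = 1.0, min_step: float = 1e-3) -> Tuple[List[float], float]:
    """Coordinate search that only adjusts the parameters listed in `active`.

    Assumption: a simple derivative-free optimizer stands in for whatever
    optimization procedure a real tracker would use.
    """
    current = list(params)
    best = cost(current)
    while step >= min_step:
        improved = False
        for i in active:
            for delta in (step, -step):
                trial = list(current)
                trial[i] += delta
                trial_cost = cost(trial)
                if trial_cost < best:
                    current, best, improved = trial, trial_cost, True
        if not improved:
            step /= 2.0  # no move helped at this scale, so search more finely
    return current, best


def multi_pass_search(cost: Cost, candidates: Sequence[Sequence[float]],
                      first_active: int = 2) -> Tuple[List[float], float]:
    """Run successive passes, freeing one more parameter in each pass."""
    params = exhaustive_search(cost, candidates)
    best = cost(params)
    for n_active in range(first_active, len(params) + 1):
        params, best = refine(cost, params, range(n_active))
    return params, best
```

For example, with a hypothetical cost that measures squared distance to a known target, the first pass adjusts only position and later passes add rotation and then scale:

```python
target = [3.0, -2.0, 0.5, 1.25]
cost = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
grid = [[x, y, 0.0, 1.0] for x in range(-4, 5, 2) for y in range(-4, 5, 2)]
params, final_cost = multi_pass_search(cost, grid)
```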
25 Claims
1. A method of tracking a region in a sequence of video images, the method comprising:
identifying a frame of a video sequence on which to perform region tracking for a region defined using markers in another frame of the video sequence, wherein a model for the region has been generated using the video sequence, the model being based on user-selected positions for the markers in multiple reference frames of the video sequence, and wherein the model includes an average image and multiple component images, the average image representing an average appearance of the region in the reference frames, and each of the component images representing differences between the reference frames and the average image;
tracking the region in the frame of the video sequence using a multi-pass search that adapts the model to the frame, wherein the multi-pass search associates the model with a first location in the frame and performs a plurality of optimization procedures in an iterative manner seeking to improve a match between a model transformation and the frame to generate a revised model associated with a second location in the frame; and
recording the revised model transformation and the second location as an outcome of tracking the region in the frame.
Dependent claims: 2, 3
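The model structure recited in claim 1 (an average image plus component images that capture how the reference frames differ from that average) can be illustrated with a small sketch. It assumes each reference image is already cropped to the region and flattened into a list of gray levels of equal length. The claim does not say how the component images are derived, so the sketch simply uses the raw per-reference differences; that choice, and the names build_model and synthesize, are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[float]


def build_model(reference_images: Sequence[Sequence[float]]) -> Tuple[Image, List[Image]]:
    """Return (average image, component images) for flattened reference images."""
    count = len(reference_images)
    size = len(reference_images[0])
    # Average appearance of the region over all reference frames.
    average = [sum(img[p] for img in reference_images) / count for p in range(size)]
    # Assumption: one component per reference, holding its difference from the average.
    components = [[img[p] - average[p] for p in range(size)] for img in reference_images]
    return average, components


def synthesize(average: Sequence[float], components: Sequence[Sequence[float]],
               weights: Sequence[float]) -> Image:
    """Reconstruct an appearance as the average plus a weighted sum of components."""
    return [
        average[p] + sum(w * comp[p] for w, comp in zip(weights, components))
        for p in range(len(average))
    ]
```

With a weight of 1 on a single component and 0 on the rest, synthesize reproduces the corresponding reference image; intermediate weights give appearances between the references, which is the kind of variation a tracker could fit to a new frame.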
4. A method of tracking a region in a sequence of video frames, the method comprising:
identifying a frame of a video sequence on which to perform region tracking for a region defined using markers in another frame of the video sequence, wherein a model for the region has been generated using multiple reference frames of the video sequence selected by a user along with morph images between the reference frames, each morph image representing an intermediate appearance of the region generated at a location among the reference frames selected by:
(i) identifying a total number of the morph images to be generated;
(ii) computing, for every pair of the reference images, a distance value representing how much the pair of reference images differ with respect to shape change and gray level change;
(iii) adding the distance values to form a total distance value;
(iv) dividing the total distance value by the total number of the morph images to obtain a per-morph-image distance value; and
(v) distributing the morph images among the reference frames based on the distance values and the per-morph-image distance value;
tracking the region in the frame of the video sequence using a multi-pass search that adapts the model to the frame, wherein the multi-pass search associates the model with a first location in the frame and performs a plurality of optimization procedures in an iterative manner seeking to improve a match between a model transformation and the frame to generate a revised model associated with a second location in the frame; and
recording the revised model transformation and the second location as an outcome of tracking the region in the frame.
Dependent claims: 5
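Steps (i) through (v) of claim 4 amount to a budget allocation: pairs of reference images that differ more receive more morph images. A minimal sketch follows. Several details are assumptions rather than claim language: the distance is taken as a weighted sum of mean marker displacement and mean absolute gray-level difference, distances are supplied for consecutive reference pairs, and fractional shares are resolved with a largest-remainder rule so the counts add up to the requested total. The names pair_distance and distribute_morphs are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Reference = Tuple[Sequence[Point], Sequence[float]]  # (marker positions, gray levels)


def pair_distance(ref_a: Reference, ref_b: Reference,
                  shape_weight: float = 1.0, gray_weight: float = 1.0) -> float:
    """Step (ii): combine shape change and gray-level change into one value."""
    markers_a, pixels_a = ref_a
    markers_b, pixels_b = ref_b
    shape = sum(math.dist(p, q) for p, q in zip(markers_a, markers_b)) / len(markers_a)
    gray = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b)) / len(pixels_a)
    return shape_weight * shape + gray_weight * gray


def distribute_morphs(distances: Sequence[float], total_morphs: int) -> List[int]:
    """Steps (iii) to (v): allocate morph images in proportion to pair distances."""
    total_distance = sum(distances)                       # step (iii)
    if total_distance == 0 or total_morphs == 0:
        return [0] * len(distances)                       # nothing to distribute
    per_morph = total_distance / total_morphs             # step (iv)
    shares = [d / per_morph for d in distances]           # ideal fractional counts
    counts = [int(s) for s in shares]                     # whole morphs per pair
    leftover = total_morphs - sum(counts)
    # Assumption: hand the remaining morphs to the largest fractional parts.
    by_fraction = sorted(range(len(shares)),
                         key=lambda i: shares[i] - counts[i], reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts
```

For instance, with pair distances of 1.0 and 3.0 and eight morph images requested, the per-morph-image distance is 0.5, so the first pair receives two morph images and the second receives six.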
6. A method of tracking an image feature in a sequence of video frames, the method comprising:
receiving a video sequence comprising a plurality of video frames;
selecting a plurality of reference frames from the plurality of video frames, the plurality of reference frames reflecting the image feature to be tracked under varying conditions;
placing one or more markers on each reference frame to identify a plurality of reference images representing the image feature under the varying conditions;
generating a plurality of morph images from the plurality of reference images, each morph image representing an intermediate appearance of the image feature between consecutive reference images;
generating a model for texture and shape of the image feature from the reference images and the morph images;
tracking the feature in a frame of the video sequence using a multi-pass search that adapts the model to the frame.
Dependent claims: 7, 8, 9, 10, 11, 12, 13
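The morph images in claim 6 represent intermediate appearances between consecutive reference images. One simple way to picture this is linear interpolation of both marker positions (shape) and gray levels (texture). The sketch below assumes the two reference images share the same markers and are already aligned and flattened to the same length; it blends gray levels directly and does not warp texture to the interpolated shape, which a fuller morph would need to do. The names morph and morphs_between are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Reference = Tuple[Sequence[Point], Sequence[float]]  # (marker positions, gray levels)


def morph(ref_a: Reference, ref_b: Reference, t: float) -> Tuple[List[Point], List[float]]:
    """Blend two references; t=0 gives ref_a and t=1 gives ref_b."""
    markers_a, pixels_a = ref_a
    markers_b, pixels_b = ref_b
    markers = [
        ((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
        for (ax, ay), (bx, by) in zip(markers_a, markers_b)
    ]
    pixels = [(1 - t) * a + t * b for a, b in zip(pixels_a, pixels_b)]
    return markers, pixels


def morphs_between(ref_a: Reference, ref_b: Reference,
                   count: int) -> List[Tuple[List[Point], List[float]]]:
    """Generate `count` evenly spaced intermediates, excluding the endpoints."""
    return [morph(ref_a, ref_b, k / (count + 1)) for k in range(1, count + 1)]
```

The reference images together with these intermediates would then form the training set from which a texture and shape model is generated.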
14. A computer program product embodied in a non-transitory computer-readable storage medium and comprising instructions that, when executed by a processor, perform a method comprising:
identifying a frame of a video sequence on which to perform region tracking for a region defined using markers in another frame of the video sequence, wherein a model for the region has been generated using the video sequence, the model being based on user-selected positions for the markers in multiple reference frames of the video sequence, and wherein the model includes an average image and multiple component images, the average image representing an average appearance of the region in the reference frames, and each of the component images representing differences between the reference frames and the average image;
tracking the region in the frame of the video sequence using a multi-pass search that adapts the model to the frame, wherein the multi-pass search associates the model with a first location in the frame and performs a plurality of optimization procedures in an iterative manner seeking to improve a match between a model transformation and the frame to generate a revised model associated with a second location in the frame; and
recording the revised model transformation and the second location as an outcome of tracking the region in the frame.
Dependent claims: 15
16. A computer program product embodied in a non-transitory computer-readable storage medium and comprising instructions that, when executed by a processor, perform a method comprising:
identifying a frame of a video sequence on which to perform region tracking for a region defined using markers in another frame of the video sequence, wherein a model for the region has been generated using multiple reference frames of the video sequence selected by a user along with morph images between the reference frames, each morph image representing an intermediate appearance of the region generated at a location among the reference frames selected by:
(i) identifying a total number of the morph images to be generated;
(ii) computing, for every pair of the reference images, a distance value representing how much the pair of reference images differ with respect to shape change and gray level change;
(iii) adding the distance values to form a total distance value;
(iv) dividing the total distance value by the total number of the morph images to obtain a per-morph-image distance value; and
(v) distributing the morph images among the reference frames based on the distance values and the per-morph-image distance value;
tracking the region in the frame of the video sequence using a multi-pass search that adapts the model to the frame, wherein the multi-pass search associates the model with a first location in the frame and performs a plurality of optimization procedures in an iterative manner seeking to improve a match between a model transformation and the frame to generate a revised model associated with a second location in the frame; and
recording the revised model transformation and the second location as an outcome of tracking the region in the frame.
Dependent claims: 17
18. A computer program product embodied in a non-transitory computer-readable storage medium and comprising instructions that, when executed by a processor, perform a method comprising:
receiving a video sequence comprising a plurality of video frames;
selecting a plurality of reference frames from the plurality of video frames, the plurality of reference frames reflecting the image feature to be tracked under varying conditions;
placing one or more markers on each reference frame to identify a plurality of reference images representing the image feature under the varying conditions;
generating a plurality of morph images from the plurality of reference images, each morph image representing an intermediate appearance of the image feature between consecutive reference images;
generating a model for texture and shape of the image feature from the reference images and the morph images;
tracking the feature in a frame of the video sequence using a multi-pass search that adapts the model to the frame.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
Specification