Method and apparatus for processing video sequences
Abstract
A method for processing a video sequence having a plurality of frames includes the steps of: extracting features from each of the frames, determining correspondences between the extracted features from two of the frames, estimating motion in the video sequence based on the determined correspondences, generating a background mosaic for the video sequence based on the estimated motion, and performing foreground-background segmentation on each of the frames based on the background mosaic.
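The motion-estimation step is spelled out in more detail in claim 1: a modified random sample consensus (RANSAC) algorithm draws samples from buckets, refits the model on all inliers found so far, switches from least squares to weighted total least squares once enough inliers are available, and keeps the model with the highest data likelihood. The following is a minimal sketch of that idea, not the patented implementation. Everything concrete in it is an assumption made for illustration: the affine motion model, the 4x4 grid of buckets, the Gaussian residual weights, the Gaussian-plus-uniform likelihood, and all function and parameter names (`modified_ransac`, `min_inliers`, `sigma`, and so on).

```python
import numpy as np

def design(src):
    # Homogeneous design matrix [x, y, 1] for an affine model (assumed model).
    return np.hstack([src, np.ones((len(src), 1))])

def fit_ls(src, dst):
    # Ordinary least squares: dst ~= design(src) @ P, with P of shape (3, 2).
    P, *_ = np.linalg.lstsq(design(src), dst, rcond=None)
    return P

def fit_wtls(src, dst, w):
    # Weighted total least squares, one output coordinate at a time:
    # centre the data on weighted centroids, then take the right singular
    # vector belonging to the smallest singular value.
    w = w / w.sum()
    mu_s, mu_d = w @ src, w @ dst
    sw = np.sqrt(w)[:, None]
    P = np.zeros((3, 2))
    for k in range(2):
        M = sw * np.hstack([src - mu_s, (dst[:, k] - mu_d[k])[:, None]])
        v = np.linalg.svd(M, full_matrices=False)[2][-1]
        a = -v[:2] / v[2]  # assumes v[2] != 0 (non-degenerate data)
        P[:2, k] = a
        P[2, k] = mu_d[k] - a @ mu_s
    return P

def residuals(P, src, dst):
    return np.linalg.norm(design(src) @ P - dst, axis=1)

def bucket_sample(src, rng, grid=4, k=3):
    # Split the point cloud into grid x grid buckets and draw one point
    # from each of k distinct occupied buckets (spreads the sample out).
    lo, hi = src.min(0), src.max(0)
    cell = np.minimum(((src - lo) / (hi - lo + 1e-9) * grid).astype(int), grid - 1)
    ids = cell[:, 0] * grid + cell[:, 1]
    occupied = np.unique(ids)
    chosen = rng.choice(occupied, size=min(k, len(occupied)), replace=False)
    return np.array([rng.choice(np.flatnonzero(ids == b)) for b in chosen])

def modified_ransac(src, dst, trials=50, refits=3, sigma=1.0,
                    min_inliers=20, seed=0):
    rng = np.random.default_rng(seed)
    best_P, best_ll = None, -np.inf
    for _ in range(trials):
        idx = bucket_sample(src, rng)
        P = fit_ls(src[idx], dst[idx])
        for _ in range(refits):
            r = residuals(P, src, dst)
            inl = r < 3 * sigma          # all inliers obtained so far
            if inl.sum() < 3:
                break
            if inl.sum() < min_inliers:  # few inliers: least squares
                P = fit_ls(src[inl], dst[inl])
            else:                        # otherwise: weighted total least squares
                w = np.exp(-0.5 * (r[inl] / sigma) ** 2) + 1e-12
                P = fit_wtls(src[inl], dst[inl], w)
        r = residuals(P, src, dst)
        # Assumed likelihood: 2-D Gaussian inliers mixed with uniform outliers.
        gauss = np.exp(-0.5 * (r / sigma) ** 2) / (2 * np.pi * sigma ** 2)
        ll = np.sum(np.log(0.5 * gauss + 0.5 * 1e-4))
        if ll > best_ll:
            best_P, best_ll = P, ll
    return best_P
```

Here `src` and `dst` are `(N, 2)` arrays of matched feature positions in two frames, and the returned `(3, 2)` matrix maps `[x, y, 1]` to the predicted position in the second frame. Drawing each sample point from a different bucket is one common way to avoid spatially clustered, poorly conditioned samples; the claim does not say how the buckets are defined.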
33 Citations
18 Claims
1. A method for processing a video sequence comprised of a plurality of frames, said method comprising:
extracting a feature from each of said frames;
determining correspondences between said extracted feature from said frames;
determining motion in said video sequence based on said determined correspondences, said determining motion using a modified random sample consensus algorithm that selects samples from buckets, iterates a model estimation multiple times including all inliers obtained so far in each iteration, and finds a model that maximizes a data likelihood, wherein a motion hypothesis is derived with a least squares method when the obtained inliers are determined to be less than a number, and the motion hypothesis is derived with a weighted total least squares method otherwise;
generating a forward warping matrix and a background warping matrix for each of said frames based on said determined motion;
generating a forward warping error and a backward warping error for each of said frames based on said forward warping matrix and said background warping matrix;
generating a foreground/background mask for each of said frames based on said forward warping error and said backward warping error; and
generating a background mosaic by mapping said frames to a common coordinate system; and
extracting foreground information from each of said frames based on said background mosaic.
Dependent claims: 2, 3, 4, 5, 16, 17, 18

6. An apparatus for processing a video sequence comprised of a plurality of frames, said apparatus comprising:
a processor configured to extract a feature from each of said frames,
to determine correspondences between said extracted feature from said frames,
to determine motion in said video sequence based on said determined correspondences, said determination of motion being performed using a modified random sample consensus algorithm that selects samples from buckets, iterates a model estimation multiple times including all inliers obtained so far in each iteration, and finds a model that maximizes a data likelihood, wherein a motion hypothesis is derived with a least squares method when the obtained inliers are determined to be less than a number, and the motion hypothesis is derived with a weighted total least squares method otherwise,
to generate a forward warping matrix and a background warping matrix for each of said frames based on said determined motion,
to generate a forward warping error and a backward warping error for each of said frames based on said forward warping matrix and said background warping matrix,
to generate a foreground/background mask for each of said frames based on said forward warping error and said backward warping error,
and to generate a background mosaic by mapping said frames to a common coordinate system,
and to extract foreground information from each of said frames based on said background mosaic.
Dependent claims: 7, 8, 9, 10

11. An apparatus for processing a video sequence comprised of a plurality of frames, said apparatus comprising circuitry configured to perform:
extracting a feature from each of said frames;
determining correspondences between said extracted features from said frames;
determining motion in said video sequence based on said determined correspondences, said determining using a modified random sample consensus algorithm that selects samples from buckets, iterates a model estimation multiple times including all inliers obtained so far in each iteration, and finds a model that maximizes a data likelihood, wherein a motion hypothesis is derived with a least squares method when the obtained inliers are determined to be less than a number, and the motion hypothesis is derived with a weighted total least squares method otherwise;
generating a forward warping matrix and a background warping matrix for each of said frames based on said determined motion;
generating a forward warping error and a backward warping error for each of said frames based on said forward warping matrix and said background warping matrix;
generating a foreground/background mask for each of said frames based on said forward warping error and said backward warping error; and
generating a background mosaic by mapping said frames to a common coordinate system; and
extracting foreground information from each of said frames based on said background mosaic.
Dependent claims: 12, 13, 14, 15
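The independent claims all derive a foreground/background mask from a forward warping error and a backward warping error. The sketch below illustrates one plausible reading and is not taken from the specification. It assumes that the second matrix (which the claims call the "background warping matrix") aligns the following frame with the current one, that each error is an absolute grey-level difference after nearest-neighbour affine warping, and that a pixel is foreground only when both errors exceed a threshold. The names `warp_affine_nn`, `foreground_mask`, `M_fwd`, `M_bwd` and `thresh` are hypothetical.

```python
import numpy as np

def warp_affine_nn(img, M):
    # Nearest-neighbour warp. M is a 2x3 affine matrix that maps each output
    # pixel (x, y) to the source position it is read from.
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.rint(M[0, 0] * xs + M[0, 1] * ys + M[0, 2]).astype(int)
    sy = np.rint(M[1, 0] * xs + M[1, 1] * ys + M[1, 2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros((h, w), dtype=float)
    out[valid] = img[sy[valid], sx[valid]]
    return out, valid

def foreground_mask(prev, cur, nxt, M_fwd, M_bwd, thresh=20.0):
    # Forward error: previous frame warped onto the current frame.
    warped_prev, valid_prev = warp_affine_nn(prev, M_fwd)
    # Backward error: next frame warped onto the current frame.
    warped_next, valid_next = warp_affine_nn(nxt, M_bwd)
    # Pixels with no valid source are given zero error (treated as background).
    e_fwd = np.where(valid_prev, np.abs(cur - warped_prev), 0.0)
    e_bwd = np.where(valid_next, np.abs(cur - warped_next), 0.0)
    # Assumed combination rule: foreground only if BOTH errors are large,
    # which suppresses the "ghost" an object leaves at its old or new position.
    return np.minimum(e_fwd, e_bwd) > thresh
```

Taking the minimum of the two errors is a design choice of this sketch: an object that has moved produces a large forward error at its previous position but a small backward error there, so requiring both to be large keeps only the object's position in the current frame.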
Specification