Method and apparatus for estimating scene structure and ego-motion from multiple images of a scene using correlation

US 6,307,959 B1
Filed: 07/13/2000
Issued: 10/23/2001
Est. Priority Date: 07/14/1999
Status: Active Grant

Abstract

A system that estimates both the ego-motion of a camera through a scene and the structure of the scene by analyzing a batch of images of the scene obtained by the camera employs a correlation-based, iterative, multi-resolution algorithm. The system defines a global ego-motion constraint to refine estimates of inter-frame camera rotation and translation. It also uses local window-based correlation to refine the current estimate of scene structure. The batch of images is divided into a reference image and a group of inspection images. Each inspection image in the batch of images is aligned to the reference image by a warping transformation. The correlation is determined by analyzing respective Gaussian/Laplacian decompositions of the reference image and warped inspection images. The ego-motion constraint includes both rotation and translation parameters. These parameters are determined by globally correlating surfaces in the respective inspection images to the reference image. Scene structure is determined on a pixel-by-pixel basis by correlating multiple pixels in a support region among all of the images. The correlation surfaces are modeled as quadratic or other parametric surfaces to allow easy recognition and rejection of outliers and to simplify computation of incremental refinements for ego-motion and structure. The system can employ information from other sensors to provide an initial estimate of ego-motion and/or scene structure. The system operates using images captured by either single-camera rigs or multiple-camera rigs.
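The multi-resolution decomposition mentioned in the abstract can be illustrated with a minimal sketch. It builds a Gaussian/Laplacian pyramid for a one-dimensional signal rather than an image. The [1, 2, 1]/4 smoothing kernel, the linear-interpolation upsampling and all function names are assumptions chosen for illustration; the record does not specify them.

```python
def blur(signal):
    """Smooth with a [1, 2, 1] / 4 kernel, replicating the border samples."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + 2.0 * signal[i] + right) / 4.0)
    return out


def reduce_level(signal):
    """Blur, then keep every second sample to halve the resolution."""
    return blur(signal)[::2]


def expand_level(signal, size):
    """Upsample to `size` samples by linear interpolation."""
    out = []
    for i in range(size):
        pos = i / 2.0
        lo = min(int(pos), len(signal) - 1)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - int(pos)
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def build_pyramids(signal, levels):
    """Return (gaussian, laplacian) lists ordered from fine to coarse."""
    gaussian = [[float(v) for v in signal]]
    for _ in range(levels - 1):
        gaussian.append(reduce_level(gaussian[-1]))
    laplacian = []
    for fine, coarse in zip(gaussian, gaussian[1:]):
        up = expand_level(coarse, len(fine))
        # Band-pass detail: what the coarser level cannot represent.
        laplacian.append([f - u for f, u in zip(fine, up)])
    laplacian.append(gaussian[-1])  # coarsest level keeps the low-pass residual
    return gaussian, laplacian


def reconstruct(laplacian):
    """Invert the decomposition by expanding and adding, coarse to fine."""
    signal = laplacian[-1]
    for band in reversed(laplacian[:-1]):
        up = expand_level(signal, len(band))
        signal = [b + u for b, u in zip(band, up)]
    return signal
```

Because each Laplacian level stores exactly the difference between a Gaussian level and the expanded coarser level, `reconstruct` recovers the input up to floating-point rounding, whatever kernel is chosen.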

125 Citations

26 Claims

1. A method for estimating both three-dimensional (3D) scene structure and ego-motion from a batch of images of the scene obtained by a camera as it moves through the scene, the method comprising the steps of:
- defining a reference image and a plurality of inspection images in the batch of images;
  
  providing an initial estimate of the ego-motion and the scene structure for the batch of images;
  
  responsive to the initial estimate of ego-motion and scene structure, globally correlating each of the inspection images to the reference image to define a global ego-motion constraint for all of the inspection images relative to the reference image;
  
  refining the initial estimate of ego-motion based on the global ego-motion constraint;
  
  responsive to the initial estimate of ego-motion and scene structure, locally correlating each of the inspection images to the reference image to define a plurality of local structure constraints for all of the inspection images relative to the reference image;
  
  responsive to the plurality of local structure constraints, refining the initial estimate of scene structure in respective regions of the reference image corresponding to the plurality of local structure constraints.
2. A method according to claim 1, further including the step of warping each inspection image into a coordinate system defined by the reference image using the initial estimates of ego-motion and scene structure to define a respective plurality of warped inspection images, prior to the step of defining the global ego-motion constraint;
- wherein the steps of globally and locally correlating the inspection images to the reference images correlate the warped inspection images to the reference image.
3. A method according to claim 2, wherein the step of defining the global ego-motion constraint includes the steps of:
4. A method according to claim 2, wherein the step of defining the local structure constraint includes the steps of:
- forming a wavelet decomposition of the reference image and each of the warped inspection images to provide a plurality of corresponding resolution levels for each of the reference image and the warped inspection images; and
  
  selecting a point in the reference image;
  
  defining a window of points around the selected point;
  
  correlating, in each resolution level of each inspection image, a window of points corresponding to the defined window of points to a respective window of points in the corresponding resolution level of the reference image, wherein the correlation results of lower resolution levels are used to guide the correlating of higher resolution levels.
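The coarse-to-fine guidance recited in claim 4 can be sketched as follows. This is a simplified one-dimensional illustration, not the patented implementation. It assumes a sum-of-squared-differences match score, pair-averaging to build the resolution levels, and a search of one sample either side of the propagated estimate. The function names and parameters are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring sample pairs."""
    return [(signal[i] + signal[i + 1]) / 2.0
            for i in range(0, len(signal) - 1, 2)]


def window_ssd(ref, insp, center, half, shift):
    """Sum of squared differences between a reference window and the
    inspection window displaced by `shift` samples."""
    total = 0.0
    for k in range(center - half, center + half + 1):
        j = k + shift
        if k < 0 or k >= len(ref) or j < 0 or j >= len(insp):
            return float("inf")  # window leaves the signal: reject this shift
        total += (ref[k] - insp[j]) ** 2
    return total


def coarse_to_fine_shift(ref, insp, center, half=2, levels=3, search=1):
    """Estimate the shift s such that ref[k] matches insp[k + s] near `center`."""
    ref_pyr, insp_pyr = [ref], [insp]
    for _ in range(levels - 1):
        ref_pyr.append(downsample(ref_pyr[-1]))
        insp_pyr.append(downsample(insp_pyr[-1]))

    shift = 0
    for level in range(levels - 1, -1, -1):  # coarsest level first
        c = center >> level  # position of the selected point at this level
        candidates = range(shift - search, shift + search + 1)
        shift = min(
            candidates,
            key=lambda s: window_ssd(ref_pyr[level], insp_pyr[level], c, half, s),
        )
        if level > 0:
            shift *= 2  # the coarse result seeds the search at the finer level
    return shift
```

With three levels and a search radius of one sample per level, the sketch can recover shifts of several samples at full resolution while evaluating only three candidates per level.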
5. A method according to claim 1, wherein the estimate of ego-motion includes estimates of rotation and translation and wherein:
- the step of globally correlating each of the inspection images to the reference image includes the steps of:
  
  for each inspection image of the plurality of inspection images, determining a correlation surface defining the correlation between the inspection image and the reference image to provide a respective plurality of correlation surfaces;
  
  fitting each correlation surface of the plurality of correlation surfaces to a respective parametric surface to provide a respective plurality of parameterized correlation surfaces;
  
  classifying each parameterized correlation surface as a good correlation surface or as a bad correlation surface; and
  
  summing the good correlation surfaces to the relative exclusion of the bad correlation surfaces to provide a cumulative correlation surface; and
  
  the step of refining the initial estimate of ego-motion includes the steps of:
  
  assigning fixed values for the estimates of translation and scene structure;
  
  calculating a differential adjustment to the rotation estimate;
  
  assigning fixed values for the estimates of rotation and scene structure; and
  
  calculating a differential adjustment to the translation estimate.
6. A method according to claim 5, wherein the step of classifying each correlation surface as a good correlation surface or as a bad correlation surface includes the steps of:
- determining whether each parameterized correlation surface corresponds to an elliptic paraboloid;
  
  designating the parameterized correlation surfaces that correspond to elliptic paraboloids as good correlation surfaces and the quadratic correlation surfaces that do not correspond to elliptic paraboloids as bad correlation surfaces.
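The classification and summation steps of claims 5 and 6 can be sketched for a 3x3 correlation surface. The sketch makes several assumptions not stated in the record: the surface is sampled at integer offsets -1, 0, 1; the quadratic is fitted by central differences rather than a least-squares fit; and "to the relative exclusion of" is read as dropping bad surfaces entirely. All names are hypothetical.

```python
def fit_quadratic(surface):
    """Fit a*x^2 + b*x*y + c*y^2 + d*x + e*y + f to a 3x3 grid.

    surface[row][col] holds the sample at y = row - 1, x = col - 1.
    Central differences are exact when the samples are truly quadratic.
    """
    s = surface
    f = s[1][1]
    d = (s[1][2] - s[1][0]) / 2.0
    e = (s[2][1] - s[0][1]) / 2.0
    a = (s[1][2] - 2.0 * f + s[1][0]) / 2.0
    c = (s[2][1] - 2.0 * f + s[0][1]) / 2.0
    b = (s[2][2] - s[2][0] - s[0][2] + s[0][0]) / 4.0
    return (a, b, c, d, e, f)


def is_elliptic_paraboloid(coeffs, eps=1e-9):
    """True when the Hessian [[2a, b], [b, 2c]] has a positive determinant."""
    a, b, c = coeffs[:3]
    return 4.0 * a * c - b * b > eps


def sum_good_surfaces(surfaces):
    """Add the coefficients of the good surfaces; skip the bad ones.

    Returns the cumulative coefficients and the number of surfaces used.
    Note: a sum of bowls and domes need not itself be elliptic, so callers
    may want to re-check the result.
    """
    total = [0.0] * 6
    used = 0
    for surface in surfaces:
        coeffs = fit_quadratic(surface)
        if is_elliptic_paraboloid(coeffs):
            total = [t + k for t, k in zip(total, coeffs)]
            used += 1
    return tuple(total), used


def extremum(coeffs):
    """Location (x, y) of the single extremum of an elliptic paraboloid."""
    a, b, c, d, e, _ = coeffs
    det = 4.0 * a * c - b * b
    x = (b * e - 2.0 * c * d) / det
    y = (b * d - 2.0 * a * e) / det
    return x, y
```

An elliptic paraboloid has one well-defined extremum, so its offset from the origin gives a usable incremental correction. A saddle or a ridge does not, which is one plausible reason for treating such surfaces as outliers.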
7. A method according to claim 1, wherein the step of providing an initial estimate of the ego-motion and scene structure uses information provided by sensing modalities that are independent of the camera.
8. A method according to claim 1, wherein the batch of images is provided by a single camera.
9. A method according to claim 1, wherein the batch of images are stereo images provided by two cameras having a fixed separation.
10. A method according to claim 1, wherein the step of providing an initial estimate of ego-motion and scene structure obtains the initial estimate of scene structure by preparing a depth map using the reference image and the inspection image that is the stereo image corresponding to the reference image.
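For the stereo case of claims 9 and 10, one common way to turn a disparity map into an initial depth map is the rectified pinhole relation Z = f * B / d. The sketch below assumes that model. The claims do not state how the depth map is prepared, and the parameter names and units are illustrative.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for one pixel under a rectified pinhole stereo model.

    Returns None where the disparity is not positive (no valid match).
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def depth_map(disparities, focal_px, baseline_m):
    """Apply the per-pixel relation to a 2-D list of disparities."""
    return [
        [depth_from_disparity(d, focal_px, baseline_m) for d in row]
        for row in disparities
    ]
```

The fixed camera separation recited in claim 9 plays the role of `baseline_m` here: a known baseline is what lets disparity be converted to metric depth.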

11. Apparatus for estimating both three-dimensional (3D) scene structure and ego-motion from a batch of images of the scene comprising:
- at least one camera which obtains a batch of images including a reference image and a plurality of inspection images as the at least one camera moves through the scene;
  
  means for providing an initial estimate of the ego-motion and the scene structure for the batch of images;
  
  a correlation processor, responsive to the initial estimate of ego-motion and scene structure,
  
  1) to globally correlate each of the inspection images to the reference image, the correlation processor defining a global ego-motion constraint for all of the inspection images relative to the reference image and
  
  2) to locally correlate each of the inspection images to the reference image to define a plurality of local structure constraints for all of the inspection images relative to the reference image;
  
  a processor, coupled to the correlation processor to define a differential ego-motion estimate from the global ego-motion constraint and to define a differential structure estimate from the plurality of local structure constraints;
  
  a plurality of adders which add the differential ego-motion estimate to the initial ego-motion estimate to provide a refined ego-motion estimate and which add the differential structure estimate to the initial structure estimate to provide a refined structure estimate.
12. Apparatus according to claim 11, further including a warping processor which, responsive to the initial ego-motion and structure estimates, warps each inspection image into a coordinate system defined by the reference image and provides the warped inspection images to the correlation processor.
13. Apparatus according to claim 12, further including:
14. Apparatus according to claim 13, wherein the correlation processor, for each point in the reference image, defines a window of points around the selected point, correlates, in each resolution level of each inspection image, a window of points corresponding to the defined window of points to a respective window of points in the corresponding resolution level of the reference image.
15. Apparatus according to claim 11, further comprising additional sensing modalities that provide information regarding one of scene structure and ego-motion, and the apparatus further includes a processor which processes the information provided by the additional sensing modalities to provide the initial estimates of ego-motion and scene structure.
16. Apparatus according to claim 15, wherein the additional sensing modalities are selected from a group consisting essentially of a light amplification for detection and ranging (LADAR) system, an inertial navigation system and an odometry system.
17. Apparatus according to claim 11, wherein the at least one camera consists of a single camera.
18. Apparatus according to claim 11, wherein the at least one camera consists of two cameras having a fixed separation and the batch of images are corresponding stereo images.
19. Apparatus according to claim 18, further comprising a processor which receives the reference image and corresponding stereo image from the two cameras and processes the two images to generate a depth map of the scene wherein the depth map is provided as the initial estimate of scene structure.

20. An article of manufacture comprising a carrier containing computer program instructions, the computer program instructions controlling a general purpose computer to estimate both three-dimensional (3D) scene structure and ego-motion from a batch of images of the scene obtained by a camera as it moves through the scene, the computer program instructions causing the computer to perform the steps of:
- defining a reference image and a plurality of inspection images in the batch of images;
  
  providing an initial estimate of the ego-motion and the scene structure for the batch of images;
  
  responsive to the initial estimate of ego-motion and scene structure, globally correlating each of the inspection images to the reference image to define a global ego-motion constraint for all of the inspection images relative to the reference image;
  
  refining the initial estimate of ego-motion based on the global ego-motion constraint;
  
  responsive to the initial estimate of ego-motion and scene structure, locally correlating each of the inspection images to the reference image to define a plurality of local structure constraints for all of the inspection images relative to the reference image;
  
  responsive to the plurality of local structure constraints, refining the initial estimate of scene structure in respective regions of the reference image corresponding to the plurality of local structure constraints.
21. An article of manufacture according to claim 20, wherein the computer program instructions further cause the computer to warp each inspection image into a coordinate system defined by the reference image using the initial estimates of ego-motion and scene structure to define a respective plurality of warped inspection images, prior to the step of defining the global ego-motion constraint;
- wherein, in the steps of globally and locally correlating the inspection images to the reference images the computer program instructions cause the computer to correlate the warped inspection images to the reference image.
22. An article of manufacture according to claim 21, wherein the computer program instructions further cause the computer to form a Gaussian/Laplacian decomposition of the reference image and each of the warped inspection images to provide a plurality of corresponding resolution levels for each of the reference image and the warped inspection images and cause the computer to correlate each resolution level of each inspection image to the corresponding resolution level of the reference image.
23. An article of manufacture according to claim 22, wherein the computer program instructions that cause the computer to define the local structure constraint include computer program instructions that cause the computer to form a wavelet decomposition of the reference image and each of the warped inspection images to provide a plurality of corresponding resolution levels for each of the reference image and the warped inspection images, select a point in the reference image, define a window of points around the selected point, and correlate, in each Laplacian level of each inspection image, a window of points corresponding to the defined window of points to a respective window of points in the corresponding resolution level of the reference image.

24. A method of generating a depth map of a scene from a sequence of images of the scene obtained by a camera moving through the scene, the method comprising the steps of:
- a) selecting a first batch of images from the sequence of images;
  
  b) processing the first batch of images to generate estimates of ego-motion of the camera and structure of the scene;
  
  c) projecting the estimated structure of the first batch of images into a world coordinate system;
  
  d) selecting a further batch of images from the sequence of images, the further batch of images having a reference image that is included in a previous batch of images;
  
  e) using the estimated ego-motion for the previous batch of images, mapping the structure estimate for the previous batch of images into a coordinate system defined by the reference image of the further batch of images;
  
  f) processing the further batch of images to generate further estimates of ego-motion of the camera and structure of the scene;
  
  g) projecting the further estimated structure into the world coordinate system to be combined with the previously projected estimated structure;
  
  h) repeating steps d) through g) until a last batch of images in the sequence of images has been processed; and
  
  i) providing the combined projected estimated structure as the depth map.
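The batch selection in steps a), d) and h) can be sketched as overlapping index windows. The claim requires only that each further batch's reference image be included in a previous batch. Using the last image of the previous batch as the next reference image, and placing the reference first in each batch, are assumptions made for this illustration. The function name is hypothetical.

```python
def select_batches(num_images, batch_size):
    """Split image indices 0..num_images-1 into overlapping batches.

    The first index of each batch is treated as its reference image.
    Each further batch starts at the last image of the previous batch,
    so its reference image is shared with that previous batch.
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 so batches can advance")
    batches = []
    start = 0
    while start < num_images - 1:
        end = min(start + batch_size, num_images)
        batches.append(list(range(start, end)))
        start = end - 1  # the shared image becomes the next reference image
    return batches
```

The shared image is what allows the structure estimate from one batch to be mapped into the coordinate system of the next, as in step e).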
25. A method according to claim 24, further including the step of removing the mapped structure estimate from the combined projected estimated structure before processing the further batch of images.
26. A method according to claim 24, wherein the step of processing the further batch of images to generate further estimates of ego-motion of the camera and structure of the scene includes the steps of:

Current Assignee: SRI International, Inc.
Original Assignee: Sarnoff Corporation (SRI International, Inc.)
Inventors: Salgian, Garbis; Mandelbaum, Robert; Sawhney, Harpreet Singh
Primary Examiner(s): Au, Amelia M.
Assistant Examiner(s): Werner, Brian P.
Application Number: US09/616,005
Time in Patent Office: 467 Days
Field of Search: 382/103, 382/107, 382/104, 382/153, 382/154, 382/278, 382/260, 382/289, 382/295, 382/296, 382/299, 250/559.33, 348/116, 348/119, 348/47-50, 348/94, 348/95, 356/612, 356/614, 356/394, 901/47, 701/300, 701/301, 702/152, 702/153, 700/253, 700/255, 700/259
US Class Current: 382/154
CPC Class Codes:
- G06T 7/207 for motion estimation over ...
- G06T 7/55 from multiple images
