Systems and methods for estimating depth using stereo array cameras
First Claim
1. A stereo array camera, comprising:
a first array camera comprising a plurality of cameras that capture images of a scene from different viewpoints;
a single camera, where the single camera is spaced a known distance from the first array camera and captures at least one image of a scene from a different viewpoint to the viewpoints of the cameras in the first array camera;
a processor; and
memory in communication with the processor;
wherein software directs the processor to:
obtain a first set of images captured from different viewpoints using the first array camera and at least one image captured using the single camera, where the at least one image captured using the single camera is from a different viewpoint to the images in the first set of images;
select a reference viewpoint relative to the viewpoints of the cameras in the first array camera;
determine depth estimates for pixel locations in an image from the reference viewpoint using the images in the first set of images captured by the first array camera, wherein generating a depth estimate for a given pixel location in the image from the reference viewpoint comprises:
identifying corresponding pixels in at least two images from the first set of images captured by the first array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
determine whether the depth estimate for the given pixel location in the image from the reference viewpoint determined using the images in the first set of images captured by the first array camera corresponds to an observed disparity below a predetermined threshold; and
when the depth estimate corresponds to an observed disparity below the predetermined threshold, refining the depth estimate using the image captured by the single camera by:
identifying corresponding pixels in at least one image from the first set of images captured by the first array camera and the image captured by the single camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
comparing the similarity of the corresponding pixels in the at least one image from the first set of images captured by the first array camera and the image captured by the single camera identified as corresponding at each of the plurality of depths; and
selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
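The search recited in claim 1 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than something taken from the patent: the cameras are treated as rectified along one axis, images are single scanlines, disparity is modeled as focal length times baseline divided by depth (the `fb` values), similarity is a sum of absolute differences against the reference pixel, and the names `texture`, `render`, `best_depth`, `estimate`, and `run` are hypothetical.

```python
def texture(x):
    """Synthetic scene texture (an assumption, chosen so the demo is deterministic)."""
    return float((x * 37) % 101)


def render(fb, depth, width=512):
    """Scanline of a fronto-parallel plane at `depth`, seen by a camera whose
    focal length times baseline (relative to the reference camera) is `fb`."""
    shift = round(fb / depth)  # expected disparity in pixels
    return [texture(u + shift) for u in range(width)]


def best_depth(ref, views, x, depths):
    """Pick the candidate depth whose corresponding pixels best match ref[x].

    `views` is a list of (scanline, fb) pairs. For each candidate depth the
    corresponding pixel in each view is located from the expected disparity,
    and the depth with the lowest absolute-difference cost is selected.
    """
    def cost(z):
        return sum(abs(img[x - round(fb / z)] - ref[x]) for img, fb in views)
    return min(depths, key=cost)


def estimate(ref, array_views, far_view, x, depths, min_disparity):
    """Estimate depth from the array, then refine with the distant camera
    when the observed disparity falls below `min_disparity` pixels."""
    z = best_depth(ref, array_views, x, depths)
    widest_fb = max(fb for _, fb in array_views)
    refined = widest_fb / z < min_disparity
    if refined:
        # Wider baseline: the same depth range spans many more pixels.
        z = best_depth(ref, [far_view], x, depths)
    return z, refined


def run(true_depth, x=300, depths=(1, 2, 4, 8, 16)):
    """Build a synthetic scene at `true_depth` and run the two-stage estimate."""
    ref = render(0, true_depth)                       # reference viewpoint
    array_views = [(render(16, true_depth), 16),      # cameras inside the array
                   (render(32, true_depth), 32)]
    far_view = (render(256, true_depth), 256)         # the separate single camera
    return estimate(ref, array_views, far_view, x, depths, min_disparity=4)


print(run(16))  # far surface: array disparity is 2 px, so refinement is used
print(run(2))   # near surface: array disparity is 16 px, no refinement
```

In this sketch the threshold test uses the widest baseline inside the array, which is one reasonable reading of "observed disparity"; the claim itself does not say which camera pair the disparity is measured on.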
Abstract
Systems and methods for stereo imaging with camera arrays in accordance with embodiments of the invention are disclosed. In one embodiment, a method of generating depth information for an object using two or more array cameras that each include a plurality of imagers includes obtaining a first set of image data captured from a first set of viewpoints, identifying an object in the first set of image data, determining a first depth measurement, determining whether the first depth measurement is above a threshold, and when the depth is above the threshold: obtaining a second set of image data of the same scene from a second set of viewpoints located known distances from one viewpoint in the first set of viewpoints, identifying the object in the second set of image data, and determining a second depth measurement using the first set of image data and the second set of image data.
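The abstract tests whether a depth is above a threshold, while the claims test whether an observed disparity is below a threshold. Under the standard rectified pinhole stereo model these are the same condition, and the model also shows why a wider baseline helps. The relation below is that textbook model, not a formula quoted from this patent; the symbols f (focal length in pixels), B (baseline), z (depth), d (disparity), and δd (disparity measurement error) are assumptions introduced here.

```latex
d = \frac{f\,B}{z}
\qquad\Longrightarrow\qquad
\delta z \;\approx\; \left|\frac{\partial z}{\partial d}\right| \delta d
        \;=\; \frac{z^{2}}{f\,B}\,\delta d

% Worked example with f = 1000 px, z = 5 m, \delta d = 0.25 px:
%   B = 0.01 m (within one array):   d = 2 px,   \delta z \approx 0.625 m
%   B = 0.10 m (between two units):  d = 20 px,  \delta z \approx 0.0625 m
```

Because d falls as z grows, a large depth and a small disparity describe the same situation, and increasing B by a factor of ten reduces the depth uncertainty by the same factor in this example.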
19 Claims
1. A stereo array camera, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A stereo array camera, comprising:
a first array camera comprising a plurality of cameras wherein each of the plurality of cameras capture an image of a scene from a different viewpoint;
a single camera located a fixed baseline distance from the first array camera, where the second camera captures an image of the scene from a viewpoint that is different from the viewpoint of each of the plurality of cameras in the first array camera and the single camera and the first array camera are set farther apart than the cameras in the first array camera;
a processor; and
memory in communication with the processor;
wherein software directs the processor to:
obtain a first set of images captured from different viewpoints using the first array camera, where the images in the first set of images are captured from different viewpoints;
select a reference viewpoint relative to the viewpoints of the plurality of cameras used to capture the first set of images;
determine depth estimates for pixel locations in an image from the reference viewpoint using the images in the first set of images captured by the first array camera, wherein generating a depth estimate for a given pixel location in the image from the reference viewpoint comprises:
identifying corresponding pixels in each of at least two images from the first set of images captured by the first array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
determine whether a depth estimate for pixel locations in an image from the reference viewpoint determined using the images in the first set of images captured by the first array camera corresponds to an observed disparity below a predetermined threshold; and
when the depth estimate corresponds to an observed disparity below the predetermined threshold, refining the depth estimate using an image captured by the single camera by:
identifying corresponding pixels in at least one image from the first set of images captured by the first array camera and the image captured by the single camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
comparing the similarity of the corresponding pixels in the at least one image from the first set of images captured by the first array camera and the image captured by the second camera identified as corresponding at each of the plurality of depths; and
selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint; and
generate a depth map using the depth estimates for pixel locations in an image from the reference viewpoint, where the depth map indicates distances of surfaces of scene objects from the reference viewpoint.
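The final step of claim 18, assembling a depth map from per-pixel estimates, might look like the following sketch. It assumes the array-only estimates are already available as a 2-D list and that a callable supplies the refined depth for a pixel; `build_depth_map`, `coarse`, `refine`, `fb`, and `min_disparity` are hypothetical names, and the disparity model (focal length times baseline divided by depth) is the textbook rectified-stereo assumption rather than anything stated in the claim.

```python
def build_depth_map(coarse, refine, fb, min_disparity):
    """Merge array-only depth estimates with refined ones.

    coarse        -- 2-D list of positive depths estimated from the array alone
    refine        -- callable (row, col) -> depth estimated with the distant camera
    fb            -- focal length (pixels) times the array baseline
    min_disparity -- pixels; below this the array estimate is replaced
    """
    return [
        [refine(r, c) if fb / z < min_disparity else z
         for c, z in enumerate(row)]
        for r, row in enumerate(coarse)
    ]


# Toy input: two near pixels (2 m, 4 m) and two far pixels (16 m).
coarse = [[2.0, 16.0],
          [16.0, 4.0]]
depth_map = build_depth_map(coarse, lambda r, c: 20.0, fb=32, min_disparity=4)
print(depth_map)  # [[2.0, 20.0], [20.0, 4.0]]
```

Only the pixels whose array disparity is under the threshold call `refine`, which keeps the more expensive wide-baseline search limited to distant surfaces.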
19. A stereo array camera, comprising:
a first array camera comprising a first plurality of cameras wherein each of the first plurality of cameras capture an image of a scene from a different viewpoint;
a second array camera located a fixed baseline distance from the first array camera, where the second array camera comprises a second plurality of cameras wherein each of the second plurality of cameras capture an image of the scene from a different viewpoint to the viewpoints of the cameras in the first plurality of cameras in the first array camera and other cameras in the second plurality of cameras in the second array camera;
a processor; and
memory in communication with the processor;
wherein software directs the processor to:
obtain a first set of images captured from different viewpoints using the first array camera and a second set of images captured from different viewpoints using the second array camera, where the images in the first set of images and the second set of images are captured from different viewpoints;
select a reference viewpoint relative to the viewpoints of the first plurality of cameras used to capture the first set of images;
determine depth estimates for pixel locations in an image from the reference viewpoint using the images in the first set of images captured by the first array camera, wherein generating a depth estimate for a given pixel location in the image from the reference viewpoint comprises:
identifying corresponding pixels in at least two images having different viewpoints from the first set of images captured by the first array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
determine whether a depth estimate for the given pixel location in an image from the reference viewpoint determined using the at least two images in the first set of images captured by the first array camera corresponds to an observed disparity below a predetermined threshold; and
when the depth estimate corresponds to an observed disparity below the predetermined threshold, refining the depth estimate using an image in the second set of images captured by the second array camera by:
identifying corresponding pixels in the at least one image from the first set of images captured by the first array camera and at least one image from the second set of images captured by the second array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
comparing the similarity of the corresponding pixels in the at least one image captured by the first array camera and the at least one image captured by the second array camera identified as corresponding at each of the plurality of depths; and
selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
Specification