Systems and methods for estimating depth using stereo array cameras

US 9,800,859 B2
Filed: 05/06/2015
Issued: 10/24/2017
Est. Priority Date: 03/15/2013
Status: Active Grant

Abstract

Systems and methods for stereo imaging with camera arrays in accordance with embodiments of the invention are disclosed. In one embodiment, a method of generating depth information for an object using two or more array cameras that each include a plurality of imagers includes obtaining a first set of image data captured from a first set of viewpoints, identifying an object in the first set of image data, determining a first depth measurement, determining whether the first depth measurement is above a threshold, and when the depth is above the threshold: obtaining a second set of image data of the same scene from a second set of viewpoints located known distances from one viewpoint in the first set of viewpoints, identifying the object in the second set of image data, and determining a second depth measurement using the first set of image data and the second set of image data.
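The threshold test in the abstract follows from stereo geometry: under the standard pinhole model, disparity falls off as 1/depth, so a distant object produces little disparity across the short baselines inside a single array. The Python sketch below illustrates that relationship only and is not taken from the patent; the function names, the focal length expressed in pixels, and the example baselines are all illustrative assumptions.

```python
def disparity_px(depth_m: float, baseline_m: float, focal_px: float) -> float:
    """Disparity (pixels) of a point at depth_m seen by two cameras baseline_m apart."""
    return focal_px * baseline_m / depth_m


def depth_error_m(depth_m: float, baseline_m: float, focal_px: float,
                  disparity_error_px: float = 1.0) -> float:
    """First-order depth error caused by a disparity error of disparity_error_px."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


def needs_second_viewpoints(depth_m: float, depth_threshold_m: float) -> bool:
    """Test described in the abstract: go to the second set of viewpoints
    when the first depth measurement is above a threshold (assumed in metres)."""
    return depth_m > depth_threshold_m
```

With an assumed focal length of 1000 px, a point 5 m away produces about 2 px of disparity across a 10 mm baseline, so a one-pixel matching error corresponds to roughly 2.5 m of depth error. Across a 100 mm baseline the same point produces about 20 px of disparity and the error drops to roughly 0.25 m.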

19 Claims

1. A stereo array camera, comprising:
- a first array camera comprising a plurality of cameras that capture images of a scene from different viewpoints;
  
  a single camera, where the single camera is spaced a known distance from the first array camera and captures at least one image of a scene from a different viewpoint to the viewpoints of the cameras in the first array camera;
  
  a processor; and
  
  memory in communication with the processor;
  
  wherein software directs the processor to:
  
  obtain a first set of images captured from different viewpoints using the first array camera and at least one image captured using the single camera, where the at least one image captured using the single camera is from a different viewpoint to the images in the first set of images;
  
  select a reference viewpoint relative to the viewpoints of the cameras in the first array camera;
  
  determine depth estimates for pixel locations in an image from the reference viewpoint using the images in the first set of images captured by the first array camera, wherein generating a depth estimate for a given pixel location in the image from the reference viewpoint comprises:
  
  identifying corresponding pixels in at least two images from the first set of images captured by the first array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
  
  selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
  
  determine whether the depth estimate for the given pixel location in the image from the reference viewpoint determined using the images in the first set of images captured by the first array camera corresponds to an observed disparity below a predetermined threshold; and
  
  when the depth estimate corresponds to an observed disparity below the predetermined threshold, refining the depth estimate using the image captured by the single camera by:
  
  identifying corresponding pixels in at least one image from the first set of images captured by the first array camera and the image captured by the single camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels in the at least one image from the first set of images captured by the first array camera and the image captured by the single camera identified as corresponding at each of the plurality of depths; and
  
  selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
- Dependent claims 2 to 17:
  - 2. The stereo array camera of claim 1, wherein the first array camera and the single camera are set farther apart than the cameras in the first array camera.
  - 3. The stereo array camera of claim 2, wherein the first array camera and the single camera are located a fixed baseline distance apart.
  - 4. The stereo array camera of claim 1, wherein the single camera forms part of a second array camera comprising a plurality of cameras that capture images of a scene from different viewpoints.
  - 5. The stereo array camera of claim 4, wherein the first and second array cameras have the same number of cameras, and include cameras having the same resolution.
  - 6. The stereo array camera of claim 5, wherein the cameras in the first and second array cameras have the same arrangement of color filters.
  - 7. The stereo array camera of claim 1, wherein the baseline distance between the first array camera and the single camera is variable.
  - 8. The stereo array camera of claim 7, wherein:
    - the first array camera and the single camera further comprise internal sensors including gyroscopes and accelerometers; and
      
      software further directs the processor to estimate the baseline distance between the first array camera and the single camera from extrinsics determined from matching features in an image from the first set of images captured by the first array camera and an image in the second set of images captured by the second camera in combination with information from the gyroscopes and accelerometers.
  - 9. The stereo array camera of claim 1, wherein the first array camera forms an M×N array of cameras.
  - 10. The stereo array camera of claim 1, wherein software further directs the processor to select the plurality of depths at which pixels in images in the first set of images captured by the first array camera and the image captured by the single camera that correspond to the given pixel location in the image from the reference viewpoint are identified during refinement of the depth estimate based upon the depth estimate initially determined using the images in the first set of images captured by the first array camera.
  - 11. The stereo array camera of claim 1, wherein software further directs the processor to generate a depth map using the depth estimates for pixel locations in an image from the reference viewpoint, where the depth map indicates distances of surfaces of scene objects from the reference viewpoint.
  - 12. The stereo array camera of claim 11, wherein software further directs the processor to generate a depth map by identifying pixels in the image captured by the single camera corresponding to pixels for which depth estimates were determined using images in the first set of images captured by the first array camera and applying depth estimates determined using images from the first set of images captured by the first array camera to the corresponding pixels.
  - 13. The stereo array camera of claim 11, wherein software further configures the processor to synthesize an image from the first set of images captured by the first array camera using the depth map.
  - 14. The stereo array camera of claim 11, wherein software further configures the processor to synthesize an image from the first set of images captured by the first array camera and the image captured by the single camera using the depth map.
  - 15. The stereo array camera of claim 1, wherein the cameras in the first array camera and the single camera are cameras that image portions of the spectral band selected from the group consisting of red, blue, green, infrared, and extended color.
  - 16. The stereo array camera of claim 1, wherein the cameras in the first array camera form a π filter group.
  - 17. The stereo array camera of claim 16, wherein:
    - the single camera forms part of a second array camera comprising a plurality of cameras that capture images of a scene from different viewpoints; and
      
      the cameras in the second array camera form a π filter group.
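The depth search recited in claim 1 (identify corresponding pixels from the expected disparity at each candidate depth, compare their similarity, select the most similar depth) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: images are reduced to single rectified scanlines, the reference viewpoint coincides with one camera, similarity is a sum of absolute differences, and every name and parameter here is hypothetical.

```python
from typing import Sequence


def expected_disparity(depth: float, baseline: float, focal_px: float) -> float:
    """Expected disparity (pixels) at a hypothesized depth, pinhole model."""
    return focal_px * baseline / depth


def matching_cost(ref_row: Sequence[float], other_rows: Sequence[Sequence[float]],
                  baselines: Sequence[float], x: int, depth: float,
                  focal_px: float) -> float:
    """Sum of absolute differences at one depth hypothesis (lower = more similar)."""
    cost = 0.0
    for row, baseline in zip(other_rows, baselines):
        # Corresponding pixel: shift along the scanline by the expected disparity.
        xs = x - round(expected_disparity(depth, baseline, focal_px))
        if not 0 <= xs < len(row):
            return float("inf")  # hypothesis falls outside this image
        cost += abs(ref_row[x] - row[xs])
    return cost


def estimate_depth(ref_row: Sequence[float], other_rows: Sequence[Sequence[float]],
                   baselines: Sequence[float], x: int, depths: Sequence[float],
                   focal_px: float) -> float:
    """Pick the hypothesized depth whose corresponding pixels are most similar."""
    return min(depths,
               key=lambda z: matching_cost(ref_row, other_rows, baselines, x, z, focal_px))
```

Here `baselines` holds each camera's assumed offset from the reference viewpoint along the scanline. A cost that is lowest at the true depth plays the role of the "highest degree of similarity" in the claim language; other similarity measures could be substituted without changing the structure of the search.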

18. A stereo array camera, comprising:
- a first array camera comprising a plurality of cameras wherein each of the plurality of cameras capture an image of a scene from a different viewpoint;
  
  a single camera located a fixed baseline distance from the first array camera, where the second camera captures an image of the scene from a viewpoint that is different from the viewpoint of each of the plurality of cameras in the first array camera and the single camera and the first array camera are set farther apart than the cameras in the first array camera;
  
  a processor; and
  
  memory in communication with the processor;
  
  wherein software directs the processor to:
  
  obtain a first set of images captured from different viewpoints using the first array camera, where the images in the first set of images are captured from different viewpoints;
  
  select a reference viewpoint relative to the viewpoints of the plurality of cameras used to capture the first set of images;
  
  determine depth estimates for pixel locations in an image from the reference viewpoint using the images in the first set of images captured by the first array camera, wherein generating a depth estimate for a given pixel location in the image from the reference viewpoint comprises:
  
  identifying corresponding pixels in each of at least two images from the first set of images captured by the first array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
  
  selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
  
  determine whether a depth estimate for pixel locations in an image from the reference viewpoint determined using the images in the first set of images captured by the first array camera corresponds to an observed disparity below a predetermined threshold; and
  
  when the depth estimate corresponds to an observed disparity below the predetermined threshold, refining the depth estimate using an image captured by the single camera by:
  
  identifying corresponding pixels in at least one image from the first set of images captured by the first array camera and the image captured by the single camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels in the at least one image from the first set of images captured by the first array camera and the image captured by the second camera identified as corresponding at each of the plurality of depths; and
  
  selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint; and
  
  generate a depth map using the depth estimates for pixel locations in an image from the reference viewpoint, where the depth map indicates distances of surfaces of scene objects from the reference viewpoint.
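Claim 18 chains three steps: an initial estimate from the array, a disparity threshold test, and a refinement against the more distant single camera, with the results collected into a depth map. The sketch below walks one scanline through those steps. It is an illustration under assumptions rather than the patent's implementation: images are rectified scanlines, similarity is a sum of absolute differences, the "observed disparity" is taken at the widest baseline inside the array, and all names are hypothetical.

```python
from typing import List, Sequence


def disparity(depth: float, baseline: float, focal_px: float) -> float:
    """Expected disparity (pixels) at a given depth, pinhole model."""
    return focal_px * baseline / depth


def cost(ref_row, rows, baselines, x, depth, focal_px) -> float:
    """Sum of absolute differences at one depth hypothesis (lower = more similar)."""
    total = 0.0
    for row, baseline in zip(rows, baselines):
        xs = x - round(disparity(depth, baseline, focal_px))
        if not 0 <= xs < len(row):
            return float("inf")  # hypothesis falls outside this image
        total += abs(ref_row[x] - row[xs])
    return total


def best_depth(ref_row, rows, baselines, x, depths, focal_px) -> float:
    """Candidate depth with the lowest cost; ties keep the first candidate."""
    return min(depths, key=lambda z: cost(ref_row, rows, baselines, x, z, focal_px))


def depth_map_row(ref_row: Sequence[float], array_rows, array_baselines,
                  far_row: Sequence[float], far_baseline: float,
                  depths: Sequence[float], focal_px: float,
                  min_disparity_px: float) -> List[float]:
    """One row of a depth map: array estimate first, far-camera refinement if needed."""
    depth_row = []
    widest = max(array_baselines)
    for x in range(len(ref_row)):
        z = best_depth(ref_row, array_rows, array_baselines, x, depths, focal_px)
        # Low disparity inside the array means the estimate is poorly constrained.
        if disparity(z, widest, focal_px) < min_disparity_px:
            z = best_depth(ref_row, [far_row], [far_baseline], x, depths, focal_px)
        depth_row.append(z)
    return depth_row
```

Pixels near the image border, where every hypothesis falls outside the far image, simply keep the first candidate depth in this sketch. A fuller implementation would likely mark such pixels as unreliable instead.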

19. A stereo array camera, comprising:
- a first array camera comprising a first plurality of cameras wherein each of the first plurality of cameras capture an image of a scene from a different viewpoint;
  
  a second array camera located a fixed baseline distance from the first array camera, where the second array camera comprises a second plurality of cameras wherein each of the second plurality of cameras capture an image of the scene from a different viewpoint to the viewpoints of the cameras in the first plurality of cameras in the first array camera and other cameras in the second plurality of cameras in the second array camera;
  
  a processor; and
  
  memory in communication with the processor;
  
  wherein software directs the processor to:
  
  obtain a first set of images captured from different viewpoints using the first array camera and a second set of images captured from different viewpoints using the second array camera, where the images in the first set of images and the second set of images are captured from different viewpoints;
  
  select a reference viewpoint relative to the viewpoints of the first plurality of cameras used to capture the first set of images;
  
  determine depth estimates for pixel locations in an image from the reference viewpoint using the images in the first set of images captured by the first array camera, wherein generating a depth estimate for a given pixel location in the image from the reference viewpoint comprises:
  
  identifying corresponding pixels in at least two images having different viewpoints from the first set of images captured by the first array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
  
  selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
  
  determine whether a depth estimate for the given pixel location in an image from the reference viewpoint determined using the at least two images in the first set of images captured by the first array camera corresponds to an observed disparity below a predetermined threshold; and
  
  when the depth estimate corresponds to an observed disparity below the predetermined threshold, refining the depth estimate using an image in the second set of images captured by the second array camera by:
  
  identifying corresponding pixels in the at least one image from the first set of images captured by the first array camera and at least one image from the second set of images captured by the second array camera that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels in the at least one image captured by the first array camera and the at least one image captured by second array camera identified as corresponding at each of the plurality of depths; and
  
  selecting a depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
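The refinement in claim 19 compares a pixel from the first array with corresponding pixels in one or more images from the second array, and claim 10 describes choosing the refinement depths based on the initial estimate. The sketch below combines the two ideas for a single pixel. It is a hedged illustration only: the even spacing of candidates in disparity, the sum-of-absolute-differences similarity, the scanline image model, and every function and parameter name are assumptions, not details confirmed by the patent.

```python
from typing import List, Sequence


def depths_near(initial_depth: float, baseline: float, focal_px: float,
                steps: int = 2, step_px: float = 1.0) -> List[float]:
    """Depth hypotheses spaced step_px apart in disparity around initial_depth."""
    d0 = focal_px * baseline / initial_depth
    candidates = []
    for k in range(-steps, steps + 1):
        d = d0 + k * step_px
        if d > 0:  # a non-positive disparity has no finite depth
            candidates.append(focal_px * baseline / d)
    return candidates


def cost(ref_row: Sequence[float], rows: Sequence[Sequence[float]],
         baselines: Sequence[float], x: int, depth: float, focal_px: float) -> float:
    """Sum of absolute differences at one depth hypothesis (lower = more similar)."""
    total = 0.0
    for row, baseline in zip(rows, baselines):
        xs = x - round(focal_px * baseline / depth)
        if not 0 <= xs < len(row):
            return float("inf")  # hypothesis falls outside this image
        total += abs(ref_row[x] - row[xs])
    return total


def refine_depth(ref_row: Sequence[float], second_rows: Sequence[Sequence[float]],
                 second_baselines: Sequence[float], x: int,
                 initial_depth: float, focal_px: float) -> float:
    """Re-run the depth search against second-array images near the initial estimate."""
    candidates = depths_near(initial_depth, min(second_baselines), focal_px)
    return min(candidates,
               key=lambda z: cost(ref_row, second_rows, second_baselines, x, z, focal_px))
```

Here `second_baselines` holds each second-array camera's assumed offset from the reference viewpoint, that is, the fixed inter-array baseline plus the camera's position within its own array. Restricting the search to a few candidates around the initial estimate keeps the refinement cheap, while the wider baseline separates candidates that the first array alone could not tell apart.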

Current Assignee: FotoNation Limited (Adeia Inc.)
Original Assignee: Fotonation Cayman Limited (Adeia Inc.)
Inventors: Venkataraman, Kartik; Gallagher, Paul; Jain, Ankit; Nisenzon, Semyon
Primary Examiner(s): Le, Peter D
Application Number: US14/705,885
Publication Number: US 20150245013A1
Time in Patent Office: 902 Days

CPC Class Codes

G01P 3/38   using photographic means

G06T 2207/10012   Stereo images

G06T 2207/10021   Stereoscopic video; Stereos...

G06T 7/285   using a sequence of stereo ...

G06T 7/55   from multiple images

G06T 7/557   from light fields, e.g. fro...

G06T 7/579   from motion

G06T 7/593   from stereo images

H04N 13/243   using three or more 2D imag...
