Systems and methods for parallax detection and correction in images captured using array cameras that contain occlusions using subsets of images to perform depth estimation

US 8,619,082 B1
Filed: 08/21/2013
Issued: 12/31/2013
Est. Priority Date: 08/21/2012
Status: Active Grant

Abstract

Systems in accordance with embodiments of the invention can perform parallax detection and correction in images captured using array cameras. Due to the different viewpoints of the cameras, parallax results in variations in the position of objects within the captured images of the scene. Methods in accordance with embodiments of the invention provide an accurate account of the pixel disparity due to parallax between the different cameras in the array, so that appropriate scene-dependent geometric shifts can be applied to the pixels of the captured images when performing super-resolution processing. In several embodiments, detecting parallax involves using competing subsets of images to estimate the depth of a pixel location in an image from a reference viewpoint. In a number of embodiments, generating depth estimates considers the similarity of pixels in multiple spectral channels. In certain embodiments, generating depth estimates involves generating a confidence map indicating the reliability of depth estimates.
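
The depth search summarized in the abstract can be illustrated with a minimal one-dimensional sketch in Python. It is not the patented implementation. It assumes rectified cameras on a single line, integer disparity hypotheses standing in for the plurality of depths, variance as the similarity measure, and a fixed mismatch threshold. The names `sweep`, `estimate`, `subsets`, and `mismatch_threshold` are hypothetical.

```python
from statistics import pvariance


def sweep(images, baselines, x, disparities):
    """Test each disparity hypothesis for reference pixel x.

    Returns (best_disparity, cost), where cost is the variance of the
    corresponding pixels (lower variance means higher similarity).
    """
    width = len(images[0])
    best_d, best_cost = None, float("inf")
    for d in disparities:
        samples = []
        for img, b in zip(images, baselines):
            xs = x + round(b * d)  # expected position in this camera
            if 0 <= xs < width:
                samples.append(img[xs])
        if len(samples) < 2:
            continue  # not enough views to compare
        cost = pvariance(samples)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


def estimate(images, baselines, x, disparities, subsets, mismatch_threshold):
    """Initial estimate from all images, falling back to competing subsets."""
    d0, c0 = sweep(images, baselines, x, disparities)
    if c0 <= mismatch_threshold:
        return d0  # no mismatch detected: keep the initial estimate
    candidates = []
    for subset in subsets:
        imgs = [images[i] for i in subset]
        base = [baselines[i] for i in subset]
        d, c = sweep(imgs, base, x, disparities)
        candidates.append((c, d))
    # Keep the candidate from the subset with the most similar pixels.
    return min(candidates, key=lambda cd: cd[0])[1]
```

In this sketch each subset would contain the reference image plus the cameras on one side of it, so that at least one subset is likely to see the pixel without occlusion.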

28 Claims

1. A method of estimating distances to objects within a scene from a light field comprising a set of images captured from different viewpoints using a processor configured by an image processing application, the method comprising:
- selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints;
  
  normalizing the set of images to increase the similarity of corresponding pixels within the set of images;
  
  determining initial depth estimates for pixel locations in an image from the reference viewpoint using at least a subset of the set of images, where an initial depth estimate for a given pixel location in the image from the reference viewpoint is determined by:
  
  identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
  
  selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as an initial depth estimate for the given pixel location in the image from the reference viewpoint;
  
  identifying corresponding pixels in the set of images using the initial depth estimates;
  
  comparing the similarity of the corresponding pixels in the set of images to detect mismatched pixels;
  
  when an initial depth estimate does not result in the detection of a mismatch between corresponding pixels in the set of images, selecting the initial depth estimate as the current depth estimate for the pixel location in the image from the reference viewpoint;
  
  when an initial depth estimate results in the detection of a mismatch between corresponding pixels in the set of images, selecting the current depth estimate for the pixel location in the image from the reference viewpoint by:
  
  determining a set of candidate depth estimates using a plurality of different subsets of the set of images;
  
  identifying corresponding pixels in each of the plurality of subsets of the set of images based upon the candidate depth estimates; and
  
  selecting the candidate depth of the subset having the most similar corresponding pixels as the current depth estimate for the pixel location in the image from the reference viewpoint.
- - 2. The method of claim 1, wherein selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints comprises selecting a viewpoint from the set consisting of:
    - the viewpoint of one of the images; and
      
      a virtual viewpoint.
  - 3. The method of claim 1, wherein a pixel in a given image from the set of images that corresponds to a pixel location in the image from the reference viewpoint is determined by applying a scene dependent shift to the pixel location in the image from the reference viewpoint that is determined based upon:
    - the depth estimate of the pixel location in the image from the reference viewpoint; and
      
      the baseline between the viewpoint of the given image and the reference viewpoint.
  - 4. The method of claim 1, wherein the subsets of the set of images used to determine the set of candidate depth estimates are selected based upon the viewpoints of the images in the sets of images to exploit patterns of visibility characteristic of natural scenes that are likely to result in at least one subset in which a given pixel location in the image from the reference viewpoint is visible in each image in the subset.
  - 5. The method of claim 4, wherein:
    - the set of images are captured within multiple color channels;
      
      selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints comprises selecting one of the images as a reference image and selecting the viewpoint of the reference image as the reference viewpoint; and
      
      the subsets of the set of images used to determine the set of candidate depth estimates are selected so that the same number of images in the color channel containing the reference image appears in each subset.
  - 6. The method of claim 5, wherein the subsets of the set of images used to determine the set of candidate depth estimates are also selected so that there are at least two images in the color channels that do not contain the reference image in each subset.
  - 7. The method of claim 1, further comprising:
    - determining the visibility of the pixels in the set of images from the reference viewpoint by:
      
      identifying corresponding pixels in the set of images using the current depth estimates; and
      
      determining that a pixel in a given image is not visible in the image from the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels.
  - 8. The method of claim 7, wherein:
    - selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints comprises selecting one of the images in the set of images as a reference image and selecting the viewpoint of the reference image as the reference viewpoint; and
      
      determining that a pixel in a given image is not visible in the image from the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels further comprises comparing the pixel in the given image to the corresponding pixel in the reference image.
  - 9. The method of claim 8, wherein the photometric similarity criterion comprises a similarity threshold that adapts based upon at least the intensity of at least one of the pixel in the given image and the pixel in the reference image.
  - 10. The method of claim 8, wherein the photometric similarity criterion comprises a similarity threshold that adapts as a function of the photometric distance between the corresponding pixel from the reference image and the corresponding pixel that is most similar to the pixel from the reference image.
  - 11. The method of claim 7, wherein:
    - selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints comprises selecting a virtual viewpoint as the reference viewpoint; and
      
      determining that a pixel in a given image is not visible in the image from the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels further comprises:
      
      selecting an image adjacent the virtual viewpoint as a reference image; and
      
      comparing the pixel in the given image to the corresponding pixel in the reference image.
  - 12. The method of claim 7, further comprising updating the depth estimate for a given pixel location in the image from the reference viewpoint based upon the visibility of the pixels in the set of images from the reference viewpoint by:
    - generating an updated subset of the set of images using images in which the given pixel location in the image from the reference viewpoint is determined to be visible based upon the current depth estimate for the given pixel;
      
      identifying pixels in the updated subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
      
      comparing the similarity of the corresponding pixels in the updated subset of images identified at each of the plurality of depths; and
      
      selecting the depth from the plurality of depths at which the identified corresponding pixels in the updated subset of the set of images have the highest degree of similarity as an updated depth estimate for the given pixel location in the image from the reference viewpoint.
  - 13. The method of claim 1, wherein normalizing the set of images to increase the similarity of corresponding pixels within the set of images further comprises:
    - utilizing calibration information to correct for photometric variations and scene-independent geometric distortions in the images in the set of images; and
      
      rectification of the images in the set of images.
  - 14. The method of claim 13, wherein normalizing the set of images to increase the similarity of corresponding pixels within the set of images further comprises resampling the images to increase the similarity of corresponding pixels in the set of images; and
    - the scene-independent geometric corrections applied to the images are determined at a sub-pixel resolution.
  - 15. The method of claim 1, wherein a cost function is utilized to determine the similarity of corresponding pixels.
  - 16. The method of claim 15, wherein determining the similarity of corresponding pixels further comprises spatially filtering the calculated costs.
  - 17. The method of claim 15, wherein selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as an initial depth estimate for the given pixel location in the image from the reference viewpoint further comprises selecting the depth from the plurality of depths at which the filtered cost function for the identified corresponding pixels indicates the highest level of similarity.
  - 18. The method of claim 15, wherein the cost function utilizes at least one similarity measure selected from the group consisting of:
    - the L1 norm of a pair of corresponding pixels;
      
      the L2 norm of a pair of corresponding pixels; and
      
      the variance of a set of corresponding pixels.
  - 19. The method of claim 15, wherein the set of images are captured within multiple color channels and the cost function determines the similarity of pixels in each of the multiple color channels.
  - 20. The method of claim 1, further comprising generating confidence metrics for the current depth estimates for pixel locations in the image from the reference viewpoint.
  - 21. The method of claim 20, wherein the confidence metric encodes a plurality of confidence factors.
  - 22. The method of claim 1, further comprising:
    - detecting occlusion of pixels in images within the set of images that correspond to specific pixel locations in the image from the reference viewpoint based upon the initial depth estimates by searching along lines parallel to the baselines between the reference viewpoint and the viewpoints of the images in the set of images to locate occluding pixels;
      
      when an initial depth estimate results in the detection of a corresponding pixel in at least one image being occluded, selecting the current depth estimate for the pixel location in the image from the reference viewpoint by:
      
      determining a set of candidate depth estimates using a plurality of different subsets of the set of images that exclude the at least one image in which the given pixel is occluded;
      
      identifying corresponding pixels in each of the plurality of subsets of the set of images based upon the candidate depth estimates; and
      
      selecting the candidate depth of the subset having the most similar corresponding pixels as the current depth estimate for the pixel location in the image from the reference viewpoint.
  - 23. The method of claim 22, wherein searching along lines parallel to the baselines between the reference viewpoint and the viewpoints of the images in the set of images to locate occluding pixels further comprises determining that a pixel corresponding to a pixel location (x₁, y₁) in an image from the reference viewpoint is occluded in an alternate view image by a pixel location (x₂, y₂) in the image from the reference viewpoint when

    $$\left| s_2 - s_1 - \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \right| \le \mathrm{threshold}$$

    where s₁ and s₂ are scene dependent geometric shifts applied to pixel locations (x₁, y₁) and pixel (x₂, y₂) to shift the pixels along a line parallel to the baseline between the reference viewpoint and the viewpoint of the alternate view image to shift the pixels into the viewpoint of the alternate view image based upon the initial depth estimates for each pixel.
  - 24. The method of claim 22, wherein the decision to designate a pixel as being occluded considers at least one of the similarity of the pixels and the confidence of the estimated depths of the pixels (x₁, y₁) and (x₂, y₂).
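
The occlusion condition recited in claim 23 can be written as a small predicate. The following Python sketch is illustrative only: the function names `occludes` and `find_occluders`, the tuple layout of the candidates, and the choice of threshold are assumptions, not part of the patent text.

```python
import math


def occludes(p1, s1, p2, s2, threshold):
    """Claim 23 test: does pixel p2 occlude pixel p1 in the alternate view?

    p1, p2 are (x, y) locations in the reference image; s1, s2 are their
    scene dependent shifts along the baseline toward the alternate view.
    """
    (x1, y1), (x2, y2) = p1, p2
    distance = math.hypot(x2 - x1, y2 - y1)
    return abs(s2 - s1 - distance) <= threshold


def find_occluders(p1, s1, candidates, threshold):
    """Scan candidate pixels lying on the line parallel to the baseline.

    candidates is an iterable of ((x2, y2), s2) pairs (hypothetical layout).
    """
    return [p2 for p2, s2 in candidates if occludes(p1, s1, p2, s2, threshold)]
```

The predicate holds when the difference between the two shifts matches the distance between the two pixel locations to within the threshold, which is exactly the inequality in the claim.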

25. A method of synthesizing a higher resolution image from a light field comprising a set of lower resolution images captured from different viewpoints, the method comprising:
- estimating distances to objects within a scene from the light field comprising a set of images captured from different viewpoints using a processor configured by an image processing application, the method comprising:
  
  selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints;
  
  normalizing the set of images to increase the similarity of corresponding pixels within the set of images;
  
  determining initial depth estimates for pixel locations in an image from the reference viewpoint using at least a subset of the set of images, where an initial depth estimate for a given pixel location in the image from the reference viewpoint is determined by:
  
  identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
  
  selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as an initial depth estimate for the given pixel location in the image from the reference viewpoint;
  
  identifying corresponding pixels in the set of images using the initial depth estimates;
  
  comparing the similarity of the corresponding pixels in the set of images to detect mismatched pixels;
  
  when an initial depth estimate does not result in the detection of a mismatch between corresponding pixels in the set of images, selecting the initial depth estimate as the current depth estimate for the pixel location in the image from the reference viewpoint;
  
  when an initial depth estimate results in the detection of a mismatch between corresponding pixels in the set of images, selecting the current depth estimate for the pixel location in the image from the reference viewpoint by:
  
  determining a set of candidate depth estimates using a plurality of different subsets of the set of images;
  
  identifying corresponding pixels in each of the plurality of subsets of the set of images based upon the candidate depth estimates; and
  
  selecting the candidate depth of the subset having the most similar corresponding pixels as the current depth estimate for the pixel location in the image from the reference viewpoint;
  
  determining the visibility of the pixels in the set of images from the reference viewpoint by:
  
  identifying corresponding pixels in the set of images using the current depth estimates; and
  
  determining that a pixel in a given image is not visible in the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels; and
  
  fusing pixels from the set of images using the processor configured by the image processing application based upon the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by:
  
  identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the visibility information; and
  
  applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the current depth estimates; and
  
  fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.
- - 26. The method of claim 25, further comprising synthesizing an image from the reference viewpoint using the processor configured by the image processing application to perform a super-resolution process based upon the fused image from the reference viewpoint, the set of images captured from different viewpoints, the current depth estimates, and the visibility information.
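
The fusion step of claim 25 can be pictured with a one-dimensional toy in Python. This is a sketch under stated assumptions rather than the claimed process: it places each visible pixel on the nearest cell of a higher resolution grid and averages collisions, and the names `fuse`, `shifts`, `visible`, and `scale` are hypothetical.

```python
def fuse(images, shifts, visible, scale):
    """Fuse low resolution rows onto a grid `scale` times finer.

    images[k][x]  : intensity of pixel x in image k
    shifts[k][x]  : scene dependent shift (in low resolution pixels) that
                    moves the pixel into the reference viewpoint
    visible[k][x] : True when the pixel is visible from the reference viewpoint
    """
    width = len(images[0]) * scale
    total = [0.0] * width
    count = [0] * width
    for img, sh, vis in zip(images, shifts, visible):
        for x, value in enumerate(img):
            if not vis[x]:
                continue  # occluded pixels are left out of the fusion
            target = round((x + sh[x]) * scale)  # nearest high resolution cell
            if 0 <= target < width:
                total[target] += value
                count[target] += 1
    # Cells that received no pixel stay empty (None) in this sketch.
    return [t / c if c else None for t, c in zip(total, count)]
```

Cells left empty here would, in a fuller pipeline, be filled by interpolation or by the super-resolution process of claim 26.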

27. An image processing system, comprising:
- a processor;
  
  memory containing a set of images captured from different viewpoints and an image processing application;
  
  wherein the image processing application configures the processor to:
  
  select a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints;
  
  normalize the set of images to increase the similarity of corresponding pixels within the set of images;
  
  determine initial depth estimates for pixel locations in an image from the reference viewpoint using at least a subset of the set of images, where an initial depth estimate for a given pixel location in the image from the reference viewpoint is determined by:
  
  identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  
  comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and
  
  selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as an initial depth estimate for the given pixel location in the image from the reference viewpoint;
  
  identify corresponding pixels in the set of images using the initial depth estimates;
  
  compare the similarity of the corresponding pixels in the set of images to detect mismatched pixels;
  
  when an initial depth estimate does not result in the detection of a mismatch between corresponding pixels in the set of images, select the initial depth estimate as the current depth estimate for the pixel location in the image from the reference viewpoint;
  
  when an initial depth estimate results in the detection of a mismatch between corresponding pixels in the set of images, select the current depth estimate for the pixel location in the image from the reference viewpoint by:
  
  determining a set of candidate depth estimates using a plurality of different subsets of the set of images;
  
  identifying corresponding pixels in each of the plurality of subsets of the set of images based upon the candidate depth estimates; and
  
  selecting the candidate depth of the subset having the most similar corresponding pixels as the current depth estimate for the pixel location in the image from the reference viewpoint.
- - 28. The image processing system of claim 27, wherein the image processing application further configures the processor to:
    - determine the visibility of the pixels in the set of images from the reference viewpoint by:
      
      identifying corresponding pixels in the set of images using the current depth estimates; and
      
      determining that a pixel in a given image is not visible in the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels; and
      
      fuse pixels from the set of images using the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by:
      
      identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the visibility information; and
      
      applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the current depth estimates; and
      
      fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.
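
The visibility determination recited in claim 28 compares corresponding pixels against a photometric similarity criterion. A minimal Python sketch follows. The absolute-difference measure, the linear intensity term, and the names `visibility`, `base_threshold`, and `intensity_gain` are assumptions chosen for illustration; the claims do not fix a particular formula.

```python
def visibility(ref_value, corresponding, base_threshold, intensity_gain=0.0):
    """Flag which corresponding pixels are treated as visible.

    ref_value      : intensity of the pixel in the reference image
    corresponding  : intensities of the corresponding pixels in other images
    base_threshold : fixed part of the similarity threshold
    intensity_gain : optional term that widens the threshold for brighter
                     reference pixels (an assumed adaptive rule)
    """
    threshold = base_threshold + intensity_gain * ref_value
    # A pixel that differs too much from the reference is marked not visible.
    return [abs(v - ref_value) <= threshold for v in corresponding]
```

With `intensity_gain` left at zero the threshold is fixed; a positive value gives a simple stand-in for a threshold that adapts to pixel intensity.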

Current Assignee: FotoNation Limited (Adeia Inc.)
Original Assignee: Pelican Imaging Corporation
Inventors: Ciurea, Florian; Venkataraman, Kartik; Molina, Gabriel; Lelescu, Dan
Primary Examiner(s): Tung, Kee M
Assistant Examiner(s): Du, Haixia
Application Number: US13/972,881
Time in Patent Office: 132 Days
Field of Search: None
US Class Current: 345/427

CPC Class Codes
- G02B 27/0075: with means for altering, e....
- G06T 15/20: Perspective computation
- G06T 2200/21: involving computational pho...
- G06T 2207/10012: Stereo images
- G06T 2207/10024: Color image
- G06T 2207/10052: Images from lightfield camera
- G06T 7/557: from light fields, e.g. fro...
- G06T 7/593: from stereo images
- G06T 7/85: Stereo camera calibration
- H04N 13/128: Adjusting depth or disparity
- H04N 13/232: using fly-eye lenses, e.g. ...
- H04N 13/243: using three or more 2D imag...
- H04N 2013/0081: Depth or disparity estimati...
- H04N 2013/0088: Synthesising a monoscopic i...
- H04N 23/16: Optical arrangements associ...
