Systems and methods for performing depth estimation using image data from multiple spectral channels

US 8,780,113 B1
Filed: 12/30/2013
Issued: 07/15/2014
Est. Priority Date: 08/21/2012
Status: Active Grant

13 Assignments

0 Petitions

Abstract

Systems in accordance with embodiments of the invention can perform parallax detection and correction in images captured using array cameras. Due to the different viewpoints of the cameras, parallax results in variations in the position of objects within the captured images of the scene. Methods in accordance with embodiments of the invention provide an accurate account of the pixel disparity due to parallax between the different cameras in the array, so that appropriate scene-dependent geometric shifts can be applied to the pixels of the captured images when performing super-resolution processing. In a number of embodiments, generating depth estimates considers the similarity of pixels in multiple spectral channels. In certain embodiments, generating depth estimates involves generating a confidence map indicating the reliability of depth estimates.
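The parallax the abstract describes can be illustrated with a small sketch. It assumes an idealized pair of parallel pinhole cameras, for which the disparity in pixels equals baseline times focal length divided by depth; the function name, the baseline, the focal length, and the candidate depths are all hypothetical values chosen for illustration and do not come from the patent.

```python
def expected_disparity(baseline_m, focal_px, depth_m):
    """Pixel disparity of a point at depth_m between two parallel pinhole cameras.

    Assumption: idealized rectified geometry, not the patent's calibration model.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return baseline_m * focal_px / depth_m


# Hypothetical array-camera numbers: 1 cm baseline, 1000 px focal length.
candidate_depths = [0.5, 1.0, 2.0, 4.0]
disparities = [expected_disparity(0.01, 1000.0, z) for z in candidate_depths]

for z, d in zip(candidate_depths, disparities):
    print(f"depth {z} m -> disparity {d:.2f} px")
```

Nearer objects shift more between viewpoints than distant ones, which is why the geometric shift applied to each pixel has to depend on the scene.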

209 Citations

30 Claims

1. A method of estimating distances to objects within a scene from a light field comprising a set of images captured from different viewpoints and within multiple color channels using a processor configured by an image processing application, the method comprising:
- selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints and within multiple color channels;
- normalizing the set of images to increase the similarity of pixels from the set of images that correspond in a specific color channel from the multiple color channels;
- determining depth estimates for pixel locations in an image from the reference viewpoint using at least a subset of the set of images, where a depth estimate for a given pixel location in the image from the reference viewpoint is determined by:
  - identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
  - in each of a plurality of color channels selected from the multiple color channels, comparing the similarity of the pixels that are identified as corresponding in the selected color channel at each of the plurality of depths; and
  - selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
- Dependent claims (2-22):
  - 2. The method of claim 1, wherein a cost function is utilized to determine the similarity of the pixels identified as corresponding in the plurality of color channels.
  - 3. The method of claim 2, wherein determining the similarity of corresponding pixels further comprises spatially filtering the calculated costs.
  - 4. The method of claim 3, wherein the spatial filtering of the calculated costs utilizes a filter selected from the group consisting of:
    - a fixed-coefficient filter; and
    - an edge-preserving filter.
  - 5. The method of claim 4, wherein selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as the depth estimate for the given pixel location in the image from the reference viewpoint further comprises selecting the depth from the plurality of depths at which the spatially filtered cost function for the identified corresponding pixels in each of the plurality of color channels indicates the highest level of similarity.
  - 6. The method of claim 2, wherein the cost function is an aggregated cost function CV(x,y,d) over each image i within the images from the set of images that are in the same color channel as a reference image, where the cost function includes the following term
  - 7. The method of claim 6, wherein the individual costs Cost^i,Ref(x,y,d) are computed based on each disparity hypothesis d for the pixel location (x, y) in the image from the reference viewpoint (Ref) for the image i in the set of images in the same color channel as a reference image as follows:
    - $\mathrm{Cost}^{i,\mathrm{Ref}}(x,y,d) = S\{I^{i}(x,y,d),\, I^{\mathrm{Ref}}(x,y,d)\}$, where $S$ is a similarity measure, and $I^{i}$ is the normalized image $i$ from the set of images.
  - 8. The method of claim 7, wherein the aggregated cost function is spatially filtered using a filter so that the weighted aggregated cost function is as follows:
  - 9. The method of claim 8, wherein the filter is a box filter and wd and wr are constant coefficients.
  - 10. The method of claim 8, wherein the filter is a bilateral filter and wd and wr are both Gaussian weighting functions.
  - 11. The method of claim 8, wherein a depth estimate for a pixel location (x, y) in the image from the reference viewpoint is determined by selecting the depth that minimizes the filtered cost at each pixel location in the depth map as follows:
    - $D(x,y) = \arg\min_{d}\{\mathrm{FilteredCV}(x,y,d)\}$.
  - 12. The method of claim 2, wherein the cost function incorporates the L1 norm of pixels from the multiple color channels.
  - 13. The method of claim 2, wherein the cost function incorporates the L2 norm of pixels from the multiple color channels.
  - 14. The method of claim 2, wherein:
    - the set of images are captured within multiple color channels including at least Red, Green and Blue color channels;
    - selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints comprises selecting one of the images in the Green color channel as a Green reference image and selecting the viewpoint of the Green reference image as the reference viewpoint; and
    - the cost function Cost(x,y,d) for a pixel location (x, y) in the image from the reference viewpoint at a depth d is:
  - 15. The method of claim 14, wherein the Cost_G(x,y,d) uses a similarity measure selected from the group consisting of an L1 norm, an L2 norm, and variance across the pixels in the images in the set of images that are within the Green color channel.
  - 16. The method of claim 14, wherein the cost measures for the Red (Cost_R(x,y,d)) and Blue color channels (Cost_B(x,y,d)) are determined by calculating the aggregated difference between unique pairs of corresponding pixels in images within the color channel.
  - 17. The method of claim 16, wherein calculating the aggregated difference between each unique pair of corresponding pixels in images within a color channel comprises determining a combination cost metric for unique pairs of corresponding pixels in images within the color channel.
  - 18. The method of claim 17, wherein the combination cost metric (Cost_R(x,y,d)) for a Red color channel including four images (R_A, R_B, R_C, and R_D) can be determined as follows:
  - 19. The method of claim 17, wherein the combination cost metric is determined utilizing at least one selected from the group consisting of:
    - the L1 norm of the pixel brightness values;
    - the L2 norm of the pixel brightness values; and
    - the variance in the pixel brightness values.
  - 20. The method of claim 14, wherein the weighting factors γ_G, γ_R, and γ_B are fixed.
  - 21. The method of claim 14, wherein the weighting factors γ_G, γ_R, and γ_B vary spatially with the pixel location (x, y) in the image from the reference viewpoint.
  - 22. The method of claim 21, wherein:
    - the weighting factors γ_G, γ_R, and γ_B vary based upon the estimated signal-to-noise ratio (SNR) at the pixel location (x, y) in the image from the reference viewpoint; and
    - strong SNR at the pixel location (x, y) in the image from the reference viewpoint is used to reduce the weighting applied to the Red and Blue color channels.
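The depth search recited in claims 1, 7, 9, 11, 14 and 16 can be sketched on a one-dimensional scanline. This is a minimal illustration under several assumptions that are not in the source: the total cost is taken to be a weighted sum of per-channel costs (the formula of claim 14 is not reproduced on this page), the similarity measure is an L1 difference, camera baselines and disparity hypotheses are integers, and the normalization step is omitted. All function names, the camera layout, and the synthetic textures are hypothetical.

```python
from itertools import combinations


def sample(row, x):
    """Clamp-to-edge pixel lookup."""
    return row[min(max(x, 0), len(row) - 1)]


def corresponding(views, x, d):
    """Pixels matching reference location x under disparity hypothesis d.

    views is a list of (baseline, row); the match in a view lies at x + baseline * d.
    """
    return [sample(row, x + b * d) for b, row in views]


def cost_vs_reference(views, x, d):
    """Reference channel: L1 difference of each other view against the baseline-0 view."""
    ref = next(row for b, row in views if b == 0)
    others = [(b, row) for b, row in views if b != 0]
    return sum(abs(v - ref[x]) for v in corresponding(others, x, d))


def cost_pairwise(views, x, d):
    """Non-reference channels: aggregated L1 difference over unique pixel pairs."""
    return sum(abs(p - q) for p, q in combinations(corresponding(views, x, d), 2))


def cost_volume(channels, weights, ref_channel, width, hypotheses):
    """Assumed form: weighted sum of per-channel costs for every (d, x)."""
    cv = {}
    for d in hypotheses:
        cv[d] = []
        for x in range(width):
            total = 0.0
            for name, views in channels.items():
                fn = cost_vs_reference if name == ref_channel else cost_pairwise
                total += weights[name] * fn(views, x, d)
            cv[d].append(total)
    return cv


def box_filter(costs, radius):
    """Fixed-coefficient spatial filter: mean over a window, truncated at the edges."""
    out = []
    for x in range(len(costs)):
        window = costs[max(0, x - radius): x + radius + 1]
        out.append(sum(window) / len(window))
    return out


def depth_map(cv, radius):
    """Pick, per pixel, the hypothesis with the lowest filtered cost."""
    filtered = {d: box_filter(c, radius) for d, c in cv.items()}
    width = len(next(iter(filtered.values())))
    return [min(filtered, key=lambda d: filtered[d][x]) for x in range(width)]


# Synthetic flat scene with a true disparity of 2 px per unit baseline.
TRUE_D, WIDTH = 2, 30
textures = {
    "G": [(7 * x + 3) % 11 for x in range(WIDTH)],
    "R": [(5 * x + 1) % 13 for x in range(WIDTH)],
    "B": [(3 * x + 2) % 7 for x in range(WIDTH)],
}
layout = {"G": [-2, 0, 2], "R": [-1, 3], "B": [-3, 1]}  # hypothetical camera baselines
channels = {
    c: [(b, [sample(textures[c], x - b * TRUE_D) for x in range(WIDTH)]) for b in layout[c]]
    for c in layout
}
weights = {"G": 1.0, "R": 0.5, "B": 0.5}  # fixed weighting factors, as in claim 20

cv = cost_volume(channels, weights, "G", WIDTH, range(5))
depths = depth_map(cv, radius=1)
```

In this toy scene the interior pixels recover the true disparity of 2, because every channel's corresponding pixels agree exactly at that hypothesis. Pixels near the borders rely on clamped lookups and may not. Replacing the fixed `weights` with per-pixel values would be one way to approximate the spatially varying weighting of claims 21 and 22.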

23. A method of synthesizing a higher resolution image from a light field comprising a set of lower resolution images captured from different viewpoints and within multiple color channels, the method comprising:
- estimating distances to objects within a scene from the light field comprising a set of images captured from different viewpoints and within multiple color channels using a processor configured by an image processing application, the method comprising:
  - selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints and within multiple color channels;
  - normalizing the set of images to increase the similarity of pixels from the set of images that correspond in a specific color channel from the multiple color channels;
  - determining depth estimates for pixel locations in an image from the reference viewpoint using at least a subset of the set of images, where a depth estimate for a given pixel location in the image from the reference viewpoint is determined by:
    - identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
    - in each of a plurality of color channels selected from the multiple color channels, comparing the similarity of the pixels that are identified as corresponding in the selected color channel at each of the plurality of depths; and
    - selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint;
- determining the visibility of the pixels in the set of images from the reference viewpoint by:
  - identifying corresponding pixels in the set of images using the depth estimates; and
  - determining that a pixel in a given image is not visible in the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels within the color channel to which the pixel belongs; and
- fusing pixels from the set of images using the processor configured by the image processing application based upon the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by:
  - identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the visibility information;
  - applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the current depth estimates; and
  - fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.
- Dependent claims (24):
  - 24. The method of claim 23, further comprising synthesizing an image from the reference viewpoint using the processor configured by the image processing application to perform a super-resolution process based upon the fused image from the reference viewpoint, at least a subset of the set of images captured from different viewpoints, the current depth estimates, and the visibility information.
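The visibility test and fusion steps of claim 23 can be sketched in one dimension for a single color channel. The following is an illustration only, under assumptions that are not in the source: the scene is flat, so one disparity value applies to every pixel; the photometric similarity criterion is a simple absolute-difference threshold against the nearest reference pixel; and fusion averages the samples that land in the same cell of a finer grid. The function names, the threshold, and the sample data are hypothetical.

```python
import math


def nearest(u):
    """Round half up, avoiding Python's round-half-to-even behaviour."""
    return math.floor(u + 0.5)


def visibility_mask(view_row, baseline, ref_row, disparity, threshold):
    """Flag view pixels that pass an assumed photometric criterion.

    A view pixel at x lies at x - baseline * disparity in reference coordinates;
    it is treated as visible when it is close to the nearest reference pixel.
    """
    mask = []
    for x, value in enumerate(view_row):
        u = x - baseline * disparity
        r = min(max(nearest(u), 0), len(ref_row) - 1)
        mask.append(abs(value - ref_row[r]) <= threshold)
    return mask


def fuse(views, ref_row, disparity, scale, threshold):
    """Shift visible pixels into the reference viewpoint on a grid `scale` times finer."""
    cells = [[] for _ in range(len(ref_row) * scale)]
    for baseline, row in views:
        mask = visibility_mask(row, baseline, ref_row, disparity, threshold)
        for x, value in enumerate(row):
            if not mask[x]:
                continue  # occluded or otherwise inconsistent pixel: skip it
            g = nearest((x - baseline * disparity) * scale)
            if 0 <= g < len(cells):
                cells[g].append(value)
    # Cells that received no sample stay empty (None) in this sketch.
    return [sum(c) / len(c) if c else None for c in cells]


# Reference view plus one view offset by half a pixel; its last pixel is an outlier.
ref = [10, 20, 30]
views = [(0.0, ref), (0.5, [5, 15, 90])]
fused = fuse(views, ref, disparity=1.0, scale=2, threshold=6)
print(fused)  # [10.0, 15.0, 20.0, None, 30.0, None]
```

The second view fills the half-pixel position between the first two reference samples, while its outlier value of 90 fails the threshold and is excluded. A super-resolution process such as the one in claim 24 would then refine the fused grid, including the cells left empty here.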

25. An image processing system, comprising:
- a processor;
- memory containing a set of images captured from different viewpoints and within multiple color channels and an image processing application;
- wherein the image processing application configures the processor to:
  - select a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints and within multiple color channels;
  - normalize the set of images to increase the similarity of corresponding pixels within the set of images that correspond in a specific color channel from the multiple color channels;
  - determine depth estimates for pixel locations in an image from the reference viewpoint using at least a subset of the set of images, where a depth estimate for a given pixel location in the image from the reference viewpoint is determined by:
    - identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths;
    - in each of a plurality of color channels selected from the multiple color channels, comparing the similarity of the pixels that are identified as corresponding in the selected color channel at each of the plurality of depths; and
    - selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
- Dependent claims (26-30):
  - 26. The image processing system of claim 25, wherein the image processing application further configures the processor to:
    - determine the visibility of the pixels in the set of images from the reference viewpoint by:
      - identifying corresponding pixels in the set of images using the finalized depth estimates; and
      - determining that a pixel in a given image is not visible in the reference viewpoint when the pixel fails a photometric similarity criterion determined based upon a comparison of corresponding pixels within the color channel to which the pixel belongs; and
    - fuse pixels from the set of images using the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by:
      - identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the visibility information;
      - applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the finalized depth estimates; and
      - fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.
  - 27. The image processing system of claim 25, where the image processing system is part of an array camera that further comprises a plurality of cameras having different viewpoints of a scene and the image processing application configures the processor to capture the plurality of images using the plurality of cameras.
  - 28. The image processing system of claim 27, where the plurality of cameras include a plurality of different color filters so that the captured plurality of images form multiple color channels.
  - 29. The image processing system of claim 28, wherein the multiple color channels include Red, Green and Blue color channels.
  - 30. The image processing system of claim 28, wherein each of the plurality of cameras includes a filter selected from the group consisting of one or more Blue filters, one or more Green filters, one or more Red filters, one or more shifted spectral filters, one or more near-IR filters, and one or more hyper-spectral filters.

Current Assignee: FotoNation Limited (Adeia Inc.)
Original Assignee: Pelican Imaging Corporation
Inventors: Ciurea, Florian; Venkataraman, Kartik; Molina, Gabriel; Lelescu, Dan
Primary Examiner(s): Tung, Kee M
Assistant Examiner(s): Du, Haixia
Application Number: US14/144,458
Time in Patent Office: 197 Days
Field of Search: 345/427, 382/154
US Class Current: 345/427
CPC Class Codes

G02B 27/0075   with means for altering, e....

G06T 15/20   Perspective computation

G06T 2200/21   involving computational pho...

G06T 2207/10012   Stereo images

G06T 2207/10024   Color image

G06T 2207/10052   Images from lightfield camera

G06T 7/557   from light fields, e.g. fro...

G06T 7/593   from stereo images

G06T 7/85   Stereo camera calibration

H04N 13/128   Adjusting depth or disparity

H04N 13/232   using fly-eye lenses, e.g. ...

H04N 13/243   using three or more 2D imag...

H04N 2013/0081   Depth or disparity estimati...

H04N 2013/0088   Synthesising a monoscopic i...

H04N 23/16   Optical arrangements associ...
