Depth map estimation with stereo images
Abstract
Vehicles can be equipped to operate in both autonomous and occupant-piloted modes. In either mode, an array of sensors, including stereo cameras and 3D sensors, can be used to pilot the vehicle. Stereo cameras and 3D sensors can also assist occupants while piloting vehicles. Deep convolutional neural networks can determine estimated depth maps from stereo images of scenes in real time for vehicles in autonomous and occupant-piloted modes.
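Once stereo disparity is solved, depth follows from the standard pinhole-stereo relation depth = focal_length × baseline / disparity. A minimal numpy sketch of that conversion; the focal length and baseline values are illustrative, not taken from the patent:

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px=700.0, baseline_m=0.54):
    """Convert a disparity map (pixels) to a depth map (meters) using
    depth = focal_length * baseline / disparity (pinhole stereo model)."""
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0          # zero disparity -> point at infinity
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

d = np.array([[70.0, 35.0], [0.0, 7.0]])
print(disparity_to_depth(d))       # 70 px -> 700 * 0.54 / 70 = 5.4 m
```

Larger disparities map to nearer points, which is why near obstacles are the easiest to range with a stereo pair.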
38 Citations
14 Claims
1. A method, comprising:

inputting and processing first and second stereo images captured by stereo cameras at a same time with one or more deep neural network maximum pooling layers;

processing the stereo images with one or more deep neural network upsampling layers;

determining one or more three-dimensional depth maps from the stereo images by solving for stereo disparity in the first and second stereo images with the one or more deep neural network maximum pooling layers and upsampling layers; and

piloting a vehicle based on the one or more depth maps;

wherein the deep neural network maximum pooling layers and the deep neural network upsampling layers are based on cross-correlating a first kernel from the first stereo image with a second kernel from the second stereo image to determine stereo disparity and thereby the one or more depth maps;

wherein the deep neural network maximum pooling layers and the deep neural network upsampling layers are trained with training stereo images, associated ground-truth depth maps, and LIDAR data.

View Dependent Claims (2, 3, 4, 5, 6, 7)
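Claim 1's cross-correlation of a kernel from the first image with a kernel from the second image is, in its classical form, block matching: slide a patch horizontally across the other image and keep the shift (disparity) with the highest correlation. A minimal numpy sketch; it uses normalized cross-correlation, which is an assumption on my part, since the patent text does not specify a normalization:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized kernels."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def block_match_disparity(left, right, kernel=3, max_disp=8):
    """For each left-image pixel, cross-correlate its kernel with
    horizontally shifted kernels from the right image and keep the
    shift with the highest correlation as the disparity."""
    h, w = left.shape
    r = kernel // 2
    disp = np.zeros((h, w), dtype=np.int64)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1]
            best_score, best_d = -np.inf, 0
            for d in range(min(max_disp, x - r) + 1):
                cand = right[y - r:y + r + 1, x - d - r:x - d + r + 1]
                score = ncc(patch, cand)
                if score > best_score:
                    best_score, best_d = score, d
            disp[y, x] = best_d
    return disp

# Synthetic check: the right image is the left image shifted 3 px left,
# so the true disparity inside the overlap region is 3 everywhere.
rng = np.random.default_rng(0)
left = rng.random((16, 32))
right = np.zeros_like(left)
right[:, :-3] = left[:, 3:]
print(block_match_disparity(left, right)[8, 6:26])
```

The claimed method replaces this exhaustive per-pixel search with pooling and upsampling layers that learn the correlation, which is what makes real-time operation on a vehicle plausible.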
8. An apparatus, comprising:

a processor; and

a memory, the memory including instructions to be executed by the processor to:

input and process first and second stereo images captured by stereo cameras at a same time with one or more deep neural network maximum pooling layers;

process the stereo images with one or more deep neural network upsampling layers;

determine one or more depth maps by solving for stereo disparity in the first and second stereo images with the one or more deep neural network maximum pooling layers and upsampling layers; and

pilot a vehicle based on the one or more depth maps;

wherein the deep neural network maximum pooling layers and the deep neural network upsampling layers are based on cross-correlating a first kernel from the first stereo image with a second kernel from the second stereo image to determine stereo disparity and thereby the one or more depth maps;

wherein the deep neural network maximum pooling layers and the deep neural network upsampling layers are trained with training stereo images, associated ground-truth depth maps, and LIDAR data.

View Dependent Claims (9, 10, 11, 12, 13, 14)
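Both independent claims build the network from maximum pooling (downsampling) and upsampling layers, the encoder/decoder pattern common in dense-prediction networks. As a minimal illustration of the two operations themselves, not the patent's actual network, here is a numpy sketch:

```python
import numpy as np

def max_pool2x2(x):
    """2x2 maximum pooling with stride 2: halves each spatial dimension,
    keeping the largest value in every 2x2 block."""
    h, w = x.shape
    x = x[:h - h % 2, :w - w % 2]               # drop odd remainder rows/cols
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2x2(x):
    """Nearest-neighbour 2x upsampling: doubles each spatial dimension
    by repeating every value into a 2x2 block."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

a = np.arange(16.0).reshape(4, 4)
p = max_pool2x2(a)      # 4x4 -> 2x2; p == [[5., 7.], [13., 15.]]
u = upsample2x2(p)      # 2x2 -> 4x4, restoring the original resolution
print(p)
print(u.shape)
```

Pooling layers shrink the feature maps while aggregating context; the upsampling layers then restore full image resolution so the network can emit a per-pixel depth map matching the input stereo images.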
Specification