Method for generating spatial-temporally consistent depth map sequences based on convolution neural networks

Abstract
A method for generating spatial-temporally consistent depth map sequences based on convolutional neural networks, for the 2D-to-3D conversion of film and television works, includes steps of: 1) collecting a training set, wherein each training sample thereof includes a sequence of continuous RGB images and a corresponding depth map sequence; 2) processing each image sequence in the training set with spatial-temporal consistency superpixel segmentation, and establishing a spatial similarity matrix and a temporal similarity matrix; 3) establishing the convolution neural network, including a single-superpixel depth regression network and a spatial-temporal consistency conditional random field loss layer; 4) training the convolution neural network; and 5) recovering the depth maps of an RGB image sequence of unknown depth through forward propagation with the trained convolution neural network. The method avoids the strong dependence of clue-based depth recovery methods on scenario assumptions, as well as the inter-frame discontinuity between depth maps generated by conventional neural networks.
5 Claims
 1. (canceled)
2. A method for generating spatial-temporally consistent depth map sequences based on convolution neural networks, comprising steps of:
1) collecting a training set, wherein each training sample of the training set comprises a continuous RGB (red, green, blue) image sequence of m frames, and a corresponding depth map sequence;
2) processing each image sequence in the training set with spatial-temporal consistency superpixel segmentation, and establishing a spatial similarity matrix S^{(s)} and a temporal similarity matrix S^{(t)};
3) building a convolution neural network structure, wherein the convolution neural network comprises a single-superpixel depth regression network with a parameter W, and a spatial-temporal consistency conditional random field loss layer with a parameter α;
4) training the convolution neural network established in the step 3) with the continuous RGB image sequence and the corresponding depth map sequence in the training set, so as to obtain the parameter W and the parameter α; and
5) recovering a depth map sequence of a depth-unknown RGB image sequence through forward propagation with the trained convolution neural network;
wherein the step 2) specifically comprises steps of:
(2.1) processing the continuous RGB image sequence in the training set with the spatial-temporal consistency superpixel segmentation, wherein an input sequence is marked as I=[I_{1}, . . . , I_{m}], where I_{t} is the t-th frame of the m frames in total; the m frames are respectively divided into n_{1}, . . . , n_{m} superpixels by the spatial-temporal consistency superpixel segmentation, while a corresponding relation between all superpixels in a later frame and the superpixels corresponding to a same object in a former frame is generated; the whole image sequence comprises n=Σ_{t=1}^{m} n_{t} superpixels; marking a real depth at a gravity center of each superpixel p as d_{p}, and defining a ground-truth depth vector of the n superpixels as d=[d_{1}; . . . ; d_{n}];
(2.2) establishing the spatial similarity matrix S^{(s)} of the n superpixels, wherein S^{(s)} is an n×n matrix; S_{pq}^{(s)} represents a similarity relationship of a superpixel p and a superpixel q in one frame, where:

S_{pq}^{(s)}=exp(−∥c_{p}−c_{q}∥^{2}/γ), if the superpixels p and q are adjacent in one frame; S_{pq}^{(s)}=0, otherwise.

Dependent claims: 3, 4, 5.
Specification
This is a U.S. National Stage application under 35 U.S.C. § 371 of the International Application PCT/CN2016/112811, filed Dec. 29, 2016.
The present invention relates to the field of stereoscopic videos in computer vision, and more particularly to a method for generating spatial-temporally consistent depth map sequences based on convolution neural networks.
The basic principle of stereoscopic video is to superimpose two images with horizontal parallax; viewers view them with the left and right eyes respectively through stereoscopic glasses, which generates stereoscopic perception. Stereoscopic videos render an immersive three-dimensional view and are deeply welcomed by consumers. However, as the popularity of 3D video hardware continues to rise, a shortage of 3D video content has followed. Direct shooting with a 3D camera is expensive and difficult to post-process, and is usually used only in large-budget movies. Therefore, 2D-to-3D conversion technology for film and television works is an effective way to solve the shortage of film sources; it not only greatly expands the subject matter and the number of three-dimensional films, but also allows some classic films and television works to return to the screen.
Since the left and right parallax in a stereoscopic video is directly related to the depth of each pixel, obtaining the depth map corresponding to each frame of the video is the key to 2D-to-3D conversion technology. Depth maps can be obtained by manually annotating depth values for each frame of the video, but at a very high cost. There are also semi-automatic depth map generation methods, wherein the depth maps of some key frames in the video are drawn manually, and a computer propagates these depth maps to adjacent frames by propagation algorithms. Although these methods save time to some extent, they still require heavy manual operation when dealing with large-scale 2D-to-3D conversion of films and television works.
In contrast, fully automatic depth recovery methods save the most labor cost. Some algorithms recover depth maps with specific rules by using depth cues such as motion, focus, occlusion, or shadow, but are usually applicable only to specific scenes. For example, the structure-from-motion (SfM) approach is able to recover the depth of a static scene shot by a moving camera, according to the cue that nearer objects display larger relative movement between consecutive frames than objects farther away; however, this type of method is not applicable when the object moves or the camera is still. Focus-based depth recovery methods can restore the depth of images with a shallow depth of field, but are less effective in the case of a large depth of field. Movies and television works usually contain a variety of scenes, so such cue-based depth recovery methods are difficult to apply universally.
A convolutional neural network is a kind of deep neural network especially suited to images. It consists of basic units such as convolution layers, activation layers, pooling layers and loss layers, which can simulate complex functions from an image input x to a specific output y; this approach dominates many computer vision problems such as image classification and image segmentation. In recent years, some methods have adopted convolution neural networks for the depth estimation problem, using a large amount of data to learn the mapping from RGB image input to depth map output. Depth recovery based on convolution neural networks does not depend on any scene assumption, and thus has good universality and high recovery accuracy; it therefore has great potential in the 2D-to-3D conversion of video works. However, conventional methods are all based on single-image optimization in training, ignoring the continuity between frames. If such methods are used to restore the depth of an image sequence, the depth maps of adjacent frames will be significantly inconsistent, resulting in flashing of the synthesized virtual views, which seriously affects the user perception. In addition, inter-frame continuity also provides important clues for depth recovery, which are simply ignored in conventional methods.
For overcoming the defects of conventional technologies, an object of the present invention is to provide a method for generating spatial-temporally consistent depth map sequences based on a convolution neural network, wherein the continuity of the RGB images and the depth maps in the time domain is introduced into the convolution neural network, and multi-frame images are jointly optimized during training, so as to generate temporally continuous depth maps and improve the accuracy of the depth recovery.
Accordingly, in order to accomplish the above object, the present invention provides a method for generating spatial-temporally consistent depth map sequences based on convolution neural networks, comprising steps of:
1) collecting a training set, wherein each training sample of the training set comprises a continuous RGB (red, green, blue) image sequence of m frames, and a corresponding depth map sequence;
2) processing each image sequence in the training set with spatial-temporal consistency superpixel segmentation, and establishing a spatial similarity matrix S^{(s)} and a temporal similarity matrix S^{(t)};
3) building a convolution neural network structure, wherein the convolution neural network comprises a single-superpixel depth regression network with a parameter W, and a spatial-temporal consistency conditional random field loss layer with a parameter α;
4) training the convolution neural network established in the step 3) with the continuous RGB image sequences and the corresponding depth map sequences in the training set, so as to obtain the parameter W and the parameter α; and
5) recovering a depth map sequence of an RGB image sequence with unknown depth through forward propagation of the trained convolution neural network.
Preferably, the step 2) specifically comprises steps of:
(2.1) processing the continuous RGB image sequence in the training set with the spatial-temporal consistency superpixel segmentation, wherein an input sequence is marked as I=[I_{1}, . . . , I_{m}], where I_{t} is the t-th frame of the m frames in total; the m frames are respectively divided into n_{1}, . . . , n_{m} superpixels by the spatial-temporal consistency superpixel segmentation, while a corresponding relation between all superpixels in a later frame and the superpixels corresponding to a same object in a former frame is generated; the whole image sequence comprises n=Σ_{t=1}^{m} n_{t} superpixels; marking the real depth at the gravity center of each superpixel p as d_{p}, and defining a ground-truth depth vector of the n superpixels as d=[d_{1}; . . . ; d_{n}];
(2.2) establishing the spatial similarity matrix S^{(s)} of the n superpixels, wherein S^{(s)} is an n×n matrix; S_{pq}^{(s)} represents the similarity relationship of a superpixel p and a superpixel q in one frame:

S_{pq}^{(s)}=exp(−∥c_{p}−c_{q}∥^{2}/γ), if the superpixels p and q are adjacent in one frame; S_{pq}^{(s)}=0, otherwise;

wherein c_{p} and c_{q} are color histogram features of the superpixel p and the superpixel q, and γ is a manually determined parameter which is set to the median of ∥c_{p}−c_{q}∥^{2} of adjacent superpixels; and
(2.3) establishing the temporal similarity matrix S^{(t)} of the n superpixels, wherein S^{(t)} is an n×n matrix; S_{pq}^{(t)} represents the similarity relation of a superpixel p and a superpixel q in different frames:

S_{pq}^{(t)}=1, if the superpixels p and q correspond to a same object in adjacent frames; S_{pq}^{(t)}=0, otherwise;

wherein the corresponding relation between the superpixels of adjacent frames is obtained by the spatial-temporal consistency superpixel segmentation of the step (2.1).
Preferably, in the step 3), the convolution neural network comprises the single-superpixel depth regression network and the spatial-temporal consistency conditional random field loss layer; wherein the step 3) specifically comprises steps of:
(3.1) building the single-superpixel depth regression network, which comprises the first 31 layers of a VGG16 network, a superpixel pooling layer and three fully connected layers, wherein the superpixel pooling layer performs average pooling within the spatial extent of each superpixel; the input of the network is the continuous RGB images of the m frames, and the output of the network is an n-dimensional vector z=[z_{1}, . . . , z_{n}], in which the p-th element z_{p} is the estimated depth value, without considering any constraint, of the superpixel p of the continuous RGB image sequence after the spatial-temporal consistency superpixel segmentation; the convolution neural network parameter to be learned is W; and
(3.2) using the output z=[z_{1}, . . . , z_{n}] of the single-superpixel depth regression network obtained in the step (3.1), the real depth vector d=[d_{1}; . . . ; d_{n}] of the superpixels obtained in the step (2.1), the spatial similarity matrix S^{(s)} obtained in the step (2.2), and the temporal similarity matrix S^{(t)} obtained in the step (2.3) as the input of the spatial-temporal consistency conditional random field loss layer; wherein the conditional probability function of the spatial-temporal consistency conditional random field is:

P(d|I)=exp(−E(d,I))/Z(I)

wherein Z(I) is a normalization term, and the energy function E(d,I) is defined as:

E(d,I)=Σ_{p∈N}(d_{p}−z_{p})^{2}+Σ_{(p,q)∈S}α^{(s)}S_{pq}^{(s)}(d_{p}−d_{q})^{2}+Σ_{(p,q)∈T}α^{(t)}S_{pq}^{(t)}(d_{p}−d_{q})^{2}

wherein the first term Σ_{p∈N}(d_{p}−z_{p})^{2} of the energy function refers to the difference between the estimated value and the real value of a single superpixel; the second term Σ_{(p,q)∈S}α^{(s)}S_{pq}^{(s)}(d_{p}−d_{q})^{2} is a spatial consistency constraint, which means the depths will be similar if the superpixels p and q are adjacent in one frame with similar colors (i.e., S_{pq}^{(s)} is larger); the third term Σ_{(p,q)∈T}α^{(t)}S_{pq}^{(t)}(d_{p}−d_{q})^{2} is a temporal consistency constraint, which means the depths will be similar if the superpixels p and q refer to a same object in adjacent frames (i.e., S_{pq}^{(t)}=1); the matrix form of the energy function is:

E(d,I)=d^{T}Ld−2z^{T}d+z^{T}z

wherein:

L=I_{n}+D−M
M=α^{(s)}S^{(s)}+α^{(t)}S^{(t)}

wherein S^{(s)} is the spatial similarity matrix obtained in the step (2.2) and S^{(t)} is the temporal similarity matrix obtained in the step (2.3); α^{(s)} and α^{(t)} are two parameters to be learned; I_{n} is an n×n unit matrix; D is a diagonal matrix with D_{pp}=Σ_{q}M_{pq};
wherein, since the energy function is quadratic in d, the conditional probability has the closed form:

P(d|I)=(|L|^{1/2}/π^{n/2})exp(−d^{T}Ld+2z^{T}d−z^{T}L^{−1}z)

wherein L^{−1} is the inverse matrix of L, and |L| is the determinant of the matrix L;
therefore, the loss function is defined as the negative logarithm of the conditional probability function:

J=−log P(d|I)=d^{T}Ld−2z^{T}d+z^{T}L^{−1}z−(1/2)log|L|+(n/2)log π
Preferably, in the step 4), training the convolution neural network specifically comprises steps of:
(4.1) optimizing the parameters W, α^{(s)} and α^{(t)} with stochastic gradient descent, wherein for each iteration, the parameters are updated as:

W←W−lr·∂J/∂W, α^{(s)}←α^{(s)}−lr·∂J/∂α^{(s)}, α^{(t)}←α^{(t)}−lr·∂J/∂α^{(t)}

wherein lr is the learning rate;
(4.2) calculating the partial derivative of the loss function J with respect to the parameter W with:

∂J/∂W=(∂J/∂z)(∂z/∂W), where ∂J/∂z=2(L^{−1}z−d)

wherein ∂z/∂W is calculated with backward propagation of the convolution neural network layer by layer; and
(4.3) respectively calculating the partial derivatives of the loss function J with respect to the parameters α^{(s)} and α^{(t)} as:

∂J/∂α^{(s)}=d^{T}A^{(s)}d−z^{T}L^{−1}A^{(s)}L^{−1}z−(1/2)Tr(L^{−1}A^{(s)})
∂J/∂α^{(t)}=d^{T}A^{(t)}d−z^{T}L^{−1}A^{(t)}L^{−1}z−(1/2)Tr(L^{−1}A^{(t)})

wherein Tr(⋅) represents the trace of a matrix; A^{(s)} is the partial derivative of the matrix L with respect to α^{(s)} and A^{(t)} is the partial derivative of the matrix L with respect to α^{(t)}, which are calculated with:

A_{pq}^{(s)}=−S_{pq}^{(s)}+δ(p=q)Σ_{q}S_{pq}^{(s)}
A_{pq}^{(t)}=−S_{pq}^{(t)}+δ(p=q)Σ_{q}S_{pq}^{(t)}

wherein δ(p=q) equals 1 when p=q, and 0 otherwise.
Preferably, in the step 5), recovering the depth of the depth-unknown RGB image sequence specifically comprises steps of:
(5.1) processing the RGB image sequence with the spatial-temporal consistency superpixel segmentation, and calculating the spatial similarity matrix S^{(s)} and the temporal similarity matrix S^{(t)};
(5.2) applying forward propagation to the RGB image sequence with the trained convolution neural network, so as to obtain the single-superpixel network output z;
(5.3) calculating the depth output d̂=[d̂_{1}; . . . ; d̂_{n}] with the spatial-temporal consistency constraint by:

d̂=L^{−1}z

wherein the matrix L is calculated as in the step (3.2); d̂_{p} represents the estimated depth value of a superpixel p in the RGB image sequence; and
(5.4) applying d̂_{p} to the corresponding position of the corresponding frame of the superpixel p for obtaining the depth maps of the m frames.
Beneficial effects of the present invention are as follows.
First, in contrast to clue-based depth recovery methods, the present invention uses convolution neural networks to learn the function mapping from RGB images to depth maps, which is independent of particular assumptions about the scene.
Second, compared with single-frame-optimizing convolutional neural network depth recovery methods, the present invention adds the spatial-temporal consistency constraint and jointly optimizes multi-frame images by constructing the spatial-temporal consistency conditional random field loss layer, which is able to output spatial-temporally consistent depth maps, so as to avoid inter-frame jumps of the depth map.
Third, compared with conventional depth recovery methods based on convolution neural networks, the present invention adds the spatial-temporal consistency constraint, so as to improve the accuracy of the depth recovery.
The present invention has been compared with conventional methods, such as that of Eigen, David, Christian Puhrsch, and Rob Fergus, "Depth map prediction from a single image using a multi-scale deep network," Advances in Neural Information Processing Systems, 2014, through the public data set NYU Depth v2 and the inventors' data set LYB 3DTV. Results show that the method of the present invention can significantly improve the time-domain continuity of the recovered depth maps, and thereby improve the accuracy of the depth estimation.
Referring to the drawings, the present invention will be further illustrated.
According to a preferred embodiment, the method comprises steps of:
1) collecting a training set, wherein each training sample of the training set comprises a continuous RGB (red, green, blue) image sequence of m frames, and a corresponding depth map sequence;
2) using the method presented in Chang, Jason, et al., "A video representation using temporal superpixels," CVPR 2013, to process each image sequence in the training set with spatial-temporal consistency superpixel segmentation, and establishing a spatial similarity matrix S^{(s)} and a temporal similarity matrix S^{(t)};
3) building a convolution neural network structure, wherein the convolution neural network comprises a single-superpixel depth regression network with a parameter W, and a spatial-temporal consistency conditional random field loss layer with a parameter α;
4) training the convolution neural network established in the step 3) with the continuous RGB image sequences and the corresponding depth map sequences in the training set, so as to obtain the parameter W and the parameter α; and
5) recovering a depth map sequence of an RGB image sequence with unknown depth through forward propagation of the trained convolution neural network.
According to the embodiment, the step 2) specifically comprises steps of:
(2.1) processing the continuous RGB image sequence in the training set with the spatial-temporal consistency superpixel segmentation, wherein an input sequence is marked as I=[I_{1}, . . . , I_{m}], where I_{t} is the t-th frame of the m frames in total; the m frames are respectively divided into n_{1}, . . . , n_{m} superpixels by the spatial-temporal consistency superpixel segmentation, while a corresponding relation between all superpixels in a later frame and the superpixels corresponding to a same object in a former frame is generated; the whole image sequence comprises n=Σ_{t=1}^{m} n_{t} superpixels; marking the real depth at the gravity center of each superpixel p as d_{p}, and defining a ground-truth depth vector of the n superpixels as d=[d_{1}; . . . ; d_{n}];
(2.2) establishing the spatial similarity matrix S^{(s)} of the n superpixels, wherein S^{(s)} is an n×n matrix; S_{pq}^{(s)} represents the similarity relationship of a superpixel p and a superpixel q in one frame:

S_{pq}^{(s)}=exp(−∥c_{p}−c_{q}∥^{2}/γ), if the superpixels p and q are adjacent in one frame; S_{pq}^{(s)}=0, otherwise;

wherein c_{p} and c_{q} are color histogram features of the superpixel p and the superpixel q, and γ is a manually determined parameter which is set to the median of ∥c_{p}−c_{q}∥^{2} of adjacent superpixels; and
(2.3) establishing the temporal similarity matrix S^{(t)} of the n superpixels, wherein S^{(t)} is an n×n matrix; S_{pq}^{(t)} represents the similarity relation of a superpixel p and a superpixel q in different frames:

S_{pq}^{(t)}=1, if the superpixels p and q correspond to a same object in adjacent frames; S_{pq}^{(t)}=0, otherwise;

wherein the corresponding relation between the superpixels of adjacent frames is obtained by the spatial-temporal consistency superpixel segmentation of the step (2.1).
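The similarity matrices of the steps (2.2) and (2.3) can be sketched as follows. This is an illustrative NumPy sketch rather than part of the claimed method; the list-based `adjacency` and `correspondences` inputs are hypothetical simplifications of what the superpixel segmentation of the step (2.1) would produce.

```python
import numpy as np

def spatial_similarity(colors, adjacency):
    """S^(s): exp(-||c_p - c_q||^2 / gamma) for superpixel pairs that are
    adjacent in the same frame, 0 elsewhere; gamma is set to the median
    squared color distance over adjacent pairs, as in step (2.2)."""
    n = len(colors)
    sq_dists = [float(np.sum((colors[p] - colors[q]) ** 2)) for p, q in adjacency]
    gamma = np.median(sq_dists)
    S = np.zeros((n, n))
    for (p, q), sq in zip(adjacency, sq_dists):
        S[p, q] = S[q, p] = np.exp(-sq / gamma)
    return S

def temporal_similarity(n, correspondences):
    """S^(t): 1 for superpixel pairs covering the same object in adjacent
    frames, 0 elsewhere, as in step (2.3)."""
    S = np.zeros((n, n))
    for p, q in correspondences:
        S[p, q] = S[q, p] = 1.0
    return S
```

Here `colors` would hold the color histogram feature c_{p} of each superpixel, and `adjacency`/`correspondences` the pair lists produced by the segmentation; both matrices are symmetric by construction.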
Preferably, in the step 3), the convolution neural network comprises the single-superpixel depth regression network and the spatial-temporal consistency conditional random field loss layer; wherein the step 3) specifically comprises steps of:
(3.1) building the single-superpixel depth regression network, which comprises the first 31 layers of a VGG16 network, a superpixel pooling layer and three fully connected layers, wherein the superpixel pooling layer performs average pooling within the spatial extent of each superpixel; the input of the network is the continuous RGB images of the m frames, and the output of the network is an n-dimensional vector z=[z_{1}, . . . , z_{n}], in which the p-th element z_{p} is the estimated depth value, without considering any constraint, of the superpixel p of the continuous RGB image sequence after the spatial-temporal consistency superpixel segmentation; the convolution neural network parameter to be learned is W; and
(3.2) using the output z=[z_{1}, . . . , z_{n}] of the single-superpixel depth regression network obtained in the step (3.1), the real depth vector d=[d_{1}; . . . ; d_{n}] of the superpixels obtained in the step (2.1), the spatial similarity matrix S^{(s)} obtained in the step (2.2), and the temporal similarity matrix S^{(t)} obtained in the step (2.3) as the input of the spatial-temporal consistency conditional random field loss layer; wherein the conditional probability function of the spatial-temporal consistency conditional random field is:

P(d|I)=exp(−E(d,I))/Z(I)

wherein Z(I) is a normalization term, and the energy function E(d,I) is defined as:

E(d,I)=Σ_{p∈N}(d_{p}−z_{p})^{2}+Σ_{(p,q)∈S}α^{(s)}S_{pq}^{(s)}(d_{p}−d_{q})^{2}+Σ_{(p,q)∈T}α^{(t)}S_{pq}^{(t)}(d_{p}−d_{q})^{2}

wherein the first term Σ_{p∈N}(d_{p}−z_{p})^{2} of the energy function refers to the difference between the estimated value and the real value of a single superpixel; the second term Σ_{(p,q)∈S}α^{(s)}S_{pq}^{(s)}(d_{p}−d_{q})^{2} is a spatial consistency constraint, which means the depths will be similar if the superpixels p and q are adjacent in one frame with similar colors (i.e., S_{pq}^{(s)} is larger); the third term Σ_{(p,q)∈T}α^{(t)}S_{pq}^{(t)}(d_{p}−d_{q})^{2} is a temporal consistency constraint, which means the depths will be similar if the superpixels p and q refer to a same object in adjacent frames (i.e., S_{pq}^{(t)}=1); the matrix form of the energy function is:

E(d,I)=d^{T}Ld−2z^{T}d+z^{T}z

wherein:

L=I_{n}+D−M
M=α^{(s)}S^{(s)}+α^{(t)}S^{(t)}

wherein S^{(s)} is the spatial similarity matrix obtained in the step (2.2) and S^{(t)} is the temporal similarity matrix obtained in the step (2.3); α^{(s)} and α^{(t)} are two parameters to be learned; I_{n} is an n×n unit matrix; D is a diagonal matrix with D_{pp}=Σ_{q}M_{pq};
wherein, since the energy function is quadratic in d, the conditional probability has the closed form:

P(d|I)=(|L|^{1/2}/π^{n/2})exp(−d^{T}Ld+2z^{T}d−z^{T}L^{−1}z)

wherein L^{−1} is the inverse matrix of L, and |L| is the determinant of the matrix L;
therefore, the loss function is defined as the negative logarithm of the conditional probability function:

J=−log P(d|I)=d^{T}Ld−2z^{T}d+z^{T}L^{−1}z−(1/2)log|L|+(n/2)log π
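The matrix L and the loss layer of the step (3.2) can be sketched as follows. This is an illustrative dense NumPy sketch, not part of the claimed method; it assumes the similarity matrices are already built, and it uses the standard Gaussian-CRF negative log-likelihood J = d^T L d − 2 z^T d + z^T L^{-1} z − (1/2) log|L| + (n/2) log π.

```python
import numpy as np

def build_L(S_s, S_t, alpha_s, alpha_t):
    """L = I + D - M, with M = a_s * S^(s) + a_t * S^(t) and D the
    diagonal matrix holding the row sums of M."""
    M = alpha_s * S_s + alpha_t * S_t
    D = np.diag(M.sum(axis=1))
    return np.eye(S_s.shape[0]) + D - M

def crf_loss(L, z, d):
    """Negative log-likelihood of the spatial-temporal consistency CRF."""
    n = len(z)
    Linv_z = np.linalg.solve(L, z)      # L^{-1} z, without forming the inverse
    _, logdet = np.linalg.slogdet(L)    # log|L|
    return (d @ L @ d - 2 * z @ d + z @ Linv_z
            - 0.5 * logdet + 0.5 * n * np.log(np.pi))
```

Since M is non-negative and symmetric, D − M is positive semi-definite, so L = I + D − M is positive definite and the linear solve and log-determinant are well defined.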
Preferably, in the step 4), training the convolution neural network specifically comprises steps of:
(4.1) optimizing the parameters W, α^{(s)} and α^{(t)} with stochastic gradient descent, wherein for each iteration, the parameters are updated as:

W←W−lr·∂J/∂W, α^{(s)}←α^{(s)}−lr·∂J/∂α^{(s)}, α^{(t)}←α^{(t)}−lr·∂J/∂α^{(t)}

wherein lr is the learning rate;
(4.2) calculating the partial derivative of the loss function J with respect to the parameter W with:

∂J/∂W=(∂J/∂z)(∂z/∂W), where ∂J/∂z=2(L^{−1}z−d)

wherein ∂z/∂W is calculated with backward propagation of the convolution neural network layer by layer; and
(4.3) respectively calculating the partial derivatives of the loss function J with respect to the parameters α^{(s)} and α^{(t)} as:

∂J/∂α^{(s)}=d^{T}A^{(s)}d−z^{T}L^{−1}A^{(s)}L^{−1}z−(1/2)Tr(L^{−1}A^{(s)})
∂J/∂α^{(t)}=d^{T}A^{(t)}d−z^{T}L^{−1}A^{(t)}L^{−1}z−(1/2)Tr(L^{−1}A^{(t)})

wherein Tr(⋅) represents the trace of a matrix; A^{(s)} is the partial derivative of the matrix L with respect to α^{(s)} and A^{(t)} is the partial derivative of the matrix L with respect to α^{(t)}, which are calculated with:

A_{pq}^{(s)}=−S_{pq}^{(s)}+δ(p=q)Σ_{q}S_{pq}^{(s)}
A_{pq}^{(t)}=−S_{pq}^{(t)}+δ(p=q)Σ_{q}S_{pq}^{(t)}

wherein δ(p=q) equals 1 when p=q, and 0 otherwise.
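The analytic gradient of the step (4.3) can be verified against a finite-difference approximation of the loss. The sketch below is illustrative only (it assumes the Gaussian-CRF loss stated above and recomputes L inside each function for clarity); it is not the patented implementation.

```python
import numpy as np

def loss_J(S_s, S_t, a_s, a_t, z, d):
    """J = d^T L d - 2 z^T d + z^T L^{-1} z - 1/2 log|L| + n/2 log(pi)."""
    n = len(z)
    M = a_s * S_s + a_t * S_t
    L = np.eye(n) + np.diag(M.sum(axis=1)) - M
    _, logdet = np.linalg.slogdet(L)
    return (d @ L @ d - 2 * z @ d + z @ np.linalg.solve(L, z)
            - 0.5 * logdet + 0.5 * n * np.log(np.pi))

def grad_alpha_s(S_s, S_t, a_s, a_t, z, d):
    """dJ/da^(s) = d^T A d - (L^{-1}z)^T A (L^{-1}z) - 1/2 Tr(L^{-1} A),
    with A = dL/da^(s) = diag(row sums of S^(s)) - S^(s)."""
    n = len(z)
    M = a_s * S_s + a_t * S_t
    L = np.eye(n) + np.diag(M.sum(axis=1)) - M
    A = np.diag(S_s.sum(axis=1)) - S_s
    u = np.linalg.solve(L, z)
    return d @ A @ d - u @ A @ u - 0.5 * np.trace(np.linalg.solve(L, A))
```

A central difference (J(α+h) − J(α−h))/(2h) should match the analytic value closely; such a check is a common way to validate hand-derived CRF gradients before training.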
Preferably, in the step 5), recovering the depth of the depth-unknown RGB image sequence specifically comprises steps of:
(5.1) processing the RGB image sequence with the spatial-temporal consistency superpixel segmentation, and calculating the spatial similarity matrix S^{(s)} and the temporal similarity matrix S^{(t)};
(5.2) applying forward propagation to the RGB image sequence with the trained convolution neural network, so as to obtain the single-superpixel network output z;
(5.3) calculating the depth output d̂=[d̂_{1}; . . . ; d̂_{n}] with the spatial-temporal consistency constraint by:

d̂=L^{−1}z

wherein the matrix L is calculated as in the step (3.2); d̂_{p} represents the estimated depth value of a superpixel p in the RGB image sequence; and
(5.4) applying d̂_{p} to the corresponding position of the corresponding frame of the superpixel p for obtaining the depth maps of the m frames.
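The inference step (5.3) amounts to one linear solve. The following is an illustrative NumPy sketch (not the patented implementation); it assumes the similarity matrices and the learned parameters α^{(s)}, α^{(t)} are available.

```python
import numpy as np

def infer_depths(S_s, S_t, a_s, a_t, z):
    """Step (5.3): d_hat = L^{-1} z, computed as a linear-system solve
    rather than by forming the matrix inverse explicitly."""
    n = len(z)
    M = a_s * S_s + a_t * S_t
    L = np.eye(n) + np.diag(M.sum(axis=1)) - M
    return np.linalg.solve(L, z)
```

When both α parameters are zero, L reduces to the identity and the output equals the network prediction z; larger α^{(s)} pulls the depths of similar adjacent superpixels toward each other, which is exactly the smoothing effect the constraint is meant to provide.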
The present invention has been compared with several existing methods on the public data set NYU Depth v2 and the inventors' data set LYB 3DTV. In detail, NYU Depth v2 comprises 795 training scenes and 654 testing scenes, wherein each scene comprises 30 frames of continuous RGB images and their corresponding depth maps. LYB 3DTV comprises scenes from the TV series "Nirvana in Fire", wherein 5124 frames from 60 scenes and their manually annotated depth maps are used as a training set, and 1278 frames from 20 scenes and their manually annotated depth maps are used as a testing set. The depth recovery accuracy of the method of the present invention is compared with those of the following methods:
1. Depth Transfer: Karsch, Kevin, Ce Liu, and Sing Bing Kang. "Depth Transfer: Depth extraction from video using non-parametric sampling." IEEE Transactions on Pattern Analysis and Machine Intelligence 36.11 (2014): 2144-2158.
2. Discrete-Continuous CRF: Liu, Miaomiao, Mathieu Salzmann, and Xuming He. "Discrete-continuous depth estimation from a single image." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
3. Multi-scale CNN: Eigen, David, Christian Puhrsch, and Rob Fergus. "Depth map prediction from a single image using a multi-scale deep network." Advances in Neural Information Processing Systems, 2014.
4. 2D-DCNF: Liu, Fayao, et al. "Learning depth from single monocular images using deep convolutional neural fields." IEEE Transactions on Pattern Analysis and Machine Intelligence.
The results show that the accuracy of the method of the present invention is improved relative to the comparative methods, while the inter-frame jump during depth map recovery is significantly reduced.