Method of converting 2D video to 3D video using machine learning

US 9,609,307 B1
Filed: 12/14/2015
Issued: 03/28/2017
Est. Priority Date: 09/17/2015
Status: Active Grant

6 Assignments

0 Petitions

Abstract

Machine learning method that learns to convert 2D video to 3D video from a set of training examples. Uses machine learning to perform any or all of the 2D to 3D conversion steps of identifying and locating objects, masking objects, modeling object depth, generating stereoscopic image pairs, and filling gaps created by pixel displacement for depth effects. Training examples comprise inputs and outputs for the conversion steps. The machine learning system generates transformation functions that generate the outputs from the inputs; these functions may then be used on new 2D videos to automate or semi-automate the conversion process. Operator input may be used to augment the results of the machine learning system. Illustrative representations for conversion data in the training examples include object tags to identify objects and locate their features, Bézier curves to mask object regions, and point clouds or geometric shapes to model object depth.
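
The pixel displacement and gap-filling steps named in the abstract can be illustrated with a minimal Python sketch on a single scanline. Everything in it is an assumption made for illustration and is not taken from the patent: the function names `make_right_view` and `fill_gaps`, the normalized depth range, the linear depth-to-shift rule, and a background that sits at zero displacement so that clean plate pixels can be copied without shifting.

```python
# Illustrative sketch only. Names and the depth-to-shift rule are
# hypothetical assumptions, not the patent's implementation.

def make_right_view(row, depth, max_shift):
    """Displace each pixel of one scanline by a depth-dependent amount.

    row       : pixel values for one scanline of the 2D frame
    depth     : per-pixel depth in [0.0, 1.0], where 1.0 is nearest the viewer
    max_shift : displacement in pixels applied to the nearest depth
    Returns the displaced scanline; None marks a missing pixel (a gap).
    """
    width = len(row)
    out = [None] * width
    out_depth = [-1.0] * width  # depth of whatever currently occupies each slot
    for x in range(width):
        shift = round(depth[x] * max_shift)
        tx = x - shift  # nearer pixels move further left in the right-eye view
        # Nearer pixels win when two source pixels land on the same target.
        if 0 <= tx < width and depth[x] > out_depth[tx]:
            out[tx] = row[x]
            out_depth[tx] = depth[x]
    return out


def fill_gaps(view, clean_plate):
    """Copy clean plate pixel values into the missing pixels of a view."""
    return [clean_plate[x] if p is None else p for x, p in enumerate(view)]


if __name__ == "__main__":
    row = ["b0", "b1", "F", "F", "b4", "b5"]          # "F" is the foreground object
    depth = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]            # object is nearest the viewer
    clean_plate = ["b0", "b1", "b2", "b3", "b4", "b5"]  # background without the object

    right = make_right_view(row, depth, max_shift=1)
    print(right)                          # ['b0', 'F', 'F', None, 'b4', 'b5']
    print(fill_gaps(right, clean_plate))  # ['b0', 'F', 'F', 'b3', 'b4', 'b5']
```

Shifting the object one pixel to the left uncovers background that the original frame never recorded at that position, which is why the gap appears and why a clean plate is a natural source for the fill value.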

421 Citations

14 Claims

1. A machine learning method of converting 2D video to 3D video, comprising:
- obtaining a training set comprising a plurality of conversions, each conversion comprising
  - a 2D scene comprising one or more 2D frames;
  - a corresponding 3D conversion dataset that describes conversion of said 2D scene to 3D, comprising inputs and outputs for 2D to 3D conversion steps, said 2D to 3D conversion steps comprising
    - obtaining said one or more 2D frames;
    - locating and identifying an object in one or more object frames within said one or more 2D frames, each object frame containing an image of at least a portion of said object;
    - generating an object mask for said object in said one or more object frames, said object mask identifying one or more masked pixels representing said object in said one or more object frames;
    - generating an object depth model that assigns a pixel depth to one or more of said one or more masked pixels;
    - generating a stereoscopic image pair for each of said one or more object frames based on said object depth model, said stereoscopic image pair comprising a left image and a right image; and,
    - generating one or more gap filling pixel values for one or more missing pixels in said left image or in said right image;
- training a machine learning system on said training set;
- obtaining a 2D video;
- applying said machine learning system to said 2D video to automatically perform one or more of said 2D to 3D conversion steps on said 2D video; and,
- accepting input from an operator to modify or complete one or more of said 2D to 3D conversion steps on said 2D video.
- - 2. The method of claim 1 wherein
    - said machine learning system performs said generating an object mask for said object in said one or more object frames; and,
    - said corresponding 3D conversion dataset comprises
      - a masking input comprising
        - an identity of said object; and,
        - a location of one or more feature points of said object in said one or more object frames; and,
      - a masking output comprising a path comprising one or more segments, each segment comprising a curve defined by one or more control points, wherein said path is a boundary of said object mask.
  - 3. The method of claim 1 wherein
    - said machine learning system performs said generating an object depth model; and,
    - said corresponding 3D conversion dataset comprises
      - an object depth model input comprising said object mask; and,
      - an object depth model output comprising
        - one or more regions within said object mask; and,
        - a planar or curved 3D surface associated with each of said one or more regions.
  - 4. The method of claim 1 wherein
    - said machine learning system performs said generating an object depth model; and,
    - said corresponding 3D conversion dataset comprises
      - an object depth model input comprising said object mask; and,
      - an object depth model output comprising a point cloud of 3D points, each of said 3D points associated with a pixel within said object mask.
  - 5. The method of claim 1 wherein
    - said machine learning system performs said generating one or more gap filling pixel values;
    - said generating one or more gap filling pixel values comprises
      - generating a clean plate frame from one or more of said one or more 2D frames; and,
      - copying pixel values from said clean plate frame to said one or more missing pixels; and,
    - said corresponding 3D conversion dataset comprises
      - a clean plate input comprising one or more of said one or more 2D frames; and,
      - a clean plate output comprising said clean plate frame associated with said one or more 2D frames.
  - 6. The method of claim 1 wherein
    - said machine learning system performs
      - said generating an object mask for said object in said one or more object frames;
      - said generating an object depth model;
      - said generating one or more gap filling pixel values; and,
    - wherein said generating one or more gap filling pixel values comprises
      - generating a clean plate frame from one or more of said one or more 2D frames; and,
      - copying pixel values from said clean plate frame to said one or more missing pixels; and,
    - said corresponding 3D conversion dataset comprises
      - a masking input comprising
        - an identity of said object; and,
        - a location of one or more feature points of said object in said one or more object frames;
      - a masking output comprising a path comprising one or more segments, each segment comprising a curve defined by one or more control points, wherein said path is a boundary of said object mask;
      - an object depth model input comprising said object mask;
      - an object depth model output comprising one or more of
        - a region model comprising
          - one or more regions within said object mask;
          - a planar or curved 3D surface associated with each of said one or more regions; and,
        - a point cloud of 3D points, each of said 3D points associated with a pixel within said object mask;
      - a clean plate input comprising one or more of said one or more 2D frames; and,
      - a clean plate output comprising said clean plate frame associated with said one or more 2D frames.
  - 7. The method of claim 1, wherein
    - said generating an object mask for said object in said one or more object frames comprises
      - defining a 3D space associated with said one or more object frames;
      - obtaining a 3D object model of said object; and,
      - defining a position and orientation of said 3D object model in said 3D space that aligns said 3D object model with said image of at least a portion of said object in said one or more object frames; and,
    - said assigns a pixel depth to one or more of said one or more masked pixels comprises
      - associates a point in said 3D object model in said 3D space with each masked pixel; and,
      - assigns a depth of said point in said 3D space to said pixel depth for the associated masked pixel.
  - 8. The method of claim 7, wherein said obtaining a 3D object model of said object comprises
    - obtaining 3D scanner data captured from said object; and,
    - converting said 3D scanner data into said 3D object model.
  - 9. The method of claim 8, wherein said obtaining said 3D scanner data comprises obtaining data from a time-of-flight system or a light-field system.
  - 10. The method of claim 8, wherein said obtaining said 3D scanner data comprises obtaining data from a triangulation system.
  - 11. The method of claim 8, wherein said converting said 3D scanner data into said 3D object model comprises retopologizing said 3D scanner data to form said 3D object model from a reduced number of polygons or parameterized surfaces.
  - 12. The method of claim 7, further comprising
    - dividing said 3D object model into object parts, wherein said object parts may have motion relative to one another;
    - augmenting said 3D object model with one or more degrees of freedom that reflect said motion relative to one another of said object parts; and,
    - determining values of each of said one or more degrees of freedom that align said image of said at least a portion of said object in a plurality of frames of said one or more object frames with said 3D object model modified by said values of said one or more degrees of freedom.
  - 13. The method of claim 12, wherein said determining values of each of said one or more degrees of freedom comprises
    - selecting one or more features in each of said object parts, each having coordinates in said 3D object model;
    - determining pixel locations of said one or more features in said one or more object frames; and,
    - calculating a position and orientation of one of said object parts and calculating said values of each of said one or more degrees of freedom to align a projection of said coordinates in said 3D model onto a camera plane with said pixel locations in said one or more object frames.
  - 14. The method of claim 13, wherein said determining pixel locations of said one or more features in said one or more object frames comprises
    - selecting said pixel locations in one or more key frames; and,
    - tracking said features across one or more non-key frames using a computer.
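
The alignment described in claims 12 and 13 (choosing degree-of-freedom values so that the projection of 3D model coordinates onto a camera plane matches observed pixel locations) can be sketched in Python as a reprojection-error minimization. This is a simplified illustration under stated assumptions, not the claimed method: it uses an ideal pinhole camera, a single rotational degree of freedom about the vertical axis, a known translation, and a brute-force search. The names `rotate_y`, `project`, `reprojection_error`, and `fit_angle` are hypothetical.

```python
import math

# Illustrative sketch only. The camera model, the single degree of freedom,
# and the brute-force search are assumptions, not details from the claims.


def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def project(point, focal_length):
    """Ideal pinhole projection of a camera-space point onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def reprojection_error(features, pixels, angle, translation, focal_length):
    """Sum of squared distances between projected features and observed pixels."""
    total = 0.0
    for feature, (u, v) in zip(features, pixels):
        rx, ry, rz = rotate_y(feature, angle)
        moved = (rx + translation[0], ry + translation[1], rz + translation[2])
        pu, pv = project(moved, focal_length)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


def fit_angle(features, pixels, translation, focal_length, steps=360):
    """Return the candidate angle with the smallest reprojection error."""
    candidates = [2 * math.pi * i / steps for i in range(steps)]
    return min(
        candidates,
        key=lambda a: reprojection_error(features, pixels, a, translation, focal_length),
    )


if __name__ == "__main__":
    # Feature coordinates in the 3D object model (made-up values).
    features = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
    translation = (0.0, 0.0, 5.0)  # object placed 5 units in front of the camera
    true_angle = 2 * math.pi * 30 / 360

    # Synthesize the "observed" pixel locations from the true pose.
    pixels = []
    for f in features:
        rx, ry, rz = rotate_y(f, true_angle)
        pixels.append(project((rx, ry + 0.0, rz + translation[2]), 1.0))

    print(math.degrees(fit_angle(features, pixels, translation, 1.0)))
```

A practical system would more likely solve for full position and orientation together with several degrees of freedom using a nonlinear least-squares solver. The grid search here only keeps the example short and dependency-free.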

Current Assignee: USFT Patents Incorporated
Original Assignee: Legend3D, Inc.
Inventors: Lopez, Anthony; McFarland, Jacqueline; Baldridge, Tony
Primary Examiner(s): Patel, Jayesh A
Application Number: US14/967,939
Publication Number: US 20170085863A1
Time in Patent Office: 470 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
G06T 2207/20081   Training; Learning
G06T 7/557   from light fields, e.g. fro...
H04N 13/122   Improving the 3D impression...
H04N 13/128   Adjusting depth or disparity
H04N 13/261   with monoscopic-to-stereosc...
