Depth map generation using motion cues for conversion of monoscopic visual content to stereoscopic 3D

US 9,661,307 B1
Filed: 10/01/2012
Issued: 05/23/2017
Est. Priority Date: 11/15/2011
Status: Active Grant

Abstract

An image converter identifies a subset of frames in a two-dimensional video and determines a global camera motion value for the subset of frames. The image converter also determines a dense motion value for a plurality of pixels in the subset of frames and compares the global camera motion value and the dense motion value to calculate a rough depth map for the subset of frames. The image converter further interpolates, based on the rough depth map, a depth value for each of the plurality of pixels in the subset of frames and renders a three-dimensional video from the subset of frames using the depth value for each of the plurality of pixels.
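
The comparison and thresholding steps can be illustrated with a minimal sketch. It assumes, purely for illustration, that the global camera motion is a single translation vector per frame, that the dense motion is a per-pixel flow vector, and that the two threshold conditions are "at or above" and "below" one cutoff. The names `local_motion`, `rough_depth_map`, and `threshold` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Vec = Tuple[float, float]  # (dx, dy) motion in pixels per frame


def local_motion(dense: List[List[Vec]], global_motion: Vec) -> List[List[float]]:
    """Per-pixel magnitude of (dense motion - global camera motion)."""
    gx, gy = global_motion
    return [
        [((dx - gx) ** 2 + (dy - gy) ** 2) ** 0.5 for dx, dy in row]
        for row in dense
    ]


def rough_depth_map(local: List[List[float]], threshold: float = 1.0,
                    moving_value: int = 1, static_value: int = 0) -> List[List[int]]:
    """Assign every pixel one of two values (assumed cutoff, not the patent's)."""
    return [
        [moving_value if m >= threshold else static_value for m in row]
        for row in local
    ]


# Toy 2x3 frame: the camera pans right by 2 px; two pixels move differently.
dense = [
    [(2.0, 0.0), (2.0, 0.0), (5.0, 0.0)],
    [(2.0, 0.0), (2.0, 4.0), (2.0, 0.0)],
]
rough = rough_depth_map(local_motion(dense, (2.0, 0.0)))
print(rough)  # [[0, 0, 1], [0, 1, 0]]
```

Pixels whose motion matches the camera pan receive the static value, and the two pixels that move independently receive the moving value.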

20 Claims

1. A method comprising:
- identifying a subset of frames in a two-dimensional video, the subset of frames comprising moving objects and static objects;
  
  determining a global camera motion value for the subset of frames;
  
  determining a dense motion value for a plurality of pixels in the subset of frames;
  
  comparing, by a processing device, the global camera motion value and the dense motion value to calculate a local motion value for each of the plurality of pixels;
  
  determining, for each of the plurality of pixels, whether a local motion value of a respective pixel satisfies a first threshold condition or a second threshold condition;
  
  responsive to the local motion value of the respective pixel satisfying the first threshold condition, assigning a first value to represent the local motion value of the respective pixel, and responsive to the local motion value of the respective pixel satisfying the second threshold condition, assigning a second value to represent the local motion value of the respective pixel, wherein the first value indicates that the respective pixel is associated with one of the moving objects, and the second value indicates that a corresponding pixel is associated with one of the static objects, wherein the assigning of the first and second values results in the plurality of pixels each being assigned either the first value or the second value;
  
  generating a rough depth map for the subset of frames using assigned first and second values of the plurality of pixels and locations of the plurality of pixels in the subset of the frames;
  
  interpolating, based on the rough depth map, a depth value for each of the plurality of pixels in the subset of frames; and
  
  rendering a three-dimensional video from the subset of frames using the depth value for each of the plurality of pixels.
  - 2. The method of claim 1, wherein the global camera motion value represents a movement of a camera that captured the two-dimensional video.
  - 3. The method of claim 1, wherein the dense motion value represents a total movement of an object represented by each of the plurality of pixels.
  - 4. The method of claim 1, wherein comparing the global camera motion value and the dense motion value comprises determining a difference between the global camera motion value and the dense motion value to identify the local motion value for each of the plurality of pixels representing a movement of an object within video relative to a camera.
  - 5. The method of claim 4, further comprising:
    - applying a threshold to the local motion value for each of the plurality of pixels.
  - 6. The method of claim 1, wherein interpolating the depth value of each pixel comprises:
    - computing, a feature-to-depth mapping function based on the rough depth map; and
      
      applying the feature-to-depth mapping function to the plurality of pixels in the subset of frames.
  - 7. The method of claim 1, further comprising:
    - refining the rough depth map by applying a Markov Random Field (MRF) algorithm to the rough depth map.
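
The feature-to-depth mapping of claim 6 can be sketched as follows. The claim does not say which feature is used, so this example assumes a quantized pixel intensity as the feature and takes the mean rough-depth value per feature bin as the mapping function. The names `fit_feature_to_depth`, `apply_feature_to_depth`, and `bin_size` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def fit_feature_to_depth(features: List[List[int]], rough_depth: List[List[int]],
                         bin_size: int = 64) -> Dict[int, float]:
    """Map each feature bin to the mean rough-depth value of its pixels."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for feature_row, depth_row in zip(features, rough_depth):
        for f, d in zip(feature_row, depth_row):
            b = f // bin_size
            sums[b] += d
            counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def apply_feature_to_depth(features: List[List[int]], mapping: Dict[int, float],
                           bin_size: int = 64, default: float = 0.0) -> List[List[float]]:
    """Look up an interpolated depth value for every pixel from its feature bin."""
    return [[mapping.get(f // bin_size, default) for f in row] for row in features]


# Toy intensities (0-255) and a binary rough depth map for the same 2x3 frame.
features = [[10, 20, 200], [30, 210, 220]]
rough = [[0, 0, 1], [0, 1, 0]]
mapping = fit_feature_to_depth(features, rough)
depth = apply_feature_to_depth(features, mapping)
```

In this toy frame the bright pixels share one bin, so the pixel at row 1, column 2 (labelled static in the rough map) receives the same interpolated depth as its bright neighbours. This is one way a per-pixel depth value can be smoother than the two-valued rough map.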

8. A non-transitory machine-readable storage medium storing instructions which, when executed, cause a data processing system to perform a method comprising:
- identifying a subset of frames in a two-dimensional video, the subset of frames comprising moving objects and static objects;
  
  determining a global camera motion value for the subset of frames;
  
  determining a dense motion value for a plurality of pixels in the subset of frames;
  
  comparing, by a processing device, the global camera motion value and the dense motion value to calculate a local motion value for each of the plurality of pixels;
  
  determining, for each of the plurality of pixels, whether a local motion value of a respective pixel satisfies a first threshold condition or a second threshold condition;
  
  responsive to the local motion value of the respective pixel satisfying the first threshold condition, assigning a first value to represent the local motion value of the respective pixel, and responsive to the local motion value of the respective pixel satisfying the second threshold condition, assigning a second value to represent the local motion value of the respective pixel, wherein the first value indicates that the respective pixel is associated with one of the moving objects, and the second value indicates that a corresponding pixel is associated with one of the static objects, wherein the assigning of the first and second values results in the plurality of pixels each being assigned either the first value or the second value;
  
  generating a rough depth map for the subset of frames using assigned first and second values of the plurality of pixels and locations of the plurality of pixels in the subset of the frames;
  
  interpolating, based on the rough depth map, a depth value for each of the plurality of pixels in the subset of frames; and
  
  rendering a three-dimensional video from the subset of frames using the depth value for each of the plurality of pixels.
  - 9. The non-transitory machine-readable storage medium of claim 8, wherein the global camera motion value represents a movement of a camera that captured the two-dimensional video.
  - 10. The non-transitory machine-readable storage medium of claim 8, wherein the dense motion value represents a total movement of an object represented by each of the plurality of pixels.
  - 11. The non-transitory machine-readable storage medium of claim 8, wherein comparing the global camera motion value and the dense motion value comprises determining a difference between the global camera motion value and the dense motion value to identify the local motion value for each of the plurality of pixels representing a movement of an object within video relative to a camera.
  - 12. The non-transitory machine-readable storage medium of claim 11, the method further comprising:
    - applying a threshold to the local motion value for each of the plurality of pixels.
  - 13. The non-transitory machine-readable storage medium of claim 8, wherein interpolating the depth value of each pixel comprises:
    - computing, a feature-to-depth mapping function based on the rough depth map; and
      
      applying the feature-to-depth mapping function to the plurality of pixels in the subset of frames.
  - 14. The non-transitory machine-readable storage medium of claim 8, the method further comprising:
    - refining the rough depth map by applying a Markov Random Field (MRF) algorithm to the rough depth map.
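
Claims 7 and 14 name a Markov Random Field refinement without specifying the energy function or the inference method. As an illustration only, the sketch below assumes a two-label Potts model on a 4-connected grid and uses iterated conditional modes (ICM), a simple greedy optimizer chosen here for brevity. The function name `refine_mrf_icm` and the parameters `beta` and `sweeps` are hypothetical.

```python
from typing import List


def refine_mrf_icm(rough: List[List[int]], beta: float = 0.6,
                   sweeps: int = 5) -> List[List[int]]:
    """Smooth a binary rough depth map with ICM on a Potts-style MRF.

    Energy per pixel = 1 if the label disagrees with the rough map (data term)
                     + beta * number of 4-neighbours with a different label.
    """
    h, w = len(rough), len(rough[0])
    labels = [row[:] for row in rough]
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                neighbours = [
                    labels[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]

                def energy(label: int) -> float:
                    data = 0.0 if label == rough[y][x] else 1.0
                    return data + beta * sum(1 for n in neighbours if n != label)

                best = min((0, 1), key=energy)
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:  # converged: no label changed in a full sweep
            break
    return labels


speckle = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
block = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
cleaned = refine_mrf_icm(speckle)
kept = refine_mrf_icm(block)
```

With these assumed weights an isolated moving-pixel label is removed as noise, while a 2x2 block of moving pixels survives, because each of its pixels has as many agreeing neighbours as disagreeing ones.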

15. A system comprising:
- a processing device; and
  
  a memory coupled to the processing device; and
  
  an image converter, executable by the processing device from the memory, to:
  
  identify a subset of frames in a two-dimensional video, the subset of frames comprising moving objects and static objects;
  
  determine a global camera motion value for the subset of frames;
  
  determine a dense motion value for a plurality of pixels in the subset of frames;
  
  compare the global camera motion value and the dense motion value to calculate a local motion value for each of the plurality of pixels;
  
  determine, for each of the plurality of pixels, whether a local motion value of a respective pixel satisfies a first threshold condition or a second threshold condition;
  
  responsive to the local motion value of the respective pixel satisfying the first threshold condition, assign a first value to represent the local motion value of the respective pixel, and responsive to the local motion value of the respective pixel satisfying the second threshold condition, assign a second value to represent the local motion value of the respective pixel, wherein the first value indicates that the respective pixel is associated with one of the moving objects, and the second value indicates that a corresponding pixel is associated with one of the static objects, wherein the assigning of the first and second values results in the plurality of pixels each being assigned either the first value or the second value;
  
  generate a rough depth map for the subset of frames using assigned first and second values of the plurality of pixels and locations of the plurality of pixels in the subset of the frames;
  
  interpolate, based on the rough depth map, a depth value for each of the plurality of pixels in the subset of frames; and
  
  render a three-dimensional video from the subset of frames using the depth value for each of the plurality of pixels.
  - 16. The system of claim 15, wherein the global camera motion value represents a movement of a camera that captured the two-dimensional video.
  - 17. The system of claim 15, wherein the dense motion value represents a total movement of an object represented by each of the plurality of pixels.
  - 18. The system of claim 15, wherein comparing the global camera motion value and the dense motion value comprises determining a difference between the global camera motion value and the dense motion value to identify the local motion value for each of the plurality of pixels representing a movement of an object within video relative to a camera.
  - 19. The system of claim 18, the image converter further to:
    - apply a threshold to the local motion value for each of the plurality of pixels.
  - 20. The system of claim 15, wherein interpolating the depth value of each pixel comprises:
    - computing, a feature-to-depth mapping function based on the rough depth map; and
      
      applying the feature-to-depth mapping function to the plurality of pixels in the subset of frames.
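
The final rendering step is not detailed in the claims. One common approach, shown here as an assumption rather than as the patented method, is depth-image-based rendering: each pixel is shifted horizontally by a disparity proportional to its depth value to synthesize a second view. The sketch assumes a larger depth value means the pixel is closer to the viewer. The names `render_right_row` and `max_disparity` are hypothetical.

```python
from typing import List, Optional, Sequence


def render_right_row(pixels: Sequence[str], depth: Sequence[float],
                     max_disparity: int = 2) -> List[str]:
    """Synthesize one row of a right-eye view by shifting pixels left."""
    w = len(pixels)
    out: List[Optional[str]] = [None] * w
    nearest = [-1.0] * w  # depth of the pixel currently written at each target
    for x in range(w):
        tx = x - round(max_disparity * depth[x])
        # Closer pixels (larger depth value) overwrite farther ones.
        if 0 <= tx < w and depth[x] > nearest[tx]:
            out[tx] = pixels[x]
            nearest[tx] = depth[x]
    # Fill disoccluded holes from the right-hand neighbour (usually background).
    for x in reversed(range(w)):
        if out[x] is None:
            out[x] = out[x + 1] if x + 1 < w else pixels[x]
    return [p for p in out if p is not None]


row = list("abcDEfg")          # upper-case letters mark a foreground object
row_depth = [0, 0, 0, 1, 1, 0, 0]
right = render_right_row(row, row_depth)
print("".join(right))  # aDEfffg
```

The original row serves as the left-eye view. The foreground pixels `D` and `E` move two positions left in the right-eye view, and the uncovered positions are filled with the neighbouring background pixel `f`.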

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Mukherjee, Debargha; Wu, Chen; Wang, Meng; Xie, Yuchen
Primary Examiner(s): Patel, Jay
Assistant Examiner(s): Matt, Marnie
Application Number: US13/632,489
Time in Patent Office: 1,695 Days
Field of Search: 348/44

CPC Class Codes

G06F 18/00   Pattern recognition

G06T 7/579   from motion

G06V 20/52   Surveillance or monitoring ...

H04N 13/128   Adjusting depth or disparity

H04N 13/261   with monoscopic-to-stereosc...

H04N 13/264   using the relative movement...

H04N 7/18   Closed-circuit television [...
