System and method for real time 2D to 3D conversion of a video in a digital camera
Abstract
Embodiments are directed towards enabling digital cameras to create a 3D view that can be re-rendered onto any object within a scene, so that the object is both in focus and the center of perspective, based on capturing a single set of multiple 2D images of the scene. From a single set of 2D images of a scene, a depth map of the scene may be generated and used to calculate principal depths, which are then used to capture an image focused at each of the principal depths. A correspondence is determined between each coordinate of a 2D image of the scene and the principal depth closest to that coordinate's actual depth. For different coordinates of the 2D image, different 3D views of the scene are created, each focused at the principal depth that corresponds to the given coordinate.
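The principal-depth selection described in the abstract can be read as a histogram operation: quantize the depth map, count pixels per depth, and keep the depths with the largest counts. A minimal Python sketch, assuming a depth map already quantized to discrete levels and a hypothetical parameter `k` (the patent does not fix the number of principal depths):

```python
from collections import Counter

def principal_depths(depth_map, k=3):
    """Pick the k depths that cover the most pixels in the depth map.

    depth_map: 2D list of quantized depth values, one per pixel.
    k: number of principal depths to keep (hypothetical parameter).
    """
    counts = Counter(d for row in depth_map for d in row)
    # Depths ranked by pixel count; every kept depth has more pixels
    # than any depth that was not kept.
    return [depth for depth, _ in counts.most_common(k)]

depth_map = [
    [1, 1, 2, 2],
    [1, 1, 2, 3],
    [4, 1, 2, 3],
]
print(principal_depths(depth_map, k=2))  # -> [1, 2]
```

Each kept depth has more supporting pixels than any depth not kept, matching the claim language that a principal depth's pixel count exceeds that of any non-principal depth.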
21 Claims
1. A processor based method, comprising:
employing an image sensing device to receive a first set of two dimensional (2D) images;
employing one or more processors to perform actions, including:
generating a depth map for a scene, the scene comprising a plurality of pixels, each of the pixels corresponding to one of a plurality of depths in the depth map;
identifying a plurality of principal depths for the scene using the generated depth map, each of the principal depths having a number of corresponding pixels that is greater than a number of pixels corresponding to any non-principal depth of the depth map;
identifying a plurality of focus positions, the focus positions respectively corresponding to the principal depths;
capturing, using the image sensing device, a 2D image at each of the focus positions to form a second set of 2D images;
determining a depth correspondence between the second set of 2D images of the scene and the principal depths by associating each coordinate in the second set of 2D images with a principal depth that is closest to an actual depth of the coordinate; and
for each of a plurality of regions of interest in the scene, generating a three dimensional (3D) view including a right-eye image and a left-eye image by:
i) selecting a corresponding principal depth for a respective region of interest, and ii) performing a translation and rotation transformation mapping for each pixel from a 2D image captured at the focus position for the principal depth for the respective region of interest. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
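Claim 1's final step names a translation and rotation transformation per pixel but does not spell out its form here. For illustration only, the sketch below reduces it to a pure horizontal translation (disparity), with nearer principal depths shifted more; the `baseline` constant and the depth-to-shift rule are assumptions, not the patent's transform:

```python
def stereo_pair(image, depth, baseline=4.0):
    """Sketch of a right-eye/left-eye pair from one 2D image and a
    principal depth.

    Uses a pure horizontal translation (shift ~ baseline / depth); the
    patent's transform also includes rotation, which is omitted here.
    baseline is a hypothetical tuning constant, not from the patent.
    """
    h, w = len(image), len(image[0])
    shift = max(1, round(baseline / depth))  # nearer objects shift more
    # Shift columns in opposite directions, clamping at the borders.
    left = [[image[y][min(w - 1, x + shift)] for x in range(w)] for y in range(h)]
    right = [[image[y][max(0, x - shift)] for x in range(w)] for y in range(h)]
    return left, right

left, right = stereo_pair([[0, 1, 2, 3]], depth=4.0)
print(left)   # -> [[1, 2, 3, 3]]
print(right)  # -> [[0, 0, 1, 2]]
```

In the claimed method this mapping would be applied to the 2D image captured at the focus position of the selected principal depth, once per region of interest.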
9. An image system, comprising:
an image sensing device configured to receive a first set of two dimensional (2D) images; and
one or more circuits having a plurality of components thereon and configured to perform a plurality of actions, including:
generating a depth map for a scene, the scene comprising a plurality of pixels, each of the pixels corresponding to one of a plurality of depths in the depth map;
identifying a plurality of principal depths for the scene using the generated depth map, each of the principal depths having a number of corresponding pixels that is greater than a number of pixels corresponding to any non-principal depth of the depth map;
identifying a plurality of focus positions, the focus positions respectively corresponding to the principal depths;
capturing, using the image sensing device, a 2D image at each of the focus positions to form a second set of 2D images;
determining a depth correspondence between the second set of 2D images of the scene and the principal depths by associating each coordinate in the second set of 2D images with a principal depth that is closest to an actual depth of the coordinate; and
for each of a plurality of regions of interest in the scene, generating a three dimensional (3D) view by:
i) selecting a corresponding principal depth for a respective region of interest, and ii) performing a translation and rotation transformation mapping for each pixel from a 2D image captured at the focus position for the principal depth for the respective region of interest. - View Dependent Claims (10, 11, 12, 13, 14, 15)
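Claims 1, 9, and 16 each map principal depths to focus positions without fixing the optics. One conventional model for illustration is the thin-lens equation, 1/f = 1/d_o + 1/d_i; the real depth-to-focus-motor mapping is camera-specific calibration, so this is only a sketch under that assumed model:

```python
def focus_position(depth_mm, focal_length_mm=50.0):
    """Lens-to-sensor distance that focuses a subject at depth_mm.

    Thin-lens model: 1/f = 1/d_o + 1/d_i  =>  d_i = f*d_o / (d_o - f).
    The actual depth-to-focus mapping in a camera is a calibrated
    motor position; this equation is an illustrative stand-in.
    """
    if depth_mm <= focal_length_mm:
        raise ValueError("subject must be beyond the focal length")
    return focal_length_mm * depth_mm / (depth_mm - focal_length_mm)

# A subject 1 m away with a 50 mm lens focuses slightly beyond f:
print(focus_position(1000.0))  # -> 52.63157894736842
```

Capturing one image per principal depth then yields the second set of 2D images recited in the claims, each sharpest at one principal depth.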
16. A storage device having stored thereon a plurality of computer-executable instructions that, when executed by a digital camera, perform a plurality of actions, comprising:
generating a depth map for a scene, the scene comprising a plurality of pixels, each of the pixels corresponding to one of a plurality of depths in the depth map;
identifying a plurality of principal depths for the scene using the generated depth map, each of the principal depths having a number of corresponding pixels that is greater than a number of pixels corresponding to any non-principal depth of the depth map;
identifying a plurality of focus positions, the focus positions respectively corresponding to the principal depths;
capturing, using an image sensing device of the digital camera, a 2D image at each of the focus positions;
determining a depth correspondence between the 2D images of the scene and the principal depths by associating each coordinate in the 2D images with a principal depth that is closest to an actual depth of the coordinate; and
for each of a plurality of regions of interest in the scene, generating a three dimensional (3D) view by:
i) selecting a corresponding principal depth for a respective region of interest, and ii) performing a translation and rotation transformation mapping for each pixel from a 2D image captured at the focus position for the principal depth for the respective region of interest. - View Dependent Claims (17, 18, 19, 20, 21)
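The depth-correspondence step in claim 16 (and its counterparts in claims 1 and 9) assigns each coordinate the principal depth nearest its actual depth. A direct Python sketch of that nearest-depth assignment, assuming the depth map and the principal-depth list are already available:

```python
def depth_correspondence(depth_map, principal):
    """Map every pixel coordinate to the principal depth closest to
    its actual depth, as in the claimed correspondence step.

    depth_map: 2D list of actual (quantized) depths per pixel.
    principal: list of principal depth values.
    """
    return [
        [min(principal, key=lambda p: abs(p - d)) for d in row]
        for row in depth_map
    ]

# Pixels at depths 1 and 4 snap to principal depth 2; depth 9 snaps to 8.
print(depth_correspondence([[1, 4, 9]], [2, 8]))  # -> [[2, 2, 8]]
```

The resulting map tells the renderer, for any region of interest, which of the captured in-focus images to draw pixels from when building that region's 3D view.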
Specification