Real-time mobile device capture and generation of art-styled AR/VR content
First Claim
1. A method for generating a three-dimensional (3D) projection of an object in a virtual reality or augmented reality environment, the method including:
obtaining a sequence of images using a single lens camera, the sequence of images being captured along a camera translation, wherein each image in the sequence of images contains at least a portion of overlapping subject matter, the subject matter including the object;
segmenting the object from the sequence of images using a trained segmenting neural network to form a sequence of segmented object images, wherein the trained neural network is configured to aggregate a plurality of feature maps from different layers of the trained neural network in order to allow usage of both finer scale and coarser scale details to produce probability maps corresponding to the sequence of segmented object images, wherein the trained neural network is trained to label every pixel in each image in the sequence of images with a particular category label;
refining the sequence of segmented object images using fine-grained segmentation, wherein refining the sequence of segmented object images includes passing each probability map onto a temporal dense conditional random field (CRF) smoothing system to produce a binary mask for every segmented object image, wherein the binary masks are temporally consistent and sharply aligned at boundaries to each other;
applying an art-style transfer to the sequence of segmented object images using a trained transfer neural network;
computing on-the-fly interpolation parameters;
generating stereoscopic pairs from the sequence of segmented object images for displaying the object as a 3D projection in a virtual reality or augmented reality environment, the stereoscopic pairs being generated for one or more points along the camera translation; and
mapping segmented image indices to a rotation range for display in the virtual reality or augmented reality environment.
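The segmenting step in the claim aggregates feature maps from different network layers so that both coarse- and fine-scale details contribute to the per-pixel probability maps. The following NumPy sketch illustrates only that aggregation contract; the two-layer, two-category setup, the nearest-neighbour upsampling, and the random feature maps are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def upsample_nearest(fm, target_hw):
    """Nearest-neighbour upsample a (H, W, C) feature map to target_hw."""
    th, tw = target_hw
    h, w, _ = fm.shape
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    return fm[rows][:, cols]

def aggregate_probability_map(feature_maps, target_hw):
    """Sum coarse- and fine-scale feature maps at a common resolution,
    then softmax over channels to get a per-pixel class probability map."""
    agg = np.zeros(target_hw + (feature_maps[0].shape[-1],))
    for fm in feature_maps:
        agg += upsample_nearest(fm, target_hw)
    e = np.exp(agg - agg.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy example: a fine 8x8 layer and a coarse 4x4 layer, 2 categories.
rng = np.random.default_rng(0)
fine = rng.random((8, 8, 2))
coarse = rng.random((4, 4, 2))
probs = aggregate_probability_map([fine, coarse], (8, 8))
labels = probs.argmax(axis=-1)  # every pixel receives a category label
```

The argmax at the end matches the claim's requirement that every pixel be labelled with a particular category; the probability maps themselves are what the refining step consumes.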
Abstract
Various embodiments describe systems and processes for generating AR/VR content. In one aspect, a method for generating a 3D projection of an object in a virtual reality or augmented reality environment comprises obtaining a sequence of images along a camera translation using a single lens camera. Each image contains at least a portion of overlapping subject matter, including the object. The object is segmented from the sequence of images using a trained segmenting neural network to form a sequence of segmented object images, which is refined using fine-grained segmentation and to which an art-style transfer is applied using a trained transfer neural network. On-the-fly interpolation parameters are computed, and stereoscopic pairs are generated from the refined sequence of segmented object images for one or more points along the camera translation, for displaying the object as a 3D projection in the virtual reality or augmented reality environment. Segmented image indices are mapped to a rotation range for display in the virtual reality or augmented reality environment.
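Two of the steps in the abstract, computing on-the-fly interpolation parameters and generating stereoscopic pairs along the camera translation, reduce to index arithmetic over the captured frames. The linear blend and the fixed index baseline below are simplifying assumptions used only to show the idea:

```python
def interpolation_params(t, num_frames):
    """Map a continuous viewpoint position t in [0, num_frames - 1] to the
    two nearest captured frames and a blend weight, computed on the fly."""
    t = max(0.0, min(float(t), num_frames - 1))
    lo = int(t)
    hi = min(lo + 1, num_frames - 1)
    return lo, hi, t - lo

def stereoscopic_pairs(num_frames, baseline=2):
    """Pair frame i with frame i + baseline along the camera translation;
    the spatial offset between the two views acts as the stereo baseline."""
    return [(i, i + baseline) for i in range(num_frames - baseline)]

print(interpolation_params(2.25, 10))   # (2, 3, 0.25)
print(stereoscopic_pairs(5))            # [(0, 2), (1, 3), (2, 4)]
```

Because the camera physically translated during capture, two frames taken a short distance apart already form a plausible left/right view pair; the interpolation parameters let intermediate viewpoints be synthesized between captured frames.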
101 Citations
16 Claims
1. A method for generating a three-dimensional (3D) projection of an object in a virtual reality or augmented reality environment, the method including:
obtaining a sequence of images using a single lens camera, the sequence of images being captured along a camera translation, wherein each image in the sequence of images contains at least a portion of overlapping subject matter, the subject matter including the object;
segmenting the object from the sequence of images using a trained segmenting neural network to form a sequence of segmented object images, wherein the trained neural network is configured to aggregate a plurality of feature maps from different layers of the trained neural network in order to allow usage of both finer scale and coarser scale details to produce probability maps corresponding to the sequence of segmented object images, wherein the trained neural network is trained to label every pixel in each image in the sequence of images with a particular category label;
refining the sequence of segmented object images using fine-grained segmentation, wherein refining the sequence of segmented object images includes passing each probability map onto a temporal dense conditional random field (CRF) smoothing system to produce a binary mask for every segmented object image, wherein the binary masks are temporally consistent and sharply aligned at boundaries to each other;
applying an art-style transfer to the sequence of segmented object images using a trained transfer neural network;
computing on-the-fly interpolation parameters;
generating stereoscopic pairs from the sequence of segmented object images for displaying the object as a 3D projection in a virtual reality or augmented reality environment, the stereoscopic pairs being generated for one or more points along the camera translation; and
mapping segmented image indices to a rotation range for display in the virtual reality or augmented reality environment.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
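The refining step passes each probability map through temporal dense CRF smoothing to obtain temporally consistent binary masks. A real dense CRF also uses pairwise colour and position terms; the windowed temporal average below is a deliberately simplified stand-in that illustrates only the probability-map-in, binary-mask-out contract:

```python
import numpy as np

def temporally_smoothed_masks(prob_maps, window=3, threshold=0.5):
    """Average each pixel's foreground probability over a temporal window,
    then threshold, producing one binary mask per segmented object image."""
    probs = np.stack(prob_maps)        # (T, H, W) foreground probabilities
    smoothed = np.empty_like(probs)
    half = window // 2
    for t in range(len(prob_maps)):
        lo, hi = max(0, t - half), min(len(prob_maps), t + half + 1)
        smoothed[t] = probs[lo:hi].mean(axis=0)
    return (smoothed > threshold).astype(np.uint8)

# A flickering frame (middle probability dips to 0.4) is stabilised by
# its temporal neighbours, so all three masks stay foreground.
frames = [np.full((2, 2), p) for p in (0.9, 0.4, 0.9)]
masks = temporally_smoothed_masks(frames)
```

Thresholding each frame independently would drop the object in the middle frame; smoothing across time is what makes the masks temporally consistent, which is the property the claim requires.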
9. A system for generating a three-dimensional (3D) projection of an object in a virtual reality or augmented reality environment, the system comprising:
a single lens camera for obtaining a sequence of images, the sequence of images being captured along a camera translation, wherein each image in the sequence of images contains at least a portion of overlapping subject matter, the subject matter including the object;
a display module;
a processor; and
memory storing one or more programs configured for execution by the processor, the one or more programs comprising instructions for:
segmenting the object from the sequence of images using a trained segmenting neural network to form a sequence of segmented object images, wherein the trained neural network is configured to aggregate a plurality of feature maps from different layers of the trained neural network in order to allow usage of both finer scale and coarser scale details to produce probability maps corresponding to the sequence of segmented object images, wherein the trained neural network is trained to label every pixel in each image in the sequence of images with a particular category label;
refining the sequence of segmented object images using fine-grained segmentation, wherein refining the sequence of segmented object images includes passing each probability map onto a temporal dense conditional random field (CRF) smoothing system to produce a binary mask for every segmented object image, wherein the binary masks are temporally consistent and sharply aligned at boundaries to each other;
applying an art-style transfer to the sequence of segmented object images using a trained transfer neural network;
computing on-the-fly interpolation parameters;
generating stereoscopic pairs from the sequence of segmented object images for displaying the object as a 3D projection in a virtual reality or augmented reality environment, the stereoscopic pairs being generated for one or more points along the camera translation; and
mapping segmented image indices to a rotation range for display in the virtual reality or augmented reality environment.
- View Dependent Claims (10, 11, 12, 13, 14, 15)
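The system claim's final step maps segmented image indices to a rotation range, so that head rotation in the viewer selects which frame to display. A linear mapping is the natural sketch; the ±60° range and rounding to the nearest frame are illustrative assumptions:

```python
def index_to_rotation(index, num_images, rotation_range=(-60.0, 60.0)):
    """Linearly map a segmented image index to a viewing angle in degrees."""
    lo, hi = rotation_range
    return lo + (hi - lo) * index / (num_images - 1)

def rotation_to_index(angle, num_images, rotation_range=(-60.0, 60.0)):
    """Inverse mapping: pick the frame to display for a given head rotation,
    clamping angles outside the rotation range to the end frames."""
    lo, hi = rotation_range
    frac = min(max((angle - lo) / (hi - lo), 0.0), 1.0)
    return round(frac * (num_images - 1))
```

At display time the inverse mapping is the one actually queried per rendered frame: the headset reports an angle, and the nearest captured (or interpolated) segmented image is shown.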
16. A non-transitory computer readable medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions for:
obtaining a sequence of images using a single lens camera, the sequence of images being captured along a camera translation, wherein each image in the sequence of images contains at least a portion of overlapping subject matter, the subject matter including the object;
segmenting the object from the sequence of images using a trained segmenting neural network to form a sequence of segmented object images, wherein the trained neural network is configured to aggregate a plurality of feature maps from different layers of the trained neural network in order to allow usage of both finer scale and coarser scale details to produce probability maps corresponding to the sequence of segmented object images, wherein the trained neural network is trained to label every pixel in each image in the sequence of images with a particular category label;
refining the sequence of segmented object images using fine-grained segmentation, wherein refining the sequence of segmented object images includes passing each probability map onto a temporal dense conditional random field (CRF) smoothing system to produce a binary mask for every segmented object image, wherein the binary masks are temporally consistent and sharply aligned at boundaries to each other;
applying an art-style transfer to the sequence of segmented object images using a trained transfer neural network;
computing on-the-fly interpolation parameters;
generating stereoscopic pairs from the sequence of segmented object images for displaying the object as a 3D projection in a virtual reality or augmented reality environment, the stereoscopic pairs being generated for one or more points along the camera translation; and
mapping segmented image indices to a rotation range for display in the virtual reality or augmented reality environment.
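The art-style transfer in the claims is applied to the segmented object images, i.e. only object pixels are stylised. Training a transfer network is out of scope here; the sketch below shows just the mask-guided compositing step, with a colour-inverting `style_fn` standing in for a trained transfer neural network (an assumption, not the patent's model):

```python
import numpy as np

def stylize_segmented(frame, mask, style_fn):
    """Run the style function on the frame, then keep styled pixels only
    where the binary mask marks the object; elsewhere keep the original."""
    styled = style_fn(frame)
    keep = mask.astype(bool)[..., None]   # broadcast mask over colour channels
    return np.where(keep, styled, frame)

# Hypothetical stand-in "style": invert the colours.
invert = lambda img: 255 - img

frame = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
out = stylize_segmented(frame, mask, invert)
# masked pixels become 55 (inverted), unmasked pixels stay 200
```

Because the binary masks from the refining step are temporally consistent, applying the style per frame through them avoids the flicker that per-frame independent stylisation of the whole image would produce at the object boundary.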
Specification