TRUNCATED SQUARE PYRAMID GEOMETRY AND FRAME PACKING STRUCTURE FOR REPRESENTING VIRTUAL REALITY VIDEO CONTENT

US 20170280126A1
Filed: 08/31/2016
Published: 09/28/2017
Est. Priority Date: 03/23/2016
Status: Active Grant

Abstract

Techniques and systems are described for mapping 360-degree video data to a truncated square pyramid shape. A 360-degree video frame can include 360 degrees' worth of pixel data, and thus be spherical in shape. By mapping the spherical video data to the planes provided by a truncated square pyramid, the total size of the 360-degree video frame can be reduced. The planes of the truncated square pyramid can be oriented such that the base of the truncated square pyramid represents a front view and the top of the truncated square pyramid represents a back view. In this way, the front view can be captured at full resolution, the back view can be captured at reduced resolution, and the left, right, up, and bottom views can be captured at decreasing resolutions. Frame packing structures can also be defined for 360-degree video data that has been mapped to a truncated square pyramid shape.
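
To make the size reduction concrete, the following is a minimal sketch of the pixel budget, not the patent's own procedure. It assumes square base and top planes and assumes that, once flattened for packing, each side plane is a trapezoid whose parallel edges match the base and top edge lengths. The function name and the cube-map comparison are illustrative choices.

```python
def tsp_pixel_budget(front_size: int, back_size: int) -> dict:
    """Pixel counts for a truncated square pyramid with square base and top.

    front_size: edge length of the base (front) plane, in pixels.
    back_size:  edge length of the top (back) plane, in pixels.
    """
    if not 0 < back_size < front_size:
        raise ValueError("back_size must be positive and smaller than front_size")
    front = front_size ** 2
    back = back_size ** 2
    # Assumption: each flattened side plane is a trapezoid with parallel edges
    # front_size and back_size and height (front_size - back_size) / 2, so the
    # four side planes together cover front_size**2 - back_size**2 pixels.
    sides = front - back
    total = front + back + sides
    cube_map = 6 * front  # six full-resolution faces, for comparison only
    return {
        "front": front,
        "back": back,
        "sides": sides,
        "total": total,
        "cube_map": cube_map,
    }
```

Under these assumptions the total is always twice the base-plane area regardless of the top-plane size, so a smaller top plane shifts pixels from the back view to the side views rather than shrinking the frame further.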

71 Citations

30 Claims

1. A method for encoding video data, comprising:
- obtaining virtual reality video data, wherein the virtual reality video data represents a 360-degree view of a virtual environment, wherein the virtual reality video data includes a plurality of frames, and wherein each frame from the plurality of frames includes corresponding spherical video data; and
  
  mapping the spherical video data for a frame from the plurality of frames onto planes of a truncated square pyramid, wherein the planes of the truncated square pyramid include a base plane, a top plane, a left-side plane, a right-side plane, an up-side plane, and a bottom-side plane, wherein a size of the top plane is less than a size of the base plane, and wherein mapping the spherical video data includes:
  
  mapping a first portion of the spherical video data onto the base plane at full resolution;
  
  mapping a second portion of the spherical video data onto the top plane at a reduced resolution;
  
  mapping a third portion of the spherical video data onto the left-side plane at a decreasing resolution;
  
  mapping a fourth portion of the spherical video data onto the right-side plane at a decreasing resolution;
  
  mapping a fifth portion of the spherical video data onto the up-side plane at a decreasing resolution; and
  
  mapping a sixth portion of the spherical video data onto the bottom-side plane at a decreasing resolution.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The method of claim 1, further comprising:
    - packing the spherical video data into a rectangular format.
  - 3. The method of claim 1, further comprising:
    - packing the spherical video data into a packing structure, wherein the packing includes:
      
      packing the third portion, the fourth portion, the fifth portion, and the sixth portion of the spherical video data around the second portion in a first data block;
      
      packing the first portion into a second data block; and
      
      packing the first data block and the second data block into the packing structure, wherein the first data block is positioned next to the second data block in the packing structure.
  - 4. The method of claim 1, further comprising:
    - packing the spherical video data into a packing structure, wherein the packing includes:
      
      packing a first half of the fifth portion, a first half of the sixth portion, and the third portion of the spherical video data around a first half of the second portion in a first data block;
      
      packing a second half of the fifth portion, a second half of the sixth portion, and the fourth portion of the spherical video data around a second half of the second portion in a second data block;
      
      packing the first portion of the spherical video data into a third data block; and
      
      packing the first data block, the second data block, and the third data block into the packing structure, wherein the first data block and the second data block are positioned next to the third data block in the packing structure.
  - 5. The method of claim 1, further comprising:
    - transmitting a first frame from the plurality of frames, wherein video data for the first frame is mapped to planes of a first truncated square pyramid; and
      
      transmitting a second frame from the plurality of frames, wherein video data for the second frame is mapped to planes of a second truncated square pyramid, and wherein the second truncated square pyramid is rotated relative to the first truncated square pyramid.
  - 6. The method of claim 1, wherein the truncated square pyramid further includes a rectangular left-side plane adjacent to the left-side plane, a rectangular right-side plane adjacent to the right-side plane, a rectangular up-side plane adjacent to the up-side plane, and a rectangular bottom-side plane adjacent to the bottom-side plane, and wherein mapping the spherical video data further includes:
    - mapping a seventh portion of the spherical video data onto the rectangular left-side plane at full resolution;
      
      mapping an eighth portion of the spherical video data onto the rectangular right-side plane at full resolution;
      
      mapping a ninth portion of the spherical video data onto the rectangular up-side plane at full resolution; and
      
      mapping a tenth portion of the spherical video data onto the rectangular bottom-side plane at full resolution.
  - 7. The method of claim 1, further comprising:
    - defining a geometry type for the truncated square pyramid, wherein the geometry type identifies a geometric shape for mapping the spherical video data to a file format;
      
      defining a height for the truncated square pyramid;
      
      defining a back width for the truncated square pyramid, wherein the back width is associated with the top plane; and
      
      defining a back height for the truncated square pyramid, wherein the back height is associated with the top plane.
  - 8. The method of claim 1, further comprising:
    - defining a virtual reality (VR) mapping type for the truncated square pyramid, wherein the VR mapping type indicates a mapping type for mapping the spherical video data to a rectangular format, and wherein the VR mapping type for the truncated square pyramid is associated with a video information box.
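
The packing structure of claim 3 can be sketched as a lookup from a packed-frame pixel to the plane stored there. This is a hedged illustration only: it assumes a frame of width 2n and height n, with the base plane in the left block and the top plane centred in the right block, surrounded by the four side planes. The function name `plane_at`, the left/right ordering of the blocks, and the orientation of the side planes are assumptions, not details taken from the claims.

```python
def plane_at(x: int, y: int, n: int, m: int) -> str:
    """Name the pyramid plane stored at pixel (x, y) of a 2n-by-n packed frame.

    n: edge length of the base plane and of each data block, in pixels.
    m: edge length of the top plane, in pixels (0 < m < n).
    """
    if not 0 < m < n:
        raise ValueError("m must be positive and smaller than n")
    if not (0 <= x < 2 * n and 0 <= y < n):
        raise ValueError("pixel lies outside the packed frame")
    if x < n:
        return "base"  # assumed: the full-resolution block sits on the left
    # Pixel-centre coordinates relative to the centre of the right-hand block.
    u = (x - n) + 0.5 - n / 2
    v = y + 0.5 - n / 2
    if abs(u) < m / 2 and abs(v) < m / 2:
        return "top"  # reduced-resolution back view in the middle
    # The block diagonals separate the four trapezoidal side planes.
    if abs(u) >= abs(v):
        return "left" if u < 0 else "right"
    return "up" if v < 0 else "bottom"
```

For example, with n = 8 and m = 4, `plane_at(12, 4, 8, 4)` falls in the centre of the right-hand block and returns `"top"`, while `plane_at(8, 4, 8, 4)` lies on that block's left edge and returns `"left"`.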

9. A device for encoding video data, comprising:
- a memory configured to store video data; and
  
  a video encoding device in communication with the memory, wherein the video encoding device is configured to:
  
  obtain virtual reality video data, wherein the virtual reality video data represents a 360-degree view of a virtual environment, wherein the virtual reality video data includes a plurality of frames, and wherein each frame from the plurality of frames includes corresponding spherical video data; and
  
  map the spherical video data for a frame from the plurality of frames onto planes of a truncated square pyramid, wherein the planes of the truncated square pyramid include a base plane, a top plane, a left-side plane, a right-side plane, an up-side plane, and a bottom-side plane, wherein a size of the top plane is less than a size of the base plane, and wherein mapping the spherical video data includes:
  
  mapping a first portion of the spherical video data onto the base plane at full resolution;
  
  mapping a second portion of the spherical video data onto the top plane at a reduced resolution;
  
  mapping a third portion of the spherical video data onto the left-side plane at a decreasing resolution;
  
  mapping a fourth portion of the spherical video data onto the right-side plane at a decreasing resolution;
  
  mapping a fifth portion of the spherical video data onto the up-side plane at a decreasing resolution; and
  
  mapping a sixth portion of the spherical video data onto the bottom-side plane at a decreasing resolution.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
  - 10. The device of claim 9, wherein the video encoding device is further configured to:
    - pack the spherical video data into a rectangular format.
  - 11. The device of claim 9, wherein the video encoding device is further configured to pack the spherical video data into a packing structure, wherein the packing includes:
    - packing the third portion, the fourth portion, the fifth portion, and the sixth portion of the spherical video data around the second portion in a first data block;
      
      packing the first portion into a second data block; and
      
      packing the first data block and the second data block into the packing structure, wherein the first data block is positioned next to the second data block in the packing structure.
  - 12. The device of claim 9, wherein the video encoding device is further configured to pack the spherical video data into a packing structure, wherein the packing includes:
    - packing a first half of the fifth portion, a first half of the sixth portion, and the third portion of the spherical video data around a first half of the second portion in a first data block;
      
      packing a second half of the fifth portion, a second half of the sixth portion, and the fourth portion of the spherical video data around a second half of the second portion in a second data block;
      
      packing the first portion of the spherical video data into a third data block; and
      
      packing the first data block, the second data block, and the third data block into the packing structure, wherein the first data block and the second data block are positioned next to the third data block in the packing structure.
  - 13. The device of claim 9, wherein the video encoding device is further configured to:
    - transmit a first frame from the plurality of frames, wherein video data for the first frame is mapped to planes of a first truncated square pyramid; and
      
      transmit a second frame from the plurality of frames, wherein video data for the second frame is mapped to planes of a second truncated square pyramid, and wherein the second truncated square pyramid is rotated relative to the first truncated square pyramid.
  - 14. The device of claim 9, wherein the truncated square pyramid further includes a rectangular left-side plane adjacent to the left-side plane, a rectangular right-side plane adjacent to the right-side plane, a rectangular up-side plane adjacent to the up-side plane, and a rectangular bottom-side plane adjacent to the bottom-side plane, and wherein mapping the spherical video data includes:
    - mapping a seventh portion of the spherical video data onto the rectangular left-side plane at full resolution;
      
      mapping an eighth portion of the spherical video data onto the rectangular right-side plane at full resolution;
      
      mapping a ninth portion of the spherical video data onto the rectangular up-side plane at full resolution; and
      
      mapping a tenth portion of the spherical video data onto the rectangular bottom-side plane at full resolution.
  - 15. The device of claim 9, wherein mapping the spherical video data for the frame onto the planes of the truncated square pyramid includes:
    - selecting video data from the spherical video data; and
      
      locating a position for the selected video data on a corresponding plane from the planes of the truncated square pyramid.
  - 16. The device of claim 9, wherein mapping the spherical video data from the frame onto the planes of the truncated square pyramid includes:
    - selecting video data from the spherical video data;
      
      downsampling the selected video data; and
      
      locating a position for the downsampled video data on a corresponding plane from the planes of the truncated square pyramid.
  - 17. The device of claim 9, wherein the video encoding device is further configured to:
    - define a geometry type for the truncated square pyramid, wherein the geometry type identifies a geometric shape for mapping the spherical video data to a file format;
      
      define a height for the truncated square pyramid;
      
      define a back width for the truncated square pyramid, wherein the back width is associated with the top plane; and
      
      define a back height for the truncated square pyramid, wherein the back height is associated with the top plane.
  - 18. The device of claim 17, wherein the video encoding device is further configured to:
    - define a surface identifier, wherein the surface identifier identifies a plane of the truncated square pyramid;
      
      define a top-left horizontal coordinate for each plane of the truncated square pyramid, wherein the top-left horizontal coordinate indicates a horizontal location of a top-left corner of the plane within a packing structure, and wherein the packing structure is used to map the spherical video data to the file format;
      
      define a top-left vertical coordinate for each plane of the truncated square pyramid, wherein the top-left vertical coordinate indicates a vertical coordinate of the top-left corner of the plane within the packing structure;
      
      define an area width for each plane of the truncated square pyramid, wherein the area width is associated with a width of the plane; and
      
      define an area height for each plane of the truncated square pyramid, wherein the area height is associated with a height of the plane.
  - 19. The device of claim 9, wherein the video encoding device is further configured to:
    - define a virtual reality (VR) mapping type for the truncated square pyramid, wherein the VR mapping type indicates a mapping type for mapping the spherical video data to a rectangular format, and wherein the VR mapping type for the truncated square pyramid is associated with a video information box.
  - 20. The device of claim 19, wherein the video information box includes:
    - a depth indicating a depth of the truncated square pyramid;
      
      a back width indicating a width of the top plane;
      
      a back height indicating a height of the top plane;
      
      a region identifier identifying a plane from the planes of the truncated square pyramid;
      
      a center pitch indicating a pitch angle of a coordinate of a point to which a center pixel of the spherical video data is rendered;
      
      a center yaw indicating a yaw angle of the coordinate of the point to which the center pixel of the spherical video data is rendered;
      
      a center pitch offset indicating an offset value of the pitch angle of the coordinate of the point to which the center pixel of the spherical video data is rendered;
      
      a center yaw offset indicating an offset value of the yaw angle of the coordinate of the point to which the center pixel of the spherical video data is rendered;
      
      a top-left horizontal coordinate indicating a horizontal coordinate of a top-left corner of the plane;
      
      a top-left vertical coordinate indicating a vertical coordinate of the top-left corner of the plane;
      
      a region width indicating a width of the plane; and
      
      a region height indicating a height of the plane.
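
The fields listed in claim 20 can be pictured as a small record, sketched below. All class names, field names, and numeric types here are hypothetical; the claim names the fields but does not fix their encoding, units, or grouping. Collecting the per-plane fields into one `TspRegion` record per plane is likewise an assumption made for illustration, as is the consistency check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TspRegion:
    """Per-plane placement inside the packed frame (hypothetical layout)."""
    region_id: int
    top_left_x: int
    top_left_y: int
    width: int
    height: int


@dataclass(frozen=True)
class TspVideoInfo:
    """Fields named in claim 20, with assumed types."""
    depth: int
    back_width: int
    back_height: int
    center_pitch: float
    center_yaw: float
    center_pitch_offset: float
    center_yaw_offset: float
    regions: tuple


def check_regions(info: TspVideoInfo, frame_width: int, frame_height: int) -> list:
    """Return a list of problems; an empty list means the record is consistent."""
    problems = []
    if info.back_width <= 0 or info.back_height <= 0:
        problems.append("back plane must have positive width and height")
    seen = set()
    for r in info.regions:
        if r.region_id in seen:
            problems.append(f"duplicate region id {r.region_id}")
        seen.add(r.region_id)
        if r.width <= 0 or r.height <= 0:
            problems.append(f"region {r.region_id} has an empty area")
        elif (r.top_left_x < 0 or r.top_left_y < 0
              or r.top_left_x + r.width > frame_width
              or r.top_left_y + r.height > frame_height):
            problems.append(f"region {r.region_id} extends outside the frame")
    return problems
```

A writer could run such a check before serializing the record, and a reader could run it before trusting the coordinates for display.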

21. A method for decoding video data, comprising:
- obtaining a frame of virtual reality video data, wherein the virtual reality video data represents a 360-degree view of a virtual environment, wherein the frame has a rectangular format;
  
  identifying a frame packing structure for the frame, wherein the frame packing structure provides positions for video data in the frame, wherein the frame packing structure includes planes of a truncated square pyramid, wherein the planes of the truncated square pyramid include a base plane, a top plane, a left-side plane, a right-side plane, an up-side plane, and a bottom-side plane, and wherein a size of the top plane is less than a size of the base plane; and
  
  displaying the frame using the frame packing structure.
- Dependent claims (22, 23, 24)
  - 22. The method of claim 21, wherein the frame packing structure further includes a rectangular left-side plane adjacent to the left-side plane, a rectangular right-side plane adjacent to the right-side plane, a rectangular up-side plane adjacent to the up-side plane, and a rectangular bottom-side plane adjacent to the bottom-side plane.
  - 23. The method of claim 21, further comprising:
    - determining a geometry type for the frame, wherein the geometry type identifies a geometric shape for mapping the virtual reality video data to a file format;
      
      determining a height for the truncated square pyramid based on the geometry type;
      
      determining a back width for the truncated square pyramid using the geometry type, wherein the back width is associated with the top plane; and
      
      determining a back height for the truncated square pyramid using the geometry type, wherein the back height is associated with the top plane.
  - 24. The method of claim 21, further comprising:
    - identifying a virtual reality (VR) mapping type, wherein the VR mapping type indicates a mapping type for mapping the virtual reality video data to a rectangular format, wherein the VR mapping type identifies the truncated square pyramid, and wherein the VR mapping type is associated with a video information box.

25. A device for decoding video data, comprising:
- a memory configured to store the video data; and
  
  a video decoding device in communication with the memory, wherein the video decoding device is configured to:
  
  obtain a frame of virtual reality video data, wherein the virtual reality video data represents a 360-degree view of a virtual environment, wherein the frame has a rectangular format;
  
  identify a frame packing structure for the frame, wherein the frame packing structure provides positions for video data in the frame, wherein the frame packing structure includes planes of a truncated square pyramid, wherein the planes of the truncated square pyramid include a base plane, a top plane, a left-side plane, a right-side plane, an up-side plane, and a bottom-side plane, wherein a size of the top plane is less than a size of the base plane; and
  
  display the frame using the frame packing structure.
- Dependent claims (26, 27, 28, 29, 30)
  - 26. The device of claim 25, wherein displaying the frame includes:
    - providing a first portion of the video data in the frame as a front view, wherein the first portion of the video data corresponds to the base plane, and wherein the first portion of the video data is at full resolution;
      
      providing a second portion of the video data in the frame as a back view, wherein the second portion of the video data corresponds to the top plane, and wherein the second portion of the video data is at a reduced resolution;
      
      providing a third portion of the video data in the frame as a left view, wherein the third portion of the video data corresponds to the left-side plane, and wherein the third portion of the video data is at a decreasing resolution;
      
      providing a fourth portion of the video data in the frame as a right view, wherein the fourth portion of the video data corresponds to the right-side plane, and wherein the fourth portion of the video data is at a decreasing resolution;
      
      providing a fifth portion of the video data in the frame as an up view, wherein the fifth portion of the video data corresponds to the up-side plane, and wherein the fifth portion of the video data is at a decreasing resolution; and
      
      providing a sixth portion of the video data in the frame as a bottom view, wherein the sixth portion of the video data corresponds to the bottom-side plane, and wherein the sixth portion of the video data is at a decreasing resolution.
  - 27. The device of claim 25, wherein the video decoding device is further configured to:
    - receive a second frame of virtual reality data, wherein the second frame is rotated relative to the frame; and
      
      display the second frame using the frame packing structure.
  - 28. The device of claim 25, wherein the frame packing structure further includes a rectangular left-side plane adjacent to the left-side plane, a rectangular right-side plane adjacent to the right-side plane, a rectangular up-side plane adjacent to the up-side plane, and a rectangular bottom-side plane adjacent to the bottom-side plane.
  - 29. The device of claim 25, wherein the decoding device is further configured to:
    - determine a geometry type for the frame, wherein the geometry type identifies a geometric shape for mapping the virtual reality video data to a file format;
      
      determine a height for the truncated square pyramid based on the geometry type;
      
      determine a back width for the truncated square pyramid using the geometry type, wherein the back width is associated with the top plane; and
      
      determine a back height for the truncated square pyramid using the geometry type, wherein the back height is associated with the top plane.
  - 30. The device of claim 25, wherein the decoding device is further configured to:
    - identify a virtual reality (VR) mapping type, wherein the VR mapping type indicates a mapping type for mapping the virtual reality video data to a rectangular format, wherein the VR mapping type identifies the truncated square pyramid, and wherein the VR mapping type is associated with a video information box.
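
On the decoding side, a side plane stored at decreasing resolution has to be stretched back out before display. The sketch below illustrates one way this could look for the up-side plane, using nearest-neighbour sampling. It assumes the data block is an n-by-n grid with the top plane centred in it and the up-side plane occupying the trapezoid along the block's upper edge; the function name, the orientation, and the choice of nearest-neighbour resampling are assumptions, and vertical stretching is left out.

```python
def unpack_up_plane(block, back_size):
    """Stretch each stored row of the up-side trapezoid to the full block width.

    block:     n-by-n list of rows holding the packed side and top planes.
    back_size: edge length of the centred top plane, in pixels.
    Returns (n - back_size) // 2 rows, each n pixels wide.
    """
    n = len(block)
    if any(len(row) != n for row in block):
        raise ValueError("block must be square")
    if not 0 < back_size < n or (n - back_size) % 2:
        raise ValueError("back_size must be in (0, n) with n - back_size even")
    rows = (n - back_size) // 2
    out = []
    for r in range(rows):
        # Row r of the trapezoid stores columns r .. n-1-r, so its stored
        # width shrinks by two pixels per row as it approaches the top plane.
        width = n - 2 * r
        out.append([block[r][r + (x * width) // n] for x in range(n)])
    return out
```

Rows nearest the block edge are copied unchanged, while rows nearer the top plane repeat pixels, which reflects the decreasing resolution recited for the side planes.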

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Van der Auwera, Geert; Coban, Muhammed; Karczewicz, Marta

Granted Patent: US 10,319,071 B2

CPC Class Codes

G06T 15/00   3D [Three Dimensional] imag...

G06T 19/00   Manipulating 3D models or i...

G06T 2207/10016   Video; Image sequence

G06T 2207/20008   Globally adaptive

G06T 2207/20024   Filtering details

G06T 3/06   Topological mapping of high...

G06T 3/067   Reshaping or unfolding 3D t...

G06T 3/08   Projecting images onto non-...

G06T 3/10   Selection of transformation...

G06T 3/12   Panospheric to cylindrical ...

G06T 3/18   Image warping, e.g. rearran...

G06T 3/4038   Image mosaicing, e.g. compo...

H04N 13/117   the virtual viewpoint locat...

H04N 13/122   Improving the 3D impression...

H04N 13/161   Encoding, multiplexing or d...

H04N 13/279   the virtual viewpoint locat...

H04N 2213/006   Pseudo-stereoscopic systems...
