Probabilistic model to compress images for three-dimensional video

US 10,440,398 B2
Filed: 06/08/2017
Issued: 10/08/2019
Est. Priority Date: 07/28/2014
Status: Active Grant



Abstract

A method includes receiving head-tracking data that describe one or more positions of people while the people are viewing a three-dimensional video. The method further includes generating a probabilistic model of the one or more positions of the people based on the head-tracking data, wherein the probabilistic model identifies a probability of a viewer looking in a particular direction as a function of time. The method further includes generating video segments from the three-dimensional video. The method further includes, for each of the video segments: determining a directional encoding format that projects latitudes and longitudes of locations of a surface of a sphere onto locations on a plane, determining a cost function that identifies a region of interest on the plane based on the probabilistic model, and generating optimal segment parameters that minimize a sum-over position for the region of interest.
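The abstract describes a probabilistic model that gives the probability of a viewer looking in a particular direction as a function of time. A minimal sketch of one way such a model could be built is shown here, assuming head-tracking samples arrive as `(time, yaw, pitch)` tuples in seconds and degrees. The function name `build_heat_map`, the grid size, and the one-second time bins are hypothetical choices for illustration and do not come from the patent.

```python
from collections import defaultdict


def build_heat_map(samples, bin_seconds=1.0, cols=8, rows=4):
    """Estimate gaze probabilities per time bin from head-tracking samples.

    samples: iterable of (t_seconds, yaw_deg, pitch_deg), with yaw in
    [-180, 180] and pitch in [-90, 90] (an assumed input layout).
    Returns {time_bin: rows x cols grid of probabilities summing to 1}.
    """
    counts = defaultdict(lambda: [[0] * cols for _ in range(rows)])
    for t, yaw, pitch in samples:
        time_bin = int(t // bin_seconds)
        # Equirectangular binning: longitude maps to a column,
        # latitude maps to a row (row 0 is the top of the sphere).
        col = min(int((yaw + 180.0) * cols / 360.0), cols - 1)
        row = min(int((90.0 - pitch) * rows / 180.0), rows - 1)
        counts[time_bin][row][col] += 1

    model = {}
    for time_bin, grid in counts.items():
        total = sum(sum(r) for r in grid)
        # Normalize counts so each time bin is a probability distribution.
        model[time_bin] = [[n / total for n in r] for r in grid]
    return model


if __name__ == "__main__":
    demo = [(0.2, 0, 0), (0.5, 0, 0), (0.7, 90, 0), (1.3, -170, 45)]
    for time_bin, grid in sorted(build_heat_map(demo).items()):
        print(time_bin, grid)
```

Each time bin holds its own grid, so the probability for a given direction can change over the course of the video.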


20 Claims

1. A method comprising:
- receiving head-tracking data that describe one or more positions of people while the people are viewing a three-dimensional video;
  
  generating a probabilistic model of the one or more positions of the people based on the head-tracking data, wherein the probabilistic model identifies a probability of a viewer looking in a particular direction as a function of time;
  
  generating video segments from the three-dimensional video based on scene boundaries in the three-dimensional video;
  
  for each of the video segments:
  
  - determining a directional encoding format that projects latitudes and longitudes of locations of a surface of a sphere onto locations on a plane;
    
    determining a cost function that identifies a region of interest on the plane based on the probabilistic model; and
    
    generating optimal segment parameters that minimize a sum-over position for the region of interest;
  
  re-encoding the three-dimensional video to include the optimal segment parameters for each of the video segments and to blur portions of each of the video segments based on the probability, wherein an intensity of a level of blur increases as the probability of the viewer looking in the particular direction decreases; and
  
  providing a re-encoded video and the optimal segment parameters for each of the video segments to a viewing device, wherein the viewing device uses the optimal segment parameters for each of the video segments to un-distort the re-encoded video and texture the re-encoded video to the sphere to display the re-encoded video with the region of interest for each of the video segments displayed at a higher resolution than other regions in each of the video segments.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The method of claim 1, wherein the probabilistic model is a heat map.
  - 3. The method of claim 1, wherein the re-encoded video displays virtual reality content, augmented reality content, mixed reality content, or XR content.
  - 4. The method of claim 1, wherein re-encoding the three-dimensional video to include the optimal segment parameters for each of the video segments and blurring portions of each of the video segments occurs responsive to a threshold number of the people viewing the three-dimensional video.
  - 5. The method of claim 1, further comprising:
    - for each of the video segments, identifying a region of low interest;
      
      wherein re-encoding the three-dimensional video to include the optimal segment parameters for each of the video segments and blurring portions of each of the video segments is based on low interest; and
      
      wherein the region of low interest is displayed at a lower resolution than other regions in each of the video segments.
  - 6. The method of claim 1, further comprising:
    - providing the re-encoded video and the optimal segment parameters for each of the video segments to a client device, wherein the client device uses the re-encoded video and the optimal segment parameters for each of the video segments to generate a two-dimensional video that automates head movement.
  - 7. The method of claim 6, further comprising:
    - providing a user with an option to modify the two-dimensional video by at least one of selecting different optimal segment parameters and selecting a different region of interest for one or more of the video segments.
  - 8. The method of claim 1, further comprising:
    - cropping the region of interest for one or more video segments based on the optimal segment parameters to form one or more thumbnails of one or more cropped regions of interest; and
      
      generating a timeline of the three-dimensional video with the one or more thumbnails.
  - 9. The method of claim 1, wherein the optimal segment parameters indicate how biased the encoding is towards each high-resolution region.
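Claim 1 recites a directional encoding format that projects latitudes and longitudes on a sphere onto a plane, and claim 8 recites cropping the region of interest into thumbnails. The sketch below illustrates both ideas with a plain equirectangular projection, which is an assumption: the claims do not name a specific projection. The function names, the field-of-view parameters, and the frame size are hypothetical.

```python
def sphere_to_plane(lon_deg, lat_deg, width, height):
    """Equirectangular projection of a sphere location to plane coordinates.

    lon_deg in [-180, 180] maps to x in [0, width];
    lat_deg in [-90, 90] maps to y in [0, height], with +90 at the top.
    """
    x = (lon_deg + 180.0) * width / 360.0
    y = (90.0 - lat_deg) * height / 180.0
    return x, y


def crop_box(center_lon, center_lat, fov_lon, fov_lat, width, height):
    """Pixel rectangle (left, top, right, bottom) around a region of interest.

    The box is clamped to the frame. Wrap-around at the +/-180 degree seam
    is not handled in this sketch.
    """
    left, top = sphere_to_plane(
        center_lon - fov_lon / 2.0, center_lat + fov_lat / 2.0, width, height
    )
    right, bottom = sphere_to_plane(
        center_lon + fov_lon / 2.0, center_lat - fov_lat / 2.0, width, height
    )
    return (
        round(max(left, 0.0)),
        round(max(top, 0.0)),
        round(min(right, width)),
        round(min(bottom, height)),
    )


if __name__ == "__main__":
    # Thumbnail box for a 90 x 60 degree region centred straight ahead.
    print(crop_box(0.0, 0.0, 90.0, 60.0, 3840, 1920))
```

A timeline could then be assembled by computing one such box per video segment and cropping a representative frame to it.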

10. A system comprising:
- one or more processors coupled to a memory;
  
  a head tracking module stored in the memory and executable by the one or more processors, the head tracking module operable to receive head-tracking data that describe one or more positions of people while the people are viewing a set of three-dimensional videos, generate a set of probabilistic models of the one or more positions of the people based on the head-tracking data, and estimate a first probabilistic model for a first three-dimensional video, wherein the first probabilistic model identifies a probability of a viewer looking in a particular direction as a function of time and the first three-dimensional video is not part of the set of three-dimensional videos;
  
  a segmentation module stored in the memory and executable by the one or more processors, the segmentation module operable to generate video segments from the three-dimensional video, wherein the video segments are of equal length and of a predetermined length;
  
  a parameterization module stored in the memory and executable by the one or more processors, the parameterization module operable to, for each of the video segments:
  
  - determine a directional encoding format that projects latitudes and longitudes of locations of a surface of a sphere onto locations on a plane;
    
    determine a cost function that identifies a region of interest on the plane based on the first probabilistic model; and
    
    generate optimal segment parameters that minimize a sum-over position for the region of interest; and
  
  an encoder module stored in the memory and executable by the one or more processors, the encoder module operable to:
  
  - re-encode the three-dimensional video to include the optimal segment parameters for each of the video segments and to blur portions of each of the video segments based on the probability, wherein an intensity of a level of blur increases as the probability of the viewer looking in the particular direction decreases; and
    
    provide a re-encoded video and the optimal segment parameters for each of the video segments to a client device, wherein the client device uses the re-encoded video and the optimal segment parameters for each of the video segments to generate a two-dimensional video that automates head movement.
- Dependent claims (11, 12, 13, 14, 15)
  - 11. The system of claim 10, wherein the first probabilistic model is a heat map.
  - 12. The system of claim 10, wherein the encoder module is further operable to provide a recommendation of where to look by blurring every portion of the re-encoded video except for a location in the re-encoded video where a viewer is recommended to look.
  - 13. The system of claim 10, wherein re-encoding the three-dimensional video to include the optimal segment parameters for each of the video segments and blurring portions of each of the video segments occurs responsive to a threshold number of the people viewing the three-dimensional video.
  - 14. The system of claim 10, further comprising a user interface module stored in the memory and executable by the one or more processors, the user interface module operable to crop the region of interest for one or more video segments based on the optimal segment parameters to form one or more thumbnails of one or more cropped regions of interest and generate a timeline of the three-dimensional video with the one or more thumbnails.
  - 15. The system of claim 10, further comprising a user interface module stored in the memory and executable by the one or more processors, the user interface module operable to generate a two-dimensional video from the three-dimensional video based on the optimal segment parameters that depicts head tracking movement as automatic panning within the two-dimensional video.
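Claim 10 recites video segments of equal, predetermined length and per-segment parameters that minimize a sum over positions, and claim 15 recites a two-dimensional video that depicts head movement as automatic panning. The sketch below shows one possible reading under stated assumptions: the cost is taken to be the probability-weighted squared angular distance to a candidate yaw, which the claims do not specify, and the final segment is truncated at the end of the video. All function names are hypothetical.

```python
import math


def equal_segments(duration, segment_length):
    """Split [0, duration) into segments of a predetermined length."""
    count = math.ceil(duration / segment_length)
    return [
        (i * segment_length, min((i + 1) * segment_length, duration))
        for i in range(count)
    ]


def angular_distance(yaw_a, yaw_b):
    """Shortest distance in degrees between two yaw angles (wraps at 360)."""
    d = abs(yaw_a - yaw_b) % 360.0
    return min(d, 360.0 - d)


def best_yaw(yaw_probs, candidates):
    """Pick the candidate yaw with the lowest expected squared distance.

    yaw_probs: {yaw_deg: probability} for one segment (assumed layout).
    The cost sums over every observed position, weighted by probability.
    """
    def cost(candidate):
        return sum(
            p * angular_distance(candidate, yaw) ** 2
            for yaw, p in yaw_probs.items()
        )
    return min(candidates, key=cost)


def pan_track(segments, per_segment_probs, candidates):
    """One (start, end, yaw) entry per segment for automated panning."""
    return [
        (start, end, best_yaw(per_segment_probs[i], candidates))
        for i, (start, end) in enumerate(segments)
    ]


if __name__ == "__main__":
    segs = equal_segments(10.0, 4.0)
    probs = [{0: 0.7, 90: 0.3}, {90: 1.0}, {350: 0.5, 10: 0.5}]
    print(pan_track(segs, probs, [0, 90, 180, 270]))
```

A client could render the two-dimensional view by pointing a virtual camera at the chosen yaw for each segment, so the panning follows where viewers were most likely to look.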

16. A non-transitory computer storage medium encoded with a computer program, the computer program comprising instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising:
- receiving head-tracking data that describe one or more positions of people while the people are viewing a three-dimensional video;
  
  generating a probabilistic model of the one or more positions of the people based on the head-tracking data, wherein the probabilistic model identifies a probability of a viewer looking in a particular direction as a function of time;
  
  generating video segments from the three-dimensional video based on scene boundaries in the three-dimensional video;
  
  for each of the video segments:
  
  - determining a directional encoding format that projects latitudes and longitudes of locations of a surface of a sphere onto locations on a plane;
    
    determining a cost function that identifies a region of interest on the plane based on the probabilistic model;
    
    generating optimal segment parameters that minimize a sum-over position for the region of interest; and
    
    identifying a region of low interest;
  
  re-encoding the three-dimensional video to include the optimal segment parameters for each of the video segments and blurring of the region of low interest; and
  
  providing a re-encoded video and the optimal segment parameters for each of the video segments to a viewing device, wherein the viewing device uses the optimal segment parameters for each of the video segments to un-distort the re-encoded video and texture the re-encoded video to the sphere to display the re-encoded video with the region of interest for each of the video segments displayed at a higher resolution than other regions in each of the video segments and the region of low interest displayed at a lower resolution than other regions in each of the video segments.
- Dependent claims (17, 18, 19, 20)
  - 17. The computer storage medium of claim 16, wherein the probabilistic model is a heat map.
  - 18. The computer storage medium of claim 16, wherein:
    - an intensity of a level of blur increases as the probability of the viewer looking in the particular direction decreases; and
      
      the region of interest for each of the video segments is displayed at a higher resolution than other regions in each of the video segments.
  - 19. The computer storage medium of claim 18, wherein re-encoding the three-dimensional video to include the optimal segment parameters for each of the video segments and blurring portions of each of the video segments occurs responsive to a threshold number of the people viewing the three-dimensional video.
  - 20. The computer storage medium of claim 16, wherein the operations further comprise:
    - providing a re-encoded video and the optimal segment parameters for each of the video segments to a client device, wherein the client device uses the re-encoded video and the optimal segment parameters for each of the video segments to generate a two-dimensional video that automates head movement.
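Claims 16 and 18 recite identifying a region of low interest and a blur whose intensity increases as the probability of a viewer looking in a direction decreases. A minimal sketch of that relationship follows, assuming the probabilistic model is available as a grid of probabilities for one segment. The linear mapping, the threshold value, and the maximum blur radius are illustrative assumptions and are not taken from the patent.

```python
def low_interest_mask(heat_grid, threshold=0.05):
    """Mark cells whose gaze probability falls below the threshold."""
    return [[p < threshold for p in row] for row in heat_grid]


def blur_radius(probability, peak, max_radius=8.0):
    """Blur radius that grows as the gaze probability shrinks.

    The probability is scaled by the peak of the grid so the most
    watched cell receives no blur. The mapping is linear by assumption.
    """
    if peak <= 0.0:
        return max_radius
    relative = min(max(probability / peak, 0.0), 1.0)
    return max_radius * (1.0 - relative)


def blur_plan(heat_grid, threshold=0.05, max_radius=8.0):
    """Per-cell blur radii: full blur in the low-interest region,
    probability-dependent blur elsewhere."""
    peak = max(max(row) for row in heat_grid)
    mask = low_interest_mask(heat_grid, threshold)
    return [
        [
            max_radius if mask[r][c] else blur_radius(p, peak, max_radius)
            for c, p in enumerate(row)
        ]
        for r, row in enumerate(heat_grid)
    ]


if __name__ == "__main__":
    grid = [[0.5, 0.25], [0.25, 0.0]]
    print(low_interest_mask(grid))
    print(blur_plan(grid))
```

An encoder could pass these radii to whatever blur filter it uses before compression, since blurred regions generally compress to fewer bits.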


Current Assignee
Verizon Patent and Licensing Incorporated (Verizon Communications Inc.)
Original Assignee
Jaunt Inc. (Verizon Communications Inc.)
Inventors
Walkingshaw, Andrew; van Hoff, Arthur; Kopeinigg, Daniel
Primary Examiner(s)
Chio, Tat C

Application Number
US15/617,878
Publication Number
US 20170280166A1
Time in Patent Office
852 Days
CPC Class Codes

H04N 13/139   Format conversion, e.g. of ...

H04N 13/366   using viewer tracking

H04N 13/368   for two or more viewers

H04N 19/162   User input

H04N 19/167   Position within a video ima...

H04N 19/17   the unit being an image reg...

H04N 19/597   specially adapted for multi...

H04N 19/86   involving reduction of codi...
