Method and system for processing multiview videos for view synthesis using skip and direct modes
First Claim
1. A method for processing a plurality of multiview videos of a scene, in which each video is acquired by a corresponding camera arranged at a particular pose, and in which a view of each camera overlaps with the view of at least one other camera, comprising the steps of:
obtaining side information for synthesizing a particular view of the multiview videos;
synthesizing a synthesized reference picture from at least one input video selected from the plurality of multiview videos and the side information, wherein the synthesized reference picture corresponds to a single pose different than the input video;
maintaining a reference picture list for each current frame of each of the plurality of multiview videos, wherein the reference picture list indexes temporal reference pictures and spatial reference pictures of the plurality of multiview videos and the synthesized reference picture and wherein the temporal reference pictures are associated with different time instants and the spatial reference pictures are associated with a same time instant; and
predicting each current frame corresponding to the single pose of the synthesized reference picture according to reference pictures indexed by the associated reference picture list, wherein the predicting uses a synthetic skip mode and a synthetic direct mode based on the synthesized reference picture, and wherein the side information is inferred from an earliest synthesized reference picture in the reference picture list.
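The reference picture list recited in claim 1 can be pictured as an ordered list whose entries are tagged by view and time instant. The following is a minimal sketch only, not the patented implementation: the names `RefPicture`, `classify`, and `earliest_synthesized` are hypothetical, and reading "earliest" as "first in list order" is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RefPicture:
    """Hypothetical reference picture list entry (illustrative only)."""
    view: int                  # index of the camera pose the picture corresponds to
    time: int                  # time instant of the picture
    synthesized: bool = False  # True if produced by view synthesis


def classify(ref: RefPicture, cur_view: int, cur_time: int) -> str:
    """Label an entry relative to the current frame, following the claim wording:
    temporal references differ in time instant, spatial references share it."""
    if ref.synthesized:
        return "synthesized"
    if ref.view == cur_view and ref.time != cur_time:
        return "temporal"
    if ref.view != cur_view and ref.time == cur_time:
        return "spatial"
    return "other"


def earliest_synthesized(ref_list: List[RefPicture]) -> Optional[Tuple[int, RefPicture]]:
    """Return (index, entry) of the first synthesized picture in list order.
    Assumption: 'earliest' means lowest index in the reference picture list."""
    for idx, ref in enumerate(ref_list):
        if ref.synthesized:
            return idx, ref
    return None


# Example list for a current frame at view 1, time 4.
ref_list = [
    RefPicture(view=1, time=3),                    # same view, earlier time
    RefPicture(view=0, time=4),                    # neighboring view, same time
    RefPicture(view=1, time=4, synthesized=True),  # synthesized for view 1
    RefPicture(view=1, time=4, synthesized=True),  # a second synthesized entry
]
labels = [classify(r, cur_view=1, cur_time=4) for r in ref_list]
first_synth = earliest_synthesized(ref_list)
```

In this sketch the side information for the synthetic skip and direct modes would be taken from the entry returned by `earliest_synthesized`, so encoder and decoder can agree on it without signaling an index.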
Abstract
A method processes multiview videos of a scene, in which each video is acquired by a corresponding camera arranged at a particular pose, and in which a view of each camera overlaps with the view of at least one other camera. Side information for synthesizing a particular view of the multiview videos is obtained in either an encoder or a decoder. A synthesized multiview video is synthesized from the multiview videos and the side information. A reference picture list is maintained for each current frame of each of the multiview videos; the list indexes temporal reference pictures and spatial reference pictures of the acquired multiview videos, as well as the synthesized reference pictures of the synthesized multiview video. Each current frame of the multiview videos is predicted, using a skip mode and a direct mode, according to the reference pictures indexed by the associated reference picture list, whereby the side information is inferred from the synthesized reference picture.
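The difference between the two prediction modes can be illustrated at block level. The sketch below rests on an assumption not stated in the abstract: by analogy with the conventional skip and direct modes of H.264/AVC, the synthetic skip mode is taken to copy the co-located block of the synthesized reference picture with no residual, and the synthetic direct mode to add a decoded residual to that block. The function names, the 8-bit sample range, and the list-of-lists picture layout are all hypothetical.

```python
from typing import List, Optional

Picture = List[List[int]]  # rows of 8-bit luma samples (assumed layout)


def get_block(picture: Picture, top: int, left: int, size: int) -> Picture:
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def predict_block(synth_ref: Picture, top: int, left: int, size: int,
                  mode: str, residual: Optional[Picture] = None) -> Picture:
    """Reconstruct one block from a synthesized reference picture.

    Assumed semantics (illustrative, not taken from the patent text):
      - "synthetic_skip":   output is the co-located synthesized block.
      - "synthetic_direct": output is that block plus a residual,
                            clipped to the 8-bit range [0, 255].
    """
    pred = get_block(synth_ref, top, left, size)
    if mode == "synthetic_skip":
        return pred
    if mode == "synthetic_direct":
        if residual is None:
            raise ValueError("synthetic_direct requires a residual block")
        return [
            [min(255, max(0, p + r)) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred, residual)
        ]
    raise ValueError(f"unknown mode: {mode}")


# A 4x4 synthesized reference picture and a 2x2 block at row 1, column 1.
synth = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
skip_block = predict_block(synth, 1, 1, 2, "synthetic_skip")
direct_block = predict_block(synth, 1, 1, 2, "synthetic_direct",
                             residual=[[1, -1], [200, -200]])
```

In neither mode does the sketch take side information as an input, which mirrors the abstract's statement that the side information is inferred from the synthesized reference picture rather than supplied separately.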
20 Claims
Claims 2 to 19 depend on claim 1, which is recited in full under First Claim above.
20. A system for processing a plurality of multiview videos of a scene, comprising:
a plurality of cameras, each camera configured to acquire a multiview video of a scene, each camera arranged at a particular pose, and in which a view of each camera overlaps with the view of at least one other camera;
means for obtaining side information for synthesizing a particular view of the multiview videos;
means for synthesizing a synthesized reference picture from at least one input video selected from the plurality of multiview videos and the side information, wherein the synthesized reference picture corresponds to a single pose different than the input video;
a memory buffer configured to maintain a reference picture list for each current frame of each of the plurality of multiview videos, wherein the reference picture list indexes temporal reference pictures and spatial reference pictures of the plurality of multiview videos and the synthesized reference picture; and
means for predicting each current frame corresponding to the single pose of the synthesized reference picture according to reference pictures indexed by the associated reference picture list, wherein the predicting uses a synthetic skip mode and a synthetic direct mode, and wherein the side information is inferred from an earliest synthesized reference picture in the reference picture list.
Specification