Method and System for Processing Multiview Videos for View Synthesis Using Skip and Direct Modes
First Claim
1. A method for processing a plurality of multiview videos of a scene, in which each video is acquired by a corresponding camera arranged at a particular pose, and in which a view of each camera overlaps with the view of at least one other camera, comprising the steps of:
obtaining side information for synthesizing a particular view of multiview video;
synthesizing a synthesized multiview video from the plurality of multiview videos and the side information;
maintaining a reference picture list for each current frame of each of the plurality of multiview videos, the reference picture indexing temporal reference pictures and spatial reference pictures of the plurality of acquired multiview videos and the synthesized reference pictures of the synthesized multiview video; and
predicting each current frame of the plurality of multiview videos according to reference pictures indexed by the associated reference picture list with an adaptive-reference skip mode or an adaptive-reference direct mode, wherein the adaptive-reference skip mode and adaptive-reference direct mode use one of the plurality of reference pictures.
Abstract
Multiview videos are acquired by cameras with overlapping views. Side information is used to synthesize multiview videos. A reference picture list is maintained for each current frame of the multiview videos; the list indexes temporal reference pictures and spatial reference pictures of the acquired multiview videos and the synthesized reference pictures of the synthesized multiview video. Each current frame of the multiview videos is predicted according to reference pictures indexed by the associated reference picture list with a skip mode and a direct mode, whereby the side information is inferred from the synthesized reference picture. Alternatively, depth images corresponding to the multiview videos are part of the input data, and these data are encoded as part of the bitstream depending on a SKIP type.
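As an illustration only, the reference picture list described in the abstract can be sketched as a single indexed list holding temporal, spatial, and synthesized reference pictures, with a skip-style predictor that uses exactly one of them. Every name here (RefKind, RefPicture, build_reference_list, first_of_kind, predict_block_skip, the use_synthesized flag) is hypothetical and does not come from the patent. The list ordering and the zero-motion block copy are simplifying assumptions, not the claimed method.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RefKind(Enum):
    TEMPORAL = "temporal"        # same view, different time instant
    SPATIAL = "spatial"          # neighboring view, same time instant
    SYNTHESIZED = "synthesized"  # rendered from other views plus side information


@dataclass
class RefPicture:
    kind: RefKind
    view: int
    time: int
    pixels: List[List[int]]  # toy sample array, rows of integer samples


def build_reference_list(temporal, spatial, synthesized):
    """Concatenate the three kinds into one indexed list.

    The ordering (temporal, spatial, synthesized) is an assumption made
    for this sketch only.
    """
    return list(temporal) + list(spatial) + list(synthesized)


def first_of_kind(ref_list, want_synthesized):
    """Return the index of the first synthesized (or first non-synthesized) entry."""
    for idx, ref in enumerate(ref_list):
        if (ref.kind is RefKind.SYNTHESIZED) == want_synthesized:
            return idx
    raise LookupError("no matching reference picture in list")


def predict_block_skip(ref_list, use_synthesized, row, col, size):
    """Skip-style prediction: copy the co-located block from one chosen
    reference picture, with no residual. Zero motion is a simplification."""
    ref = ref_list[first_of_kind(ref_list, use_synthesized)]
    return [r[col:col + size] for r in ref.pixels[row:row + size]]


def flat(value, n=4):
    """Helper for the demo: an n-by-n picture filled with one sample value."""
    return [[value] * n for _ in range(n)]


ref_list = build_reference_list(
    [RefPicture(RefKind.TEMPORAL, view=1, time=0, pixels=flat(10))],
    [RefPicture(RefKind.SPATIAL, view=0, time=1, pixels=flat(20))],
    [RefPicture(RefKind.SYNTHESIZED, view=1, time=1, pixels=flat(30))],
)
block = predict_block_skip(ref_list, use_synthesized=True, row=0, col=0, size=2)
```

In this toy setup, switching use_synthesized between True and False changes which single reference picture supplies the predicted block, which is the only aspect of the adaptive-reference modes the sketch tries to convey.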
24 Claims
1. A method for processing a plurality of multiview videos of a scene, in which each video is acquired by a corresponding camera arranged at a particular pose, and in which a view of each camera overlaps with the view of at least one other camera, comprising the steps of:
obtaining side information for synthesizing a particular view of multiview video;
synthesizing a synthesized multiview video from the plurality of multiview videos and the side information;
maintaining a reference picture list for each current frame of each of the plurality of multiview videos, the reference picture indexing temporal reference pictures and spatial reference pictures of the plurality of acquired multiview videos and the synthesized reference pictures of the synthesized multiview video; and
predicting each current frame of the plurality of multiview videos according to reference pictures indexed by the associated reference picture list with an adaptive-reference skip mode or an adaptive-reference direct mode, wherein the adaptive-reference skip mode and adaptive-reference direct mode use one of the plurality of reference pictures.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
24. A system for processing a plurality of multiview videos of a scene, comprising:
a plurality of cameras, each camera configured to acquire a multiview video of a scene, each camera arranged at a particular pose, and in which a view of each camera overlaps with the view of at least one other camera;
means for obtaining side information for synthesizing a particular view of the multiview video;
means for synthesizing a synthesized multiview video from the plurality of multiview videos and the side information;
a memory buffer configured to maintain a reference picture list for each current frame of each of the plurality of multiview videos, the reference picture indexing temporal reference pictures and spatial reference pictures of the plurality of acquired multiview videos and the synthesized reference pictures of the synthesized multiview video; and
means for predicting each current frame of the plurality of multiview videos according to reference pictures indexed by the associated reference picture list with an adaptive-reference skip mode or an adaptive-reference direct mode, wherein the adaptive-reference skip mode and adaptive-reference direct mode use one of the plurality of reference pictures.
Specification