Semantics-based motion estimation for multi-view video coding
Abstract
A motion estimation method and apparatus for video coding of a multi-view sequence is described. In one embodiment, a motion estimation method includes identifying one or more pixels in a first frame of a multi-view video sequence, and constraining a search range associated with a second frame of the multi-view video sequence based on an indication of a desired correlation between efficient coding and semantic accuracy. The semantic accuracy relies on use of geometric configurations of cameras capturing the multi-view video sequence. The method further includes searching the second frame within the constrained search range for a match of the pixels identified in the first frame.
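The search-range constraint summarized in the abstract can be illustrated with a minimal sketch. It assumes the camera geometry is available as a 3x3 fundamental matrix `F` that maps a pixel of the first frame to its epipolar line in the second frame. The function names, the weight convention (0 = semantic accuracy, 1 = efficient compression), and the linear mapping with its 1 and 33 pixel bounds are hypothetical choices made for illustration; the source does not specify them.

```python
import numpy as np


def epipolar_line(F, x, y):
    """Return (a, b, c) with a*u + b*v + c = 0 in the second frame.

    The line is scaled so that a**2 + b**2 == 1, which makes
    |a*u + b*v + c| the perpendicular distance of (u, v) from the line.
    """
    a, b, c = F @ np.array([x, y, 1.0])
    norm = np.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return a / norm, b / norm, c / norm


def band_height(weight, min_height=1.0, max_height=33.0):
    """Map a user trade-off weight to a band height in pixels.

    Assumed convention: 0.0 favors semantic accuracy (narrow band),
    1.0 favors efficient compression (tall band). The linear mapping
    and its bounds are illustrative only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return min_height + weight * (max_height - min_height)


def in_band(line, u, v, height):
    """True if (u, v) is within height/2 of the line, measured perpendicular to it."""
    a, b, c = line
    return abs(a * u + b * v + c) <= height / 2.0
```

For a rectified camera pair, `F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]` yields the horizontal line `v = y`, so the band reduces to a strip of rows around the row of the original pixel. Supplying a different weight for a later frame simply yields a different band height for that frame.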
20 Citations
27 Claims
1. A difference vector estimation method comprising:
identifying, by a computer, one or more pixels in a first frame of a multi-view video sequence;
constraining a search range associated with a second frame of said multi-view video sequence to a first area vertically centered on an epipolar line in the second frame, wherein said epipolar line corresponds to said one or more pixels in the first frame, the first area is defined by having a vertical height specified by a first correlation between efficient compression and semantic accuracy received by the computer from a user, wherein said vertical height increases if the first correlation is weighted toward efficient compression and said vertical height decreases if the first correlation is weighted toward semantic accuracy, wherein semantic accuracy relies on use of geometric configurations of cameras capturing the multi-view video sequence, wherein the vertical direction is defined as the direction perpendicular to said epipolar line, and wherein said search range is further constrained using a disparity vector computed for said one or more pixels in the first frame and wherein the constrained search range is repositioned relative to said epipolar line using said disparity vector in addition to constraining said vertical height using the first correlation;
searching the second frame within said constrained search range for a match of said one or more pixels identified in the first frame for subsequent use in computing a difference vector for said one or more pixels in the first frame, said difference vector to be transmitted as part of a compressed representation of the first frame;
receiving a second correlation between efficient compression and semantic accuracy from the user; and
searching a third frame within a search range constrained by a second correlation between efficient compression and semantic accuracy, the second correlation specified by the user and a value of the second correlation is different from a value of the first correlation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
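The searching step of claim 1 can be sketched as a block-matching loop restricted to the band around the epipolar line. This is a minimal illustration under stated assumptions, not the patented implementation: the matching cost (sum of absolute differences), the function name `constrained_search`, the `half_width` parameter, the choice to recenter the window by projecting the disparity-predicted position onto the line, and the definition of the difference vector as the offset of the best match from the block's position in the first frame are all hypothetical.

```python
import numpy as np


def constrained_search(ref, tgt, top, left, size, line, height,
                       disparity, half_width):
    """Search tgt for the size x size block of ref anchored at (top, left).

    line       : (a, b, c) with a*u + b*v + c = 0 and a**2 + b**2 == 1,
                 where u is the column and v is the row in tgt.
    height     : band height in pixels, measured perpendicular to the line.
    disparity  : (du, dv) used to reposition the window along the line.
    half_width : half extent of the window around its center (assumed).
    Returns None if no candidate lies inside the band and the image.
    """
    a, b, c = line
    block = ref[top:top + size, left:left + size].astype(np.int64)

    # Predicted position from the disparity vector, projected onto the line.
    pu, pv = left + disparity[0], top + disparity[1]
    d = a * pu + b * pv + c
    cu, cv = pu - a * d, pv - b * d

    rows, cols = tgt.shape
    best = None
    for v in range(int(np.floor(cv - half_width)),
                   int(np.ceil(cv + half_width)) + 1):
        for u in range(int(np.floor(cu - half_width)),
                       int(np.ceil(cu + half_width)) + 1):
            # Keep only candidates inside the band around the epipolar line.
            if abs(a * u + b * v + c) > height / 2.0:
                continue
            # Skip candidates whose block would leave the image.
            if u < 0 or v < 0 or u + size > cols or v + size > rows:
                continue
            cand = tgt[v:v + size, u:u + size].astype(np.int64)
            sad = int(np.abs(block - cand).sum())
            if best is None or sad < best[0]:
                best = (sad, u, v)

    if best is None:
        return None
    sad, u, v = best
    return {"match": (u, v),
            "difference_vector": (u - left, v - top),
            "sad": sad}
```

A taller band admits more candidates and can therefore find a lower-cost match, while a narrower band keeps the match close to the geometry-implied line; calling the function again for a third frame with a different `height` mirrors the second user-supplied correlation in the claim.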
10. A non-transitory computer readable memory medium that provides computer program instructions, which when executed on a computer processor cause the processor to perform operations comprising:
identifying one or more pixels in a first frame of a multi-view video sequence;
constraining a search range associated with a second frame of the multi-view video sequence to a first area vertically centered on an epipolar line in the second frame, wherein said epipolar line corresponds to the one or more pixels in the first frame, the first area is defined by having a vertical height specified by a first correlation between efficient compression and semantic accuracy received from a user, wherein said vertical height increases if the first correlation is weighted toward efficient compression and said vertical height decreases if the first correlation is weighted toward semantic accuracy, wherein semantic accuracy relies on use of geometric configurations of cameras capturing the multi-view video sequence, wherein the vertical dimension is defined as the direction perpendicular to said epipolar line, and wherein said search range is further constrained using a disparity vector computed for said one or more pixels of the first frame and wherein said constrained search range is repositioned relative to said epipolar line using said disparity vector in addition to constraining said vertical height using the first correlation;
searching the second frame within said constrained search range for a match of said one or more pixels identified in the first frame for subsequent use in computing a difference vector for the one or more pixels, said difference vector to be transmitted as part of a compressed representation of the first frame;
receiving a second correlation between efficient compression and semantic accuracy from the user; and
searching a third frame within a search range constrained by a second correlation between efficient compression and semantic accuracy, the second correlation specified by the user and a value of the second correlation is different from a value of the first correlation.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A computerized system comprising:
a memory; and
at least one processor coupled to the memory, the at least one processor executing a set of instructions which cause the at least one processor to
identify one or more pixels in a first frame of a multi-view video sequence,
constrain a search range associated with a second frame of the multi-view video sequence to a first area vertically centered on an epipolar line in the second frame, wherein said epipolar line corresponds to said one or more pixels in the first frame, the first area is defined by having a vertical height specified by a first correlation between efficient compression and semantic accuracy received from a user, wherein said vertical height increases if the first correlation is weighted toward efficient compression and said vertical height decreases if the first correlation is weighted toward semantic accuracy, wherein semantic accuracy relies on use of geometric configurations of cameras capturing the multi-view video sequence, wherein the vertical dimension is defined as the direction perpendicular to said epipolar line, and wherein said search range is further constrained using a disparity vector computed for said one or more pixels in the first frame and wherein said constrained search range is repositioned relative to said epipolar line using said disparity vector in addition to constraining said vertical height using the first correlation,
search the second frame within said constrained search range for a match of said one or more pixels identified in the first frame for subsequent use in computing a difference vector for said one or more pixels in the first frame, said difference vector to be transmitted as part of a compressed representation of the first frame,
receive a second correlation between efficient compression and semantic accuracy from the user, and
search a third frame within a search range constrained by a second correlation between efficient compression and semantic accuracy, the second correlation specified by the user and the second correlation different from the first correlation.
Dependent claims: 18, 19, 20, 21
22. A difference vector estimation apparatus comprising:
a block identifier to identify one or more pixels in a first frame of a multi-view video sequence;
a search range determinator to constrain a search range associated with a second frame of the multi-view video sequence to a first area vertically centered on an epipolar line in the second frame, wherein said epipolar line corresponds to said one or more pixels in the first frame, the first area is defined by having a vertical height specified by a first correlation between efficient compression and semantic accuracy received from a user, wherein said vertical height increases if the first correlation is weighted toward efficient compression and said vertical height decreases if the first correlation is weighted toward semantic accuracy, wherein semantic accuracy relies on use of geometric configurations of cameras capturing the multi-view video sequence, wherein the vertical direction is defined as the direction perpendicular to the epipolar line, wherein said search range determinator is configured to further constrain the search range using a disparity vector computed for said one or more pixels in the first frame and wherein said constrained search range is repositioned relative to said epipolar line using said disparity vector in addition to constraining said vertical height using the first correlation; and
a searcher to search the second image within said constrained search range for a match of said one or more pixels identified in the first frame for use by a difference vector calculator to compute a difference vector for the one or more pixels, said difference vector to be transmitted as part of a compressed representation of the first frame, and to search a third image within a search range constrained by a second correlation between efficient compression and semantic accuracy, the second correlation received from the user and different from the first correlation.
Dependent claims: 23, 24, 25, 26, 27
Specification