Video compression system using a dense motion vector field and a triangular patch mesh overlay model
Abstract
In a temporal sequence of digitized image frames of video signals, the spatial and temporal image gradients and the pixel-to-pixel motion vectors (dense motion vectors) are obtained between two consecutive image frames. A shape-adaptive triangular patch mesh model overlay is provided on the first image frame such that the location of node points of each patch is determined by the spatial image gradients of the first frame and the pixel-to-pixel motion vectors. A priority ranking of patches is established before determining the node point motion vectors. The node point motion vectors, representing the motion of each of the node points of the triangular mesh patches, are estimated by a linear least-squares solution to an affine transformation of the mesh overlay on the first frame into the second frame. Failure regions are identified and are revised in accordance with a data bit budget. All data are coded prior to transmission at a constant data bit rate by a sending unit to a receiving unit over a data link.
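The least-squares step in the abstract can be illustrated with a minimal sketch. It assumes the motion inside one patch follows u = a·x + b·y + c and v = d·x + e·y + f, fits the six parameters to the dense motion vectors through the normal equations, and then evaluates the fit at the node points. The function names, the sample data, and the use of a square pixel grid instead of a true triangular patch are illustrative assumptions, not the patent's procedure.

```python
def solve3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_affine_motion(pixels, vectors):
    """Least-squares fit of u = a*x + b*y + c and v = d*x + e*y + f.

    Assumes the pixels are not all collinear, so the normal matrix is invertible.
    """
    ptp = [[0.0] * 3 for _ in range(3)]  # P^T P, with rows P_i = [x_i, y_i, 1]
    ptu = [0.0] * 3                      # P^T u
    ptv = [0.0] * 3                      # P^T v
    for (x, y), (u, v) in zip(pixels, vectors):
        row = (x, y, 1.0)
        for i in range(3):
            ptu[i] += row[i] * u
            ptv[i] += row[i] * v
            for j in range(3):
                ptp[i][j] += row[i] * row[j]
    return solve3(ptp, ptu), solve3(ptp, ptv)


def node_motion_vectors(nodes, params_u, params_v):
    """Evaluate the fitted affine motion at each node point."""
    return [
        (params_u[0] * x + params_u[1] * y + params_u[2],
         params_v[0] * x + params_v[1] * y + params_v[2])
        for x, y in nodes
    ]


# Hypothetical dense motion field that is exactly affine over a 5x5 grid.
pixels = [(x, y) for y in range(5) for x in range(5)]
vectors = [(0.1 * x - 0.2 * y + 1.5, 0.05 * x + 0.3 * y - 0.5) for x, y in pixels]
params_u, params_v = fit_affine_motion(pixels, vectors)
node_vectors = node_motion_vectors([(0, 0), (4, 0), (0, 4)], params_u, params_v)
```

In this toy case 25 dense vectors collapse into three node vectors, which is the sense in which the dense field is data-compressed.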
169 Citations
90 Claims
1. A data compression system in a sending unit for transmission at a constant data bit rate over a data link in accordance with a communication standard between said sending unit and a receiving unit of data of a temporal sequence of image frames of video signals in which each image frame is represented prior to data compression by a two-dimensional coordinate array of a number of digitized picture elements (pixels) wherein said sequence includes a temporally first image frame and at least a temporally second image frame, and frames having scene contents containing spatial gradients and also having regions displaced corresponding to motion of the scene content, the data compression system comprising:
means for computing a dense motion field of dense motion vectors on a pixel-by-pixel basis representing the values and directions of motion of said scene content of said temporally second image frame with respect to said temporally first image frame;
means for generating a two-dimensional model of a mesh overlay on said first image frame, said mesh model comprising a plurality of interconnected shape-adaptive triangular patches with each patch having node points with respectively associated node point coordinates, each of said patches being defined on said first image frame in accordance with mesh-generating constraints and spatial gradients and displaced frame differences;
means for assigning a designation to each one of said triangular patches in accordance with a patch-assigning constraint and to each one of the node points associated therewith;
means for estimating and thereby parameterizing the motion of each one of said node points of said triangular patches overlaid on said first image frame in response to said dense motion vectors, said node point motion being estimated in accordance with motion-estimating constraints by an application of a statistical solution to an affine transformation of said first image frame, whereby said dense motion vectors are data-compressed into estimated node point motion vectors having second coordinates associated therewith and wherein all estimated node point motion vectors together comprise a representation of a data-compressed estimated second image frame;
means for identifying failure regions (MF and UB) in said data-compressed estimated second image frame wherein a difference is computed between the pixel signal levels of the uncompressed second image frame and the data-compressed estimated second image frame in accordance with certain ones of failure-identifying constraints, whereby each failure region is defined as such rectangular coordinate array of pixels of said uncompressed second image frame which encloses the pixels within a failure region identified in said temporally second image frame;
means for establishing a bit budget for the data to be transmitted at said constant data bit rate by said sending unit to said receiving unit;
means for revising the failure regions defined by said failure regions identifying means in accordance with a revised failure-identifying constraint;
means for assigning a designation to each one of said failure regions and alternatively to each one of said revised failure regions;
means for coding the node designations of each one of said triangular patches of said mesh overlay model;
means for coding the pixel signal levels of pixels within each one of said rectangular coordinate array failure regions;
means for coding the node point motion vectors of said data-compressed estimated second image frame;
means for coding the pixel signal levels of all pixels of said temporally first image frame when said first frame is an original or new first frame; and
means for transmitting at a constant data bit rate from said sending unit over said data link to said receiving unit said coded pixel signal levels of said first image frame, said coded node point motion vectors, said coded pixel signal levels corresponding to said failure regions, and said coded node designations in accordance with said communication standard.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 11, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 31, 33, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 61, 62, 63, 64, 66, 68, 69, 70, 71, 72, 73, 74, 75, 76, 78, 79)
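The failure-region element of claim 1 can be pictured with a small sketch: compare the uncompressed and estimated second frames pixel by pixel, flag pixels whose absolute difference exceeds a threshold, group flagged pixels into 4-connected regions, and report the enclosing rectangle of each region. The single fixed threshold and the 4-connectivity rule are assumptions made for illustration; the claim only refers to failure-identifying constraints.

```python
from collections import deque


def failure_rectangles(actual, estimated, threshold):
    """Return bounding rectangles (x0, y0, x1, y1) of connected failing pixels."""
    height, width = len(actual), len(actual[0])
    failed = [
        [abs(actual[y][x] - estimated[y][x]) > threshold for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    rectangles = []
    for y in range(height):
        for x in range(width):
            if not failed[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one 4-connected failure region.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and failed[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            rectangles.append((x0, y0, x1, y1))
    return rectangles


# Hypothetical 4x6 frames: the estimate is flat, the actual frame has two defects.
estimated = [[0] * 6 for _ in range(4)]
actual = [[0] * 6 for _ in range(4)]
actual[1][1] = actual[1][2] = actual[2][2] = 50
actual[3][5] = 40
rects = failure_rectangles(actual, estimated, threshold=10)
```

Only the pixels inside these rectangles would then be coded directly, alongside the node point motion vectors.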
10. ... $\alpha_1$ and $\alpha_2$ are positive scalars selected such that ##EQU10## are of the same order, $D_k$ is the displaced region difference (DRD) within a patch k, $N_k$ is the total number of pixels within a patch k, $\sigma_k^2$ is the variance among the pixel signal levels within a patch k, wherein patches having the highest variance are assigned the highest rank and correspondingly highest priority for processing by said motion estimating means in accordance with at least one of said patch-assigning constraints, and said means for rank ordering using a function of $D_k$, $N_k$, and $\sigma_k^2$.
Dependent Claims (12, 13)
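A rank ordering of patches by a function of the displaced region difference, the pixel count, and the variance might look like the sketch below. The ranking equation itself is not reproduced in this text, so the weighted sum `alpha1 * D_k / N_k + alpha2 * variance` is a hypothetical stand-in chosen only to show the mechanics. The dictionary layout and the default weights are likewise assumptions.

```python
from statistics import pvariance


def rank_patches(patches, alpha1=1.0, alpha2=1.0):
    """Return patch ids ordered from highest to lowest priority.

    patches maps a patch id to (pixel_levels, drd), where drd is the displaced
    region difference accumulated over the patch. The score is a hypothetical
    weighted sum, not the equation referenced in the claim.
    """
    def score(item):
        levels, drd = item[1]
        n_k = len(levels)
        return alpha1 * drd / n_k + alpha2 * pvariance(levels)

    ordered = sorted(patches.items(), key=score, reverse=True)
    return [patch_id for patch_id, _ in ordered]


# Three hypothetical patches with equal DRD but different pixel-level variance.
patches = {
    "a": ([10, 10, 10, 10], 4.0),
    "b": ([0, 20, 0, 20], 4.0),
    "c": ([9, 11, 9, 11], 4.0),
}
order = rank_patches(patches)
```

With equal difference terms, the patch with the largest variance comes first, matching the statement that the highest-variance patches receive the highest priority.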
30. A method of data compression in a sending unit for transmitting at a constant data bit rate over a data link in accordance with a communication standard between said sending unit and a receiving unit of data of a temporal sequence of image frames of video signals in which each image frame is represented prior to data compression by a two-dimensional coordinate array of a number of digitized picture elements (pixels) having associated pixel signal levels corresponding to a scene content containing spatial gradients and wherein said sequence includes a temporally first image frame and at least a temporally second image frame having at least one pixel region displaced in correspondence with motion of said scene content with respect to an identical pixel region of said first image frame, the data compression method comprising the steps of:
storing temporarily in a storing means the pixel signal levels of said first image frame;
determining the values and the coordinate locations of said spatial gradients on a pixel-by-pixel basis within said first and second image frames;
computing a dense motion field of dense motion vectors on a pixel-by-pixel basis representing the values and directions of motion of said scene content of said temporally second image frame with respect to said temporally first image frame;
generating a two-dimensional model of a mesh overlay on said first image frame, said mesh model comprising a plurality of interconnected shape-adaptive triangular patches with each patch having node points with respectively associated node point coordinates, each of said patches being defined on said first image frame in accordance with mesh-generating constraints in response to said spatial gradients and displaced region differences;
assigning a designation to each one of said shape-adapted triangular patches in accordance with the patch-assigning constraint and to each one of the node points associated therewith;
estimating and thereby parameterizing the motion of each one of said node points of said triangular patches overlaid on said first image frame in response to said dense motion vectors, said node point motion being estimated in accordance with motion-estimating constraints by an application of a statistical solution to an affine transformation of said first image frame, whereby said dense motion vectors are data-compressed into estimated node point motion vectors having second coordinates associated therewith and wherein all estimated node point motion vectors together comprise a representation of a data-compressed estimated second image frame;
identifying failure regions in said data-compressed estimated second image frame wherein a difference is computed between the pixel signal levels of a temporarily stored uncompressed second image frame and said data-compressed estimated second image frame in accordance with a failure-identifying constraint, whereby each failure region is defined as such rectangular coordinate array of pixels of the uncompressed second image frame which encloses the pixels within a failure region identified in said temporally second image frame;
establishing a bit budget for the data to be transmitted at said constant data bit rate by said sending unit to said receiving unit;
revising the failure regions defined in said failure regions identifying step in accordance with a revised failure-identifying constraint;
assigning a designation to each one of said failure regions and alternatively to each one of said revised failure regions;
coding the node designations of each one of said triangular patches of said mesh overlay model;
coding the pixel signal levels of pixels within each one of the rectangular coordinate array failure revised failure regions;
coding the node point motion vectors of said data-compressed estimated second image frame;
coding the pixel signal levels of all pixels of said temporally first image frame when said first frame is an original or a new first frame; and
transmitting at a constant data bit rate from said sending unit over said data link to said receiving unit said coded pixel signal levels of said first image frame, said coded node point motion vectors, the coded pixel signal levels corresponding to said failure regions and said coded node designations in accordance with said communication standard.
Dependent Claims (32, 35)
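The interplay between the bit budget and the revised failure regions can be sketched as follows. The sketch assumes a per-frame budget equal to the bit rate divided by the frame rate, a fixed cost in bits per failure-region pixel, and a revision rule that raises the error threshold in fixed steps until the surviving regions fit. None of these specifics come from the claim, which only states that the regions are revised under a revised failure-identifying constraint.

```python
def frame_bit_budget(bit_rate_bps, frames_per_second):
    """Bits available per frame on a constant-bit-rate link (assumed even split)."""
    return bit_rate_bps // frames_per_second


def revise_failure_regions(regions, budget_bits, bits_per_pixel, threshold, step):
    """Raise the error threshold until the kept regions fit the budget.

    regions is a list of (x0, y0, x1, y1, mean_abs_error) tuples.
    Returns the kept regions and the threshold that was finally applied.
    """
    if step <= 0:
        raise ValueError("step must be positive so the loop terminates")

    def cost(kept):
        pixels = sum((x1 - x0 + 1) * (y1 - y0 + 1) for x0, y0, x1, y1, _ in kept)
        return pixels * bits_per_pixel

    kept = [r for r in regions if r[4] > threshold]
    while kept and cost(kept) > budget_bits:
        threshold += step
        kept = [r for r in kept if r[4] > threshold]
    return kept, threshold


# Hypothetical regions: two 16-pixel rectangles and one 4-pixel rectangle.
regions = [(0, 0, 3, 3, 12.0), (4, 4, 5, 5, 30.0), (0, 6, 7, 7, 18.0)]
budget = frame_bit_budget(64000, 10)
kept, final_threshold = revise_failure_regions(
    regions, budget_bits=200, bits_per_pixel=8, threshold=10, step=5
)
```

In the example, all three regions would cost 288 bits, so the threshold is raised once and the weakest region is dropped, leaving 160 bits of failure-region data.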
34. ... $\alpha_1$ and $\alpha_2$ are positive scalars selected such that ##EQU16## are of the same order, $D_k$ is the displaced region difference (DRD) within a patch k, $N_k$ is the total number of pixels within a patch k, $\sigma_k^2$ is the variance among the pixel signal levels within a patch k, wherein patches having the highest variance are assigned the highest rank and correspondingly highest priority for processing by said motion estimating means in accordance with the patch-assigning constraint; and rank ordering by a function of $D_k$, $N_k$, and $\sigma_k^2$.
60. In a video data compression system in a sending unit for transmitting at a constant data bit rate over a data link in accordance with a communication standard between said sending unit and a receiving unit of data of a temporal sequence of image frames of video signals in which each image frame is represented prior to data compression by a two-dimensional coordinate array of a number of digitized picture elements (pixels) having associated pixel signal levels representative of spatial image gradients corresponding to a scene content and wherein said sequence includes a temporally first image frame and at least a temporally second image frame having at least one pixel region displaced in correspondence with motion of said scene content with respect to an identical pixel region of said first image frame, the data compression system comprising:
means for temporarily storing the pixel signal levels of said first image frame;
means for determining the values and the coordinate locations of said spatial image gradients on a pixel-by-pixel basis within said first and second image frames;
means for computing a dense motion field of dense motion vectors on a pixel-by-pixel basis representing the values and directions of motion of said scene content of said temporally second image frame with respect to said temporally first image frame; and
means for generating a two-dimensional model of a mesh overlay on said first image frame, said mesh model comprising a plurality of interconnected shape-adaptive triangular patches with each patch having node points with respectively associated node point coordinates, each of said patches being defined on said first image frame in accordance with mesh-generating constraints and spatial gradients and displaced frame differences.
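The two per-pixel quantities that drive mesh generation in claim 60, spatial gradients and displaced frame differences, can be computed as in this sketch. It uses central differences with one-sided differences at the borders, integer motion vectors, and clamping at the frame edge. These are simplifying assumptions; the claim does not specify a gradient operator or an interpolation scheme.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))


def spatial_gradients(frame):
    """Return (gx, gy): horizontal and vertical intensity gradients per pixel."""
    height, width = len(frame), len(frame[0])
    gx = [[0.0] * width for _ in range(height)]
    gy = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            xl, xr = clamp(x - 1, 0, width - 1), clamp(x + 1, 0, width - 1)
            yu, yd = clamp(y - 1, 0, height - 1), clamp(y + 1, 0, height - 1)
            gx[y][x] = (frame[y][xr] - frame[y][xl]) / max(xr - xl, 1)
            gy[y][x] = (frame[yd][x] - frame[yu][x]) / max(yd - yu, 1)
    return gx, gy


def displaced_frame_difference(first, second, motion):
    """Difference between the motion-displaced pixel of the second frame and the first.

    motion[y][x] is an integer (u, v) displacement from the first frame to the second.
    """
    height, width = len(first), len(first[0])
    dfd = []
    for y in range(height):
        row = []
        for x in range(width):
            u, v = motion[y][x]
            xs = clamp(x + u, 0, width - 1)
            ys = clamp(y + v, 0, height - 1)
            row.append(second[ys][xs] - first[y][x])
        dfd.append(row)
    return dfd


# Hypothetical horizontal ramp that shifts one pixel to the right between frames.
first = [[0, 1, 2, 3] for _ in range(3)]
second = [[0, 0, 1, 2] for _ in range(3)]
motion = [[(1, 0)] * 4 for _ in range(3)]
gx, gy = spatial_gradients(first)
dfd = displaced_frame_difference(first, second, motion)
```

Where the motion field is correct the difference is zero, and the only nonzero value appears at the right border, where the displaced pixel falls outside the frame and is clamped.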
65. In a video data compression system in a sending unit for transmission at a constant data bit rate over a data link in accordance with a communication standard between said sending unit and a receiving unit of data of a temporal sequence of image frames of video signals in which each image frame is represented prior to data compression by a two-dimensional coordinate array of a number of digitized picture elements (pixels) wherein said sequence includes a temporally first image frame and at least a temporally second image frame, the frames having scene contents containing spatial gradients and also having regions displaced corresponding to motion of the scene content, the data compression system comprising:
means for computing a dense motion field of dense motion vectors on a pixel-by-pixel basis representing the values and directions of motion of said scene content of said temporally second image frame with respect to said temporally first image frame;
means for generating a two-dimensional model of a mesh overlay on said first image frame, said mesh model comprising a plurality of interconnected shape-adaptive triangular patches with each patch having node points with respectively associated node point coordinates, each of said patches being defined on said first image frame in accordance with mesh-generating constraints and spatial gradients and displaced frame differences;
means for assigning a designation to each one of said shape-adaptive triangular patches in accordance with a patch-assigning constraint and to each one of the node points associated therewith;
means for estimating and thereby parameterizing the motion of each one of said node points of said triangular patches overlaid on said first image frame in response to said dense motion vectors, said node point motion being estimated in accordance with motion-estimating constraints by an application of a statistical solution to an affine transformation of said first image frame, whereby said dense motion vectors are data-compressed into estimated node point motion vectors having second coordinates associated therewith and wherein all estimated node point motion vectors together comprise a representation of a data-compressed estimated second image frame;
means for identifying failure regions (MF and UB) in said data-compressed estimated second image frame wherein a difference is computed between the pixel signal levels of the uncompressed second image frame and said data-compressed estimated second image frame in accordance with failure-identifying constraints, whereby each failure region is defined as such rectangular coordinate array of pixels of the uncompressed second image frame which encloses the pixels within a failure region identified in said temporally second image frame; and
means for establishing a bit budget for the data to be transmitted at said constant data bit rate by said sending unit to said receiving unit.
67. ... $\alpha_1$ and $\alpha_2$ are positive scalars selected such that ##EQU21## are of the same order, $D_k$ is the displaced region difference (DRD) within a patch k, $N_k$ is the total number of pixels within a patch k, $\sigma_k^2$ is the variance among the pixel signal levels within a patch k, wherein patches having the highest variance are assigned the highest rank and correspondingly highest priority for processing by said motion estimating means in accordance with at least one of said patch-assigning constraints, and said means for rank ordering using a function of $D_k$, $N_k$, and $\sigma_k^2$.
77. In a video data compression system in a sending unit for transmission at a constant data bit rate over a data link in accordance with a communication standard between said sending unit and a receiving unit of data of a temporal sequence of image frames of video signals in which each image frame is represented prior to data compression by a two-dimensional coordinate array of a number of digitized picture elements (pixels) wherein said sequence includes a temporally first image frame and at least a temporally second image frame, the frames having scene contents containing spatial gradients and also having regions displaced corresponding to motion of the scene content, the data compression system comprising:
means for computing a dense motion field of dense motion vectors on a pixel-by-pixel basis representing the values and directions of motion of said scene content of said temporally second image frame with respect to said temporally first image frame;
means for generating a two-dimensional model of a mesh overlay on said first image frame, said mesh model comprising a plurality of interconnected shape-adaptive triangular patches with each patch having node points with respectively associated node point coordinates, each of said patches being defined on said first image frame in accordance with mesh-generating constraints and spatial gradients and displaced frame differences and wherein said plurality of interconnected patches are arranged to form a common node point among one of the node points of each of the patches with remaining node points of each of the patches having node point coordinates defining a polygonal boundary;
means for assigning a designation to each one of said shape-adaptive triangular patches in accordance with a patch-assigning constraint and to each one of the node points associated therewith; and
means for estimating and thereby parameterizing the motion of each one of said polygonal boundary node points and of said common node point of said triangular patches overlaid on said first image frame in response to said dense motion vectors, said node point motion being estimated in accordance with a motion-estimating constraint by an application of statistical solution to an affine transformation of said first image frame, whereby said dense motion vectors are data-compressed into estimated node point motion vectors having second coordinates associated therewith and wherein all estimated node point motion vectors together comprise a representation of a data-compressed estimated second image frame.
Dependent Claims (81, 82, 83, 85, 87, 88, 89, 90)
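The patch arrangement recited in claim 77, several triangles sharing one common node point with their remaining node points forming a polygonal boundary, corresponds to a triangle fan. The sketch below builds such a fan and checks that the triangles tile the polygon. The coordinates and function names are illustrative assumptions, and the check holds for the convex boundary with an interior common node used here.

```python
def fan_patches(common, boundary):
    """Return triangles (common, b_i, b_next) around a closed polygonal boundary."""
    count = len(boundary)
    return [(common, boundary[i], boundary[(i + 1) % count]) for i in range(count)]


def triangle_area(a, b, c):
    """Area of a triangle from the cross product of two edge vectors."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return abs(cross) / 2.0


def polygon_area(points):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    count = len(points)
    for i in range(count):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % count]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Hypothetical hexagonal boundary with the common node point at its centre.
boundary = [(2, 0), (4, 1), (4, 3), (2, 4), (0, 3), (0, 1)]
common = (2, 2)
patches = fan_patches(common, boundary)
fan_area = sum(triangle_area(*patch) for patch in patches)
```

Each boundary node point belongs to two neighbouring patches, so moving one node deforms both patches while keeping the mesh connected.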
80. In a video data compression method in a sending unit for transmission at a constant data bit rate over a data link in accordance with a communication standard between said sending unit and a receiving unit of data of a temporal sequence of image frames of video signals in which each image frame is represented prior to data compression by a two-dimensional coordinate array of a number of digitized picture elements (pixels) wherein said sequence includes a temporally first image frame and at least a temporally second image frame, the frames having scene contents containing spatial gradients and also having regions displaced corresponding to motion of the scene content, the data compression method comprising the steps of:
computing a dense motion field of dense motion vectors on a pixel-by-pixel basis representing the values and directions of motion of said scene content of said temporally second image frame with respect to said temporally first image frame;
generating a two-dimensional model of a mesh overlay on said first image frame, said mesh model comprising a plurality of interconnected shape-adaptive triangular patches with each patch having node points with respectively associated node point coordinates, each of said patches being defined on said first image frame in accordance with mesh-generating constraints and spatial gradients and displaced frame differences, and wherein said plurality of interconnected patches are arranged to form a common node point among one of the node points of each of the patches with remaining node points of each of the patches having node point coordinates defining a polygonal boundary;
assigning a designation to each one of said shape-adaptive triangular patches in accordance with a patch-assigning constraint and to each one of the node points associated therewith;
estimating and thereby parameterizing the motion of each one of said polygonal boundary node points and of said common node points of said triangular patches overlaid on said first image frame in response to said dense motion vectors, said node point motion being estimated in accordance with motion-estimating constraints by an application of a statistical solution to an affine transformation of the first image frame, whereby said dense motion vectors are data-compressed into estimated node point motion vectors having second coordinates associated therewith and wherein all estimated node point motion vectors together comprise a representation of a data-compressed estimated second image frame;
identifying failure regions (MF and UB) in said data-compressed estimated second image frame wherein a difference is computed between the pixel signal levels of a temporarily stored uncompressed second image frame and said data-compressed estimated second image frame in accordance with a failure-identifying constraint, whereby each failure region is defined as such rectangular coordinate array of pixels of the uncompressed second image frame which encloses the pixels within a failure region identified in said temporally second image frame; and
establishing a bit budget for the data to be transmitted at said constant data bit rate by said sending unit to said receiving unit.
Dependent Claims (84)
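Because an affine map on a triangle is fixed by the motion of its three node points, the motion of any interior pixel can be recovered from the node point motion vectors by barycentric interpolation. The sketch below shows one way a receiver could do this. It is an illustration of that geometric fact, not a decoding procedure taken from the claims, and the function names and sample values are assumptions.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate_motion(p, nodes, node_vectors):
    """Motion vector at p as the weighted sum of the three node motion vectors."""
    weights = barycentric(p, *nodes)
    u = sum(w * mv[0] for w, mv in zip(weights, node_vectors))
    v = sum(w * mv[1] for w, mv in zip(weights, node_vectors))
    return u, v


# Hypothetical patch whose node vectors follow u = 1 + 0.5*x and v = 0.5*y.
nodes = [(0, 0), (4, 0), (0, 4)]
node_vectors = [(1.0, 0.0), (3.0, 0.0), (1.0, 2.0)]
interior = interpolate_motion((2, 1), nodes, node_vectors)
at_node = interpolate_motion((0, 0), nodes, node_vectors)
```

The interpolated vector at (2, 1) agrees with the underlying affine motion, which is why three node vectors per patch suffice to represent the estimated second frame inside that patch.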
86. ... $\alpha_1$ and $\alpha_2$ are positive scalars selected such that ##EQU28## are of the same order, $D_k$ is the displaced frame difference (DFD) within a patch k, $N_k$ is the total number of pixels within a patch k, $\sigma_k^2$ is the variance among the pixel signal levels within a patch k, wherein patches having the highest variance are assigned the highest rank and correspondingly highest priority for processing by said motion estimating step in accordance with one of said patch-assigning constraints; and rank ordering by a function of $D_k$, $N_k$, and $\sigma_k^2$.
Specification