Method and apparatus for a region-based approach to coding a sequence of video images
First Claim
1. An encoder for encoding a sequence of video frames comprising:
a segmentation unit which segments a current frame in said video sequence into a plurality of arbitrarily-shaped regions which may have different dimensions, each of said plurality of arbitrarily-shaped regions being assigned a motion vector;
a decoded frame memory for storing a previously decoded frame in said video sequence;
a prediction unit connected to said segmentation unit and said decoded frame memory for predicting image data of said current frame based upon a previously decoded frame and based upon the motion vector assigned to one of said plurality of arbitrarily-shaped regions;
a region shape coding unit for encoding the shape of each of said arbitrarily-shaped regions;
a mode decision unit which determines in which one of a plurality of modes image data from each of said plurality of arbitrarily-shaped regions is to be encoded, where said plurality of modes comprises an intra-frame mode in which the intensity of each pel in one of said plurality of arbitrarily-shaped regions is encoded;
a mode coding unit which encodes the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
a motion coding unit for encoding motion vectors associated with said plurality of arbitrarily-shaped regions;
a region interior coder which encodes the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said intra-frame mode;
a buffer which serves as an interface for transmitting encoded information from said encoder; and
a rate controller which receives signals from said buffer, where said rate controller sends control signals to said segmentation unit, said mode decision unit, and said region interior unit in response to the signals received from said buffer.
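The feedback loop in the last element of claim 1 (buffer to rate controller to segmentation unit, mode decision unit, and region interior coder) can be pictured with a minimal sketch. The claim does not say what the control signals contain, so the three fields below (merge_threshold, intra_bias, coefficients_kept), the linear mapping, and the constants are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ControlSignals:
    """Hypothetical control signals; the claim does not define their content."""
    merge_threshold: float   # to the segmentation unit: larger means fewer, bigger regions
    intra_bias: float        # to the mode decision unit: larger means intra mode is chosen less often
    coefficients_kept: int   # to the region interior coder: fewer coefficients means fewer bits


def rate_control(buffer_bits: int, buffer_capacity: int) -> ControlSignals:
    """Map buffer fullness to control signals (assumed linear mapping)."""
    if buffer_capacity <= 0:
        raise ValueError("buffer_capacity must be positive")
    # Clamp fullness to the range [0, 1].
    fullness = min(max(buffer_bits / buffer_capacity, 0.0), 1.0)
    return ControlSignals(
        merge_threshold=1.0 + 3.0 * fullness,
        intra_bias=fullness,
        coefficients_kept=max(1, round(16 * (1.0 - fullness))),
    )
```

In this sketch a fuller buffer pushes all three units toward spending fewer bits, which is one plausible reading of why the rate controller is connected to each of them.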
Abstract
An encoder segments frames in a sequence of digital images into multiple regions of arbitrary shape, each of which has a corresponding motion vector relative to a previous decoded frame. A hierarchical multi-resolution motion estimation and segmentation technique, which segments the frame into multiple blocks and assigns a best motion vector to each block, is used. Blocks having the same or similar motion vector are then merged to form the arbitrarily-shaped regions. The shape of each region is coded, and a decision is made to code additional image data of each region in one of three modes. In a first inter-frame mode, a motion vector associated with a region is encoded. In a second inter-frame mode, a prediction error for the region is also encoded. In an intra-frame mode, the intensity of each picture element in the region is encoded. A region interior coder with frequency domain region-zeroing and space domain region-enforcing operations is employed for effectively coding the interior image data of the arbitrarily-shaped regions. The region interior coder uses an iterative technique based on the theory of successive projection onto convex sets (POCS) to find the best values for a group of selected transform coefficients. The coded information, including the shape of the region, the choice of the mode, and the motion vector and/or the region's interior image data, may then be transmitted to a decoder where the image can be reconstructed.
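The POCS iteration mentioned in the abstract can be illustrated with a small sketch. It is an assumption-laden reading, not the patented procedure: it assumes a square block, an orthonormal 2-D DCT as the transform, "region-zeroing" meaning that every coefficient outside a selected set is set to zero, and "region-enforcing" meaning that pels inside the region are reset to their original values. The function names, the mean-fill starting point, and the fixed iteration count are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix C: coeffs = C X C^T and X = C^T coeffs C."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def pocs_region_code(block, mask, keep, iterations=20):
    """Alternate between the two projections for a fixed number of iterations.

    block: n x n pel intensities; mask: n x n truthy values marking the region;
    keep: set of (u, v) coefficient positions allowed to be non-zero.
    Returns the kept coefficients and the in-region squared error per iteration.
    """
    n = len(block)
    c, ct = dct_matrix(n), transpose(dct_matrix(n))
    inside = [block[i][j] for i in range(n) for j in range(n) if mask[i][j]]
    if not inside:
        raise ValueError("mask selects no pels")
    mean = sum(inside) / len(inside)
    # Assumed starting point: pels outside the region are filled with the region mean.
    current = [[block[i][j] if mask[i][j] else mean for j in range(n)]
               for i in range(n)]
    errors = []
    for _ in range(iterations):
        coeffs = matmul(matmul(c, current), ct)
        # Frequency-domain step: zero every coefficient that is not selected.
        coeffs = [[coeffs[u][v] if (u, v) in keep else 0.0 for v in range(n)]
                  for u in range(n)]
        approx = matmul(matmul(ct, coeffs), c)
        errors.append(sum((approx[i][j] - block[i][j]) ** 2
                          for i in range(n) for j in range(n) if mask[i][j]))
        # Space-domain step: restore the original pels inside the region.
        current = [[block[i][j] if mask[i][j] else approx[i][j] for j in range(n)]
                   for i in range(n)]
    return coeffs, errors
```

Because both steps are projections onto convex sets, the in-region error of the reconstruction from the kept coefficients cannot increase from one iteration to the next, which is what makes the iteration a reasonable way to look for good values of those coefficients.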
231 Citations
49 Claims
1. An encoder for encoding a sequence of video frames comprising:
a segmentation unit which segments a current frame in said video sequence into a plurality of arbitrarily-shaped regions which may have different dimensions, each of said plurality of arbitrarily-shaped regions being assigned a motion vector;
a decoded frame memory for storing a previously decoded frame in said video sequence;
a prediction unit connected to said segmentation unit and said decoded frame memory for predicting image data of said current frame based upon a previously decoded frame and based upon the motion vector assigned to one of said plurality of arbitrarily-shaped regions;
a region shape coding unit for encoding the shape of each of said arbitrarily-shaped regions;
a mode decision unit which determines in which one of a plurality of modes image data from each of said plurality of arbitrarily-shaped regions is to be encoded, where said plurality of modes comprises an intra-frame mode in which the intensity of each pel in one of said plurality of arbitrarily-shaped regions is encoded;
a mode coding unit which encodes the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
a motion coding unit for encoding motion vectors associated with said plurality of arbitrarily-shaped regions;
a region interior coder which encodes the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said intra-frame mode;
a buffer which serves as an interface for transmitting encoded information from said encoder; and
a rate controller which receives signals from said buffer, where said rate controller sends control signals to said segmentation unit, said mode decision unit, and said region interior unit in response to the signals received from said buffer.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
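The segmentation unit produces arbitrarily-shaped regions, each with one motion vector. A minimal sketch of one way to form such regions from a grid of per-block motion vectors follows, merging neighbouring blocks whose vectors are the same or similar. The flood-fill strategy, the 4-neighbour rule, the tol parameter, and the choice of the seed block's vector as the region vector are assumptions for illustration; the hierarchical multi-resolution estimation that would supply the per-block vectors is not shown.

```python
from collections import deque


def merge_blocks(vectors, tol=0):
    """Group 4-connected blocks whose motion vectors are within tol of a seed block.

    vectors: 2-D grid of (dy, dx) tuples, one per block.
    Returns (labels, region_vectors): a grid of region ids and one vector per region.
    """
    rows, cols = len(vectors), len(vectors[0])
    labels = [[-1] * cols for _ in range(rows)]
    region_vectors = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # block already belongs to a region
            seed = vectors[r][c]
            label = len(region_vectors)
            region_vectors.append(seed)  # assumed: region takes the seed's vector
            labels[r][c] = label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and labels[ny][nx] == -1:
                        v = vectors[ny][nx]
                        if abs(v[0] - seed[0]) <= tol and abs(v[1] - seed[1]) <= tol:
                            labels[ny][nx] = label
                            queue.append((ny, nx))
    return labels, region_vectors
```

The resulting label grid is what a region shape coding unit would then have to describe, since the merged regions are generally not rectangular.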
18. A method of encoding a frame in a video sequence comprising the steps of:
(a) segmenting the frame into a plurality of arbitrarily-shaped regions which may have different dimensions, each having a corresponding motion vector;
(b) encoding the shape of each arbitrarily-shaped region;
(c) determining in which of a plurality of modes image data of each arbitrarily-shaped region is to be encoded, where said plurality of modes includes a first mode in which the motion vector corresponding to an arbitrarily-shaped region is encoded, a second mode in which the motion vector and a motion compensated prediction error associated with an arbitrarily-shaped region are encoded, and a third intra-frame mode in which the intensity of each pel in an arbitrarily-shaped region is encoded;
(d) encoding the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
(e) encoding the motion vector corresponding to one of said plurality of arbitrarily-shaped regions if the region is to be encoded in either said first mode or said second mode;
(f) encoding a motion compensated prediction error associated with one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said second mode;
(g) encoding the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said third mode; and
(h) storing information encoded in steps (b), (d), (e), (f) and (g).
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
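Steps (c) through (g) amount to a per-region switch on the chosen mode. The sketch below shows which pieces of information are encoded for each of the three modes, following the claim wording. The selection rule in choose_mode (two thresholds on prediction-error energy) is purely hypothetical, since the claim does not state how the mode is decided; the field names are also illustrative.

```python
from enum import Enum


class Mode(Enum):
    MOTION_ONLY = 1        # first mode: motion vector only
    MOTION_PLUS_ERROR = 2  # second mode: motion vector and prediction error
    INTRA = 3              # third mode: intensity of each pel


def payload_fields(mode):
    """List what is encoded for one region, per steps (b), (d), (e), (f), (g)."""
    fields = ["shape", "mode"]                    # steps (b) and (d), always present
    if mode in (Mode.MOTION_ONLY, Mode.MOTION_PLUS_ERROR):
        fields.append("motion_vector")            # step (e)
    if mode is Mode.MOTION_PLUS_ERROR:
        fields.append("prediction_error")         # step (f)
    if mode is Mode.INTRA:
        fields.append("pel_intensities")          # step (g)
    return fields


def choose_mode(prediction_error_energy, low=10.0, high=1000.0):
    """Hypothetical decision rule: thresholds are illustrative, not from the claim."""
    if prediction_error_energy <= low:
        return Mode.MOTION_ONLY
    if prediction_error_energy <= high:
        return Mode.MOTION_PLUS_ERROR
    return Mode.INTRA
```

Note that in the intra-frame mode no motion vector is encoded for the region, even though step (a) assigns one to every region.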
34. A method of encoding a frame in a video sequence comprising the steps of:
(a) segmenting the frame into a plurality of arbitrarily-shaped regions which may have different dimensions, each having a corresponding motion vector;
(b) encoding the shape of each arbitrarily-shaped region;
(c) determining in which of a plurality of modes image data of each arbitrarily-shaped region is to be encoded, where said plurality of modes includes a first mode in which the motion vector corresponding to an arbitrarily-shaped region is encoded, a second mode in which the motion vector and a motion compensated prediction error associated with an arbitrarily-shaped region are encoded, and a third intra-frame mode in which the intensity of each pel in an arbitrarily-shaped region is encoded;
(d) encoding the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
(e) encoding the motion vector corresponding to one of said plurality of arbitrarily-shaped regions if the region is to be encoded in either said first mode or said second mode;
(f) encoding a motion compensated prediction error associated with one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said second mode;
(g) encoding the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said third mode; and
(h) transmitting information encoded in steps (b), (d), (e), (f) and (g) to a decoder.
Dependent claims: 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49
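The motion compensated prediction error of steps (c) and (f) presupposes a prediction of the region from a previously decoded frame. A minimal sketch of that prediction and of the resulting error follows. The sign convention of the motion vector (it points from a pel in the current frame to its source in the previous frame), the integer-pel accuracy, and the clamping at frame borders are assumptions, not details taken from the claim.

```python
def predict_region(previous, mask, motion_vector):
    """Predict the pels of one region from the previously decoded frame.

    previous: 2-D list of pel intensities; mask: same size, truthy inside the region;
    motion_vector: (dy, dx), assumed to point from the current pel to its source.
    Returns a dict mapping (y, x) to the predicted intensity.
    """
    dy, dx = motion_vector
    height, width = len(previous), len(previous[0])
    predicted = {}
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Clamp the source position so it stays inside the frame.
                sy = min(max(y + dy, 0), height - 1)
                sx = min(max(x + dx, 0), width - 1)
                predicted[(y, x)] = previous[sy][sx]
    return predicted


def prediction_error(current, predicted):
    """Per-pel difference between the current frame and the prediction."""
    return {(y, x): current[y][x] - value for (y, x), value in predicted.items()}
```

In the first mode only the motion vector would be sent and a decoder could rebuild the region from the prediction alone; in the second mode the error values would be encoded as well.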
Specification