Method and apparatus for a region-based approach to coding a sequence of video images

US 5,608,458 A
Filed: 10/13/1994
Issued: 03/04/1997
Est. Priority Date: 10/13/1994
Status: Expired due to Term

First Claim

Patent Images

1. An encoder for encoding a sequence of video frames comprising:

a segmentation unit which segments a current frame in said video sequence into a plurality of arbitrarily-shaped regions which may have different dimensions, each of said plurality of arbitrarily-shaped regions being assigned a motion vector;

a decoded frame memory for storing a previously decoded frame in said video sequence;

a prediction unit connected to said segmentation unit and said decoded frame memory for predicting image data of said current frame based upon a previously decoded frame and based upon the motion vector assigned to one of said plurality of arbitrarily-shaped regions;

a region shape coding unit for encoding the shape of each of said arbitrarily-shaped regions;

a mode decision unit which determines in which one of a plurality of modes image data from each of said plurality of arbitrarily-shaped regions is to be encoded, where said plurality of modes comprises an intra-frame mode in which the intensity of each pel in one of said plurality of arbitrarily-shaped regions is encoded;

a mode coding unit which encodes the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;

a motion coding unit for encoding motion vectors associated with said plurality of arbitrarily-shaped regions;

a region interior coder which encodes the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said intra-frame mode;

a buffer which serves as an interface for transmitting encoded information from said encoder; and

a rate controller which receives signals from said buffer, where said rate controller sends control signals to said segmentation unit, said mode decision unit, and said region interior unit in response to the signals received from said buffer.

View all claims

6 Assignments

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

An encoder segments frames in a sequence of digital images into multiple regions of arbitrary shape each of which has a corresponding motion vector relative to a previous decoded frame. A hierarchical multi-resolution motion estimation and segmentation technique, which segments the frame into multiple blocks and which assigns a best motion vector to each block is used. Blocks having the same or similar motion vector are then merged to form the arbitrarily-shaped regions. The shape of each region is coded, and a decision is made to code additional image data of each region in one of three modes. In a first inter-frame mode, a motion vector associated with a region is encoded. In a second inter-frame mode, a prediction error for the region is also encoded. In an intra-frame mode, the intensity of each picture element in the region is encoded. A region interior coder with frequency domain region-zeroing and space domain region-enforcing operations is employed for effectively coding the interior image data of the arbitrarily-shaped regions. The region interior coder uses an iterative technique based on the theory of successive projection onto convex sets (POCS) to find the best values for a group of selected transform coefficients. The coded information, including the shape of the region, the choice of the mode, and the motion vector and/or the region'"'"'s interior image data, may then be transmitted to a decoder where the image can be reconstructed.

231 Citations

49 Claims

1. An encoder for encoding a sequence of video frames comprising:
- a segmentation unit which segments a current frame in said video sequence into a plurality of arbitrarily-shaped regions which may have different dimensions, each of said plurality of arbitrarily-shaped regions being assigned a motion vector;
  
  a decoded frame memory for storing a previously decoded frame in said video sequence;
  
  a prediction unit connected to said segmentation unit and said decoded frame memory for predicting image data of said current frame based upon a previously decoded frame and based upon the motion vector assigned to one of said plurality of arbitrarily-shaped regions;
  
  a region shape coding unit for encoding the shape of each of said arbitrarily-shaped regions;
  
  a mode decision unit which determines in which one of a plurality of modes image data from each of said plurality of arbitrarily-shaped regions is to be encoded, where said plurality of modes comprises an intra-frame mode in which the intensity of each pel in one of said plurality of arbitrarily-shaped regions is encoded;
  
  a mode coding unit which encodes the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
  
  a motion coding unit for encoding motion vectors associated with said plurality of arbitrarily-shaped regions;
  
  a region interior coder which encodes the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said intra-frame mode;
  
  a buffer which serves as an interface for transmitting encoded information from said encoder; and
  
  a rate controller which receives signals from said buffer, where said rate controller sends control signals to said segmentation unit, said mode decision unit, and said region interior unit in response to the signals received from said buffer.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
- - 2. The encoder of claim 1 whereinsaid plurality of modes further comprises a first inter-frame mode in which a motion vector associated with one of said plurality of arbitrarily-shaped regions is encoded and a second inter-frame mode in which a motion vector and a motion compensated prediction error associated with one of said plurality of arbitrarily-shaped regions are encoded;
    - andsaid region interior coder encodes a motion compensated prediction error associated with one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said second inter-frame mode.
  - 3. The encoder of claim 1 wherein said segmentation unit comprises a joint motion estimation and segmentation unit for performing the following functions:
    - (a) dividing said current frame into a plurality of smaller regions of predetermined shape and size;
      
      (b) performing for each of said smaller regions a motion vector updating routine by assigning to the smaller region a best motion vector selected from among an initial motion vector assigned to the smaller region, motion vectors of neighboring regions, and an updated motion vector obtained by performing a block matching technique for the smaller region, wherein the best motion vector is selected according to a priority scheme and a predetermined threshold value;
      
      (c) dividing each of said smaller regions into a plurality of smaller regions of predetermined shape and size;
      
      (d) repeating step (b) for each of the smaller regions resulting from step (c);
      
      (e) iteratively repeating steps (c) and (d) for each smaller region resulting from step (d) until a stop condition is reached; and
      
      (f) merging adjacent regions having similar motion vectors to form said arbitrarily-shaped regions.
  - 4. The encoder of claim 3 wherein said joint motion estimation and segmentation unit comprises a best motion vector selection unit which performs the following functions:
    - determining the smallest matching error from among the matching errors obtained respectively by assigning to the smaller region the following motion vectors;
      
      (a) the initial motion vector assigned to the smaller region;
      
      (b) the updated motion vector obtained by performing a block matching technique for the smaller region; and
      
      (c) the motion vectors of the smaller region'"'"'s neighboring regions;
      
      selecting the initial motion vector as the best motion vector if the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is less than the predetermined threshold value;
      
      selecting the motion vector of one of the neighboring regions as the best motion vector if;
      
      (a) the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is not less than the predetermined threshold value; and
      
      (b) the absolute value of the difference between the smallest matching error and the matching error obtained by assigning to the smaller region the motion vector of the neighboring region is less than the predetermined threshold value; and
      
      selecting the matched motion vector as the best motion vector if;
      
      (a) the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is not less than the predetermined threshold value; and
      
      (b) the absolute value of the difference between the smallest matching error and each of the matching errors obtained by assigning to the smaller region the motion vector of one of the neighboring region is not less than the predetermined threshold value.
  - 5. The encoder of claim 3 wherein the step of dividing each of said smaller regions into a plurality of smaller regions comprises the step of dividing each of said smaller regions into four smaller square blocks of equal size.
  - 6. The encoder of claim 5 wherein said region interior coder uses an iterative technique with frequency domain region-zeroing and space domain region-enforcing operations to transform an arbitrarily-shaped image into optimal transform coefficients (OTC).
  - 7. The encoder of claim 6 wherein said region interior coder comprises:
    - an image circumscription unit for receiving image data of an arbitrarily-shaped region, and circumscribing a rectangular block around said arbitrarily-shaped region, thereby defining an original internal pel set and an original external pel set;
      
      an extrapolator which extrapolates pel values of said internal pel set to initialize pel values of said external pel set;
      
      a forward transform which transforms said image to transform coefficients;
      
      a TCS generator which generates a transform coefficient set (TCS) from said transform coefficients, said TCS generator outputs said TCS when said TCS represents said OTC, and sends said TCS to an inverse transform when said TCS does not represent said OTC;
      
      an inverse transform which transforms said TCS to a computed region block having computed pel values; and
      
      a replacer which replaces those computed pel values corresponding to said interior pel set with said original pel values to form a modified computed region block (MCRB), said replacer sends the modified computed region block to the forward transform for re-iteration.
  - 8. The encoder of claim 7 wherein said TCS generator generates said TCS by selecting an retaining those transform coefficients which have high energy according to the energy compaction property of transform coefficients, and by zeroing all the non-selected transform coefficients.
  - 9. The encoder of claim 8 further comprising a frame skip unit which receives said sequence of frames and which determines whether each frame in said sequence should be skipped.
  - 10. The encoder of claim 9 further comprising a multiplexer for passing encoded information from said region shape coding unit, said mode coding unit, said motion coding unit, and said region interior coder to said buffer in a predefined order.
  - 11. The encoder of claim 2 wherein said mode decision unit determines in which one of said plurality of modes to encode image data of a particular one of said arbitrarily-shaped regions based upon the values of the following normalized sums of absolute differences for the particular region:
    - ##EQU2## where N is the total number of pels in the particular region, i is a given pel in the region, R is the set of all pels in the particular region, I_i is the intensity of the given pel i, m is the mean value of the intensity of all the pels in the particular region, and e_i designates the motion compensated prediction error associated with the given pel i.
  - 12. The encoder of claim 3 wherein said segmentation unit further comprises:
    - an intensity segmentation unit which divides a frame into a plurality of arbitrarily-shaped intensity regions by grouping together pels that have a similar intensity features;
      
      a region based matching unit for determining a motion vector indicating the relative difference in position between one of said plurality of intensity regions and a matched region in a previously decoded frame;
      
      a switch for sending a received frame to said intensity segmentation unit in response to a first control signal and to said joint motion estimation and segmentation unit in response to a second control signal; and
      
      a plurality of switches that allow image data information to pass from said intensity segmentation unit and said joint motion estimation and segmentation unit to other of said units in said encoder in response to control signals synchronized with said first and second control signals, respectively.
  - 13. The encoder of claim 1 wherein said region interior coder uses an iterative technique with frequency domain region-zeroing and space domain region-enforcing operations to transform an arbitrarily-shaped image into optimal transform coefficients (OTC).
  - 14. The encoder of claim 13 wherein said region interior coder comprises:
    - an image circumscription unit for receiving image data of an arbitrarily-shaped region, and circumscribing a rectangular block around said arbitrarily-shaped region, thereby defining an original internal pel set and an original external pel set;
      
      an extrapolator which extrapolates pel values of said internal pel set to initialize pel values of said external pel set;
      
      a forward transform which transforms said image to transform coefficients;
      
      a TCS generator which generates a transform coefficient set (TCS) from said transform coefficients, said TCS generator outputs said TCS when said TCS represents said OTC, and sends said TCS to an inverse transform when said TCS does not represent said OTC;
      
      an inverse transform which transforms said TCS to a computed region block having computed pel values; and
      
      a replacer which replaces those computed pel values corresponding to said interior pel set with said original pel values to form a modified computed region block (MCRB), said replacer sends the modified computed region block to the forward transform for re-iteration.
  - 15. The encoder of claim 1 further comprising a frame skip unit which receives said sequence of frames and which determines whether each frame in said sequence should be skipped.
  - 16. The encoder of claim 15 further comprising a multiplexer for passing encoded information from said region shape coding unit, said mode coding unit, said motion coding unit, and said region interior coder to said buffer in a predefined order.
  - 17. The encoder of claim 15 wherein said rate controller sends control signals to said frame skip unit in response to the signals received from said buffer.

18. A method of encoding a frame in a video sequence comprising the steps of:
- (a) segmenting the frame into a plurality of arbitrarily-shaped regions which may have different dimensions, each having a corresponding motion vector;
  
  (b) encoding the shape of each arbitrarily-shaped region;
  
  (c) determining in which of a plurality of modes image data of each arbitrarily-shaped region is to be encoded, where said plurality of modes includes a first mode in which the motion vector corresponding to an arbitrarily-shaped region is encoded, a second mode in which the motion vector and a motion compensated prediction error associated with an arbitrarily-shaped region are encoded, and a third intra-frame mode in which the intensity of each pel in an arbitrarily-shaped region is encoded;
  
  (d) encoding the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
  
  (e) encoding the motion vector corresponding to one of said plurality of arbitrarily-shaped regions if the region is to be encoded in either said first mode or said second mode;
  
  (f) encoding a motion compensated prediction error associated with one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said second mode;
  
  (g) encoding the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said third mode; and
  
  (h) storing information encoded in steps (b), (d), (e), (f) and (g).
- View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33)
- - 19. The method of claim 18 wherein the step of segmenting said frame comprises the steps of:
    - (a) dividing the frame into a plurality of smaller regions of predetermined shape and size to form a first segmentation level;
      
      (b) assigning to each of said plurality of smaller regions an initial motion vector;
      
      (c) performing for each of said plurality of smaller regions a motion vector updating routine which updates the motion vector of a smaller region by assigning to it a best motion vector selected from among the initial motion vector assigned to the smaller region, an updated motion vector obtained by performing a block matching technique for the smaller region, and motion vectors of the smaller region'"'"'s neighboring regions, wherein the best motion vector is selected according to a priority scheme and a predetermined threshold value;
      
      (d) dividing each smaller region in the previous segmentation level into a plurality of smaller regions of predetermined shape and size to form a subsequent segmentation level;
      
      (e) assigning to each of the plurality of smaller regions in the subsequent segmentation level an initial motion vector equal to the motion vector of its parent region;
      
      (f) performing the motion vector updating routine for each of said plurality of smaller regions in the subsequent segmentation level;
      
      (g) iteratively performing the steps (d), (e) and (f) until a stop condition is reached;
      
      (h) merging adjacent smaller regions having similar motion vectors to form said plurality of arbitrarily-shaped regions.
  - 20. The method of claim 19 wherein the motion vector updating routine comprises the steps of:
    - determining the smallest matching error from among the matching errors obtained respectively by assigning to the smaller region the following motion vectors;
      
      (a) the initial motion vector assigned to the smaller region;
      
      (b) the updated motion vector obtained by performing a block matching technique for the smaller region; and
      
      (c) the motion vectors of the smaller region'"'"'s neighboring regions;
      
      selecting the initial motion vector as the best motion vector if the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is less than the predetermined threshold value;
      
      selecting the motion vector of one of the neighboring regions as the best motion vector if;
      
      (a) the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is not less than the predetermined threshold value; and
      
      (b) the absolute value of the difference between the smallest matching error and the matching error obtained by assigning to the smaller region the motion vector of the neighboring region is less than the predetermined threshold value; and
      
      selecting the matched motion vector as the best motion vector if;
      
      (a) the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is not less than the predetermined threshold value; and
      
      (b) the absolute value of the difference between the smallest matching error and each of the matching errors obtained by assigning to the smaller region the motion vector of one of the neighboring region is not less than the predetermined threshold value.
  - 21. The method of claim 20 wherein the steps of encoding the prediction error and encoding the intensity of each pel comprise the steps of:
    - generating original pel values by;
      
      (a) circumscribing said arbitrarily-shaped region with a rectangular region block, thereby creating an internal pel set which lies within said arbitrarily-shaped image and within said region block, and an external pel set which lies outside said arbitrarily-shaped region and within said region block; and
      
      ,(b) initializing pel values of said external pel set by extrapolating the pel values of said internal pel set; and
      
      calculating optimal transform coefficients (OTC) by;
      
      (a) performing a forward transform on said region block to generate transform coefficients;
      
      (b) generating a transform coefficient set (TCS) from said transform coefficients;
      
      (c) performing an inverse transform on said TCS thereby generating a computed region block having computed pel values;
      
      (d) replacing those computed pel values corresponding to said internal pel set with original pel values to form a modified computed region block (MCRB);
      
      (e) determining whether said TCS represents said OTC;
      
      (f) reiterating steps (a) and (b) on said modified computed region block and outputting said TCS when said TCS represents said OTC; and
      
      ,(g) reiterating steps (a) through (g) on said modified computed region block when said TCS values do not represent said OTC.
  - 22. The method of claim 21 wherein said step of performing a forward transform comprises the step of performing a discrete cosine transform (DCT).
  - 23. The method of claim 22 wherein the step of generating a TCS comprises the step of quantizing said transform coefficients.
  - 24. The method of claim 23 wherein the step of generating said TCS further comprises the steps of selecting and retaining those transform coefficients which have high energy according to the energy compaction property of transform coefficients, and zeroing the non-selected transform coefficients.
  - 25. The method of claim 18 wherein the steps of encoding the prediction error and encoding the intensity of each pel comprise the steps of:
    - generating original pel values by;
      
      (a) circumscribing said arbitrarily-shaped region with a rectangular region block, thereby creating an internal pel set which lies within said arbitrarily-shaped image and within said region block, and an external pel set which lies outside said arbitrarily-shaped region and within said region block; and
      
      ,(b) initializing pel values of said external pel set by extrapolating the pel values of said internal pel set; and
      
      calculating optimal transform coefficients (OTC) by;
      
      (a) performing a forward transform on said region block to generate transform coefficients;
      
      (b) generating a transform coefficient set (TCS) from said transform coefficients;
      
      (c) performing an inverse transform on said TCS thereby generating a computed region block having computed pel values;
      
      (d) replacing those computed pel values corresponding to said internal pel set with original pel values to form a modified computed region block (MCRB);
      
      (e) determining whether said TCS represents said OTC;
      
      (f) reiterating steps (a) and (b) on said modified computed region block and outputting said TCS when said TCS represents said OTC; and
      
      ,(g) reiterating steps (a) through (g) on said modified computed region block when said TCS values do not represent said OTC.
  - 26. The method of claim 25 wherein said step of performing a forward transform comprises the step of performing a discrete cosine transform (DCT).
  - 27. The method of claim 26 wherein the step of generating a TCS comprises the step of quantizing said transform coefficients.
  - 28. The method of claim 27 further comprising the step of deciding whether to skip said frame where said step of deciding depends upon the fullness of a buffer which serves as an interface to said decoder.
  - 29. The method of claim 28 wherein the step of determining in which of a plurality of modes image data of each arbitrarily-shaped region is to be encoded is based upon the values of the following normalized sums of absolute differences for the particular region:
    - ##EQU3## where N is the total number of pels in the particular region, i is a given pel in the region, R is the set of all pels in the particular region, I_i is the intensity of the given pel i, m is the mean value of the intensity of all the pels in the particular region, and e_i designates the motion compensated prediction error associated with the given pel i.
  - 30. The method of claim 29 wherein image data of the particular region is encoded in said first mode if the value of NSAD_P is less than a threshold value c and in said second mode when the value of NSAD_P exceeds the threshold value c and is less than the value of NSAD_I by at least a threshold value b.
  - 31. The method of claim 30 wherein said threshold values b and c depend upon the fullness of a buffer which serves as an interface to said decoder.
  - 32. The method of claim 24 wherein the number of selected transform coefficients is based upon the size of the particular arbitrarily-shaped region being encoded.
  - 33. The method of claim 23 wherein the step of quantizing said transform coefficients uses a quantization step size which depends upon the fullness of a buffer which serves as an interface to a decoder.

34. A method of encoding a frame in a video sequence comprising the steps of:
- (a) segmenting the frame into a plurality of arbitrarily-shaped regions which may have different dimensions, each having a corresponding motion vector;
  
  (b) encoding the shape of each arbitrarily-shaped region;
  
  (c) determining in which of a plurality of modes image data of each arbitrarily-shaped region is to be encoded, where said plurality of modes includes a first mode in which the motion vector corresponding to an arbitrarily-shaped region is encoded, a second mode in which the motion vector and a motion compensated prediction error associated with an arbitrarily-shaped region are encoded, and a third intra-frame mode in which the intensity of each pel in an arbitrarily-shaped region is encoded;
  
  (d) encoding the mode in which each of said plurality of arbitrarily-shaped regions is to be encoded;
  
  (e) encoding the motion vector corresponding to one of said plurality of arbitrarily-shaped regions if the region is to be encoded in either said first mode or said second mode;
  
  (f) encoding a motion compensated prediction error associated with one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said second mode;
  
  (g) encoding the intensity of each pel in one of said plurality of arbitrarily-shaped regions if the region is to be encoded in said third mode; and
  
  (h) transmitting information encoded in steps (b), (d), (e), (f) and (g) to a decoder.
- View Dependent Claims (35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49)
- - 35. The method of claim 34 wherein the step of segmenting said frame comprises the steps of:
    - (a) dividing the frame into a plurality of smaller regions of predetermined shape and size to form a first segmentation level;
      
      (b) assigning to each of said plurality of smaller regions an initial motion vector;
      
      (c) performing for each of said plurality of smaller regions a motion vector updating routine which updates the motion vector of a smaller region by assigning to it a best motion vector selected from among the initial motion vector assigned to the smaller region, an updated motion vector obtained by performing a block matching technique for the smaller region, and motion vectors of the smaller region'"'"'s neighboring regions, wherein the best motion vector is selected according to a priority scheme and a predetermined threshold value;
      
      (d) dividing each smaller region in the previous segmentation level into a plurality of smaller regions of predetermined shape and size to form a subsequent segmentation level;
      
      (e) assigning to each of the plurality of smaller regions in the subsequent segmentation level an initial motion vector equal to the motion vector of its parent region;
      
      (f) performing the motion vector updating routine for each of said plurality of smaller regions in the subsequent segmentation level;
      
      (g) iteratively performing the steps (d), (e) and (f) until a stop condition is reached;
      
      (h) merging adjacent smaller regions having similar motion vectors to form said plurality of arbitrarily-shaped regions.
  - 36. The method of claim 35 wherein the motion vector updating routine comprises the steps of:
    - determining the smallest matching error from among the matching errors obtained respectively by assigning to the smaller region the following motion vectors;
      
      (a) the initial motion vector assigned to the smaller region;
      
      (b) the updated motion vector obtained by performing a block matching technique for the smaller region; and
      
      (c) the motion vectors of the smaller region'"'"'s neighboring regions;
      
      selecting the initial motion vector as the best motion vector if the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is less than the predetermined threshold value;
      
      selecting the motion vector of one of the neighboring regions as the best motion vector if;
      
      (a) the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is not less than the predetermined threshold value; and
      
      (b) the absolute value of the difference between the smallest matching error and the matching error obtained by assigning to the smaller region the motion vector of the neighboring region is less than the predetermined threshold value; and
      
      selecting the matched motion vector as the best motion vector if;
      
      (a) the absolute value of the difference between the smallest matching error and the matching error obtained by using the initial motion vector is not less than the predetermined threshold value; and
      
      (b) the absolute value of the difference between the smallest matching error and each of the matching errors obtained by assigning to the smaller region the motion vector of one of the neighboring region is not less than the predetermined threshold value.
  - 37. The method of claim 36 wherein the steps of encoding the prediction error and encoding the intensity of each pel comprise the steps of:
    - generating original pel values by;
      
      (a) circumscribing said arbitrarily-shaped region with a rectangular region block, thereby creating an internal pel set which lies within said arbitrarily-shaped image and within said region block, and an external pel set which lies outside said arbitrarily-shaped region and within said region block; and
      
      ,(b) initializing pel values of said external pel set by extrapolating the pel values of said internal pel set; and
      
      calculating optimal transform coefficients (OTC) by;
      
      (a) performing a forward transform on said region block to generate transform coefficients;
      
      (b) generating a transform coefficient set (TCS) from said transform coefficients;
      
      (c) performing an inverse transform on said TCS thereby generating a computed region block having computed pel values;
      
      (d) replacing those computed pel values corresponding to said internal pel set with original pel values to form a modified computed region block (MCRB);
      
      (e) determining whether said TCS represents said OTC;
      
      (f) reiterating steps (a) and (b) on said modified computed region block and outputting said TCS when said TCS represents said OTC; and
      
      ,(g) reiterating steps (a) through (g) on said modified computed region block when said TCS values do not represent said OTC.
  - 38. The method of claim 37 wherein said step of performing a forward transform comprises the step of performing a discrete cosine transform (DCT).
  - 39. The method of claim 38 wherein the step of generating a TCS comprises the step of quantizing said transform coefficients.
  - 40. The method of claim 39 wherein the step of generating said TCS further comprises the steps of selecting and retaining those transform coefficients which have high energy according to the energy compaction property of transform coefficients, and zeroing the non-selected transform coefficients.
  - 41. The method of claim 34 wherein the steps of encoding the prediction error and encoding the intensity of each pel comprise the steps of:
    - generating original pel values by;
      
      (a) circumscribing said arbitrarily-shaped region with a rectangular region block, thereby creating an internal pel set which lies within said arbitrarily-shaped image and within said region block, and an external pel set which lies outside said arbitrarily-shaped region and within said region block; and
      
      ,(b) initializing pel values of said external pel set by extrapolating the pel values of said internal pel set; and
      
      calculating optimal transform coefficients (OTC) by;
      
      (a) performing a forward transform on said region block to generate transform coefficients;
      
      (b) generating a transform coefficient set (TCS) from said transform coefficients;
      
      (c) performing an inverse transform on said TCS thereby generating a computed region block having computed pel values;
      
      (d) replacing those computed pel values corresponding to said internal pel set with original pel values to form a modified computed region block (MCRB);
      
      (e) determining whether said TCS represents said OTC;
      
      (f) reiterating steps (a) and (b) on said modified computed region block and outputting said TCS when said TCS represents said OTC; and
      
      ,(g) reiterating steps (a) through (g) on said modified computed region block when said TCS values do not represent said OTC.
  - 42. The method of claim 41 wherein said step of performing a forward transform comprises the step of performing a discrete cosine transform (DCT).
  - 43. The method of claim 42 wherein the step of generating a TCS comprises the step of quantizing said transform coefficients.
  - 44. The method of claim 43 further comprising the step of deciding whether to skip said frame where said step of deciding depends upon the fullness of a buffer which serves as an interface to said decoder.
  - 45. The method of claim 44 wherein the step of determining in which of a plurality of modes image data of each arbitrarily-shaped region is to be encoded is based upon the values of the following normalized sums of absolute differences for the particular region:
    - ##EQU4## where N is the total number of pels in the particular region, i is a given pel in the region, R is the set of all pels in the particular region, I_i is the intensity of the given pel i, m is the mean value of the intensity of all the pels in the particular region, and e_i designates the motion compensated prediction error associated with the given pel i.
  - 46. The method of claim 45 wherein image data of the particular region is encoded in said first mode if the value of NSAD_P is less than a threshold value c and in said second mode when the value of NSAD_P exceeds the threshold value c and is less than the value of NSAD_I by at least a threshold value b.
  - 47. The method of claim 46 wherein said threshold values b and c depend upon the fullness of a buffer which serves as an interface to said decoder.
  - 48. The method of claim 40 wherein the number of selected transform coefficients is based upon the size of the particular arbitrarily-shaped region being encoded.
  - 49. The method of claim 39 wherein the step of quantizing said transform coefficients uses a quantization step size which depends upon the fullness of a buffer which serves as an interface to a decoder.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee
Lucent Technologies, Inc. (Nokia Corporation)
Inventors
Ebrahimi, Touradj, Horne, Caspar, Chen, Homer H., Haskell, Barin G.
Primary Examiner(s)
AU, AMELIA MEGAN

Application Number

US08/322,893
Time in Patent Office

873 Days
Field of Search

348/394, 348/395, 348/397, 348/400, 348/401, 348/402, 348/405, 348/409, 348/411, 348/412, 348/413, 348/415, 348/416, 348/420, 348/699, 382/236, 382/238, 382/173, 382/240
US Class Current

375/240.14
CPC Class Codes

H04N 19/107   between spatial and tempora...

H04N 19/20   using video object coding

H04N 19/51   Motion estimation or motion...

H04N 19/537   Motion estimation other tha...

H04N 19/543   using regions

H04N 19/61   in combination with predict...

H04N 19/649   the transform being applied...

Method and apparatus for a region-based approach to coding a sequence of video images

First Claim

6 Assignments

0 Petitions

Accused Products

Abstract

231 Citations

49 Claims

Specification

Solutions

Use Cases

Quick Links

Method and apparatus for a region-based approach to coding a sequence of video images

First Claim

6 Assignments

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

231 Citations

49 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links