Prediction apparatus and method for improving coding efficiency in scalable video coding

US 6,043,846 A
Filed: 11/15/1996
Issued: 03/28/2000
Est. Priority Date: 11/15/1996
Status: Expired due to Fees

Abstract

A prediction method, referred to as the merged method, is introduced for the enhancement layer of a multiple-layer video coding system. The merged method is designed to handle prediction of the non-moving parts efficiently when coding an enhancement layer VOP or frame. All the information needed for merged-mode prediction is obtained from the base layer, so no additional side information is transmitted. Used in combination with the existing forward, backward, and interpolated modes, this prediction mode can improve the coding efficiency of enhancement layer video coding, especially at low bit rates. The method can be used in most multiple-layer video coding schemes, especially in spatially scalable video coding.
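
The merged prediction can be illustrated with a minimal sketch that follows the threshold rule of claims 2, 3 and 7. It assumes the base layer frames have already been up-sampled to the enhancement layer size and that frames are plain nested lists of integer pixel values; the names `classify_changed`, `build_merged_vop` and the threshold value are hypothetical and do not come from the patent.

```python
from typing import List

Frame = List[List[int]]  # 2-D array of pixel values, stored as rows of ints


def classify_changed(curr_base_up: Frame, prev_base_up: Frame,
                     threshold: int) -> List[List[bool]]:
    """Mark a pixel as changed when |current - previous| exceeds the threshold.

    Both inputs are assumed to be base layer VOPs already up-sampled to the
    enhancement layer dimensions.
    """
    return [
        [abs(c - p) > threshold for c, p in zip(crow, prow)]
        for crow, prow in zip(curr_base_up, prev_base_up)
    ]


def build_merged_vop(prev_enh: Frame, curr_base_up: Frame,
                     changed: List[List[bool]]) -> Frame:
    """Take unchanged pixels from the previous enhancement layer VOP and
    changed pixels from the current up-sampled base layer VOP."""
    return [
        [b if ch else e for e, b, ch in zip(erow, brow, chrow)]
        for erow, brow, chrow in zip(prev_enh, curr_base_up, changed)
    ]


# Tiny 2x2 example with made-up pixel values and a made-up threshold of 4.
prev_base_up = [[10, 10], [10, 10]]
curr_base_up = [[10, 50], [12, 10]]
prev_enh = [[11, 9], [10, 12]]

changed = classify_changed(curr_base_up, prev_base_up, threshold=4)
merged = build_merged_vop(prev_enh, curr_base_up, changed)
```

Because both functions read only decoded base layer data and the previous enhancement layer VOP, a decoder can repeat the same steps without receiving extra side information, which is the property the abstract emphasizes.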

36 Claims

1. A prediction method for improving coding efficiency by reducing temporal redundancy in an enhancement error signal of a multiple layer video encoding system, the prediction method comprising:
- dividing an input video sequence into a plurality of video object layers each consisting of a base layer and several enhancement layers, and each of the video object layers are further made up of a series of video object planes (VOP) comprising a two dimensional array of sampled pixel data at a particular time reference;
  
  encoding a VOP of the base layer to obtain a compressed bitstream and a corresponding locally decoded VOP of the base layer;
  
  obtaining a pixel classification criteria from information in current and previous locally decoded VOP of the base layer;
  
  constructing a merged VOP, from a previous locally decoded VOP of an enhancement layer, the previous locally decoded VOP of the base layer, the current locally decoded VOP of the base layer and the pixel classification criteria;
  
  predicting an enhancement layer VOP from the merged VOP, the previous VOP in the same layer, and the VOP from the base layer with the same time reference;
  
  entropy coding information for prediction modes as header information for the decoder;
  
  coding prediction errors of the enhancement layer VOP which was predicted according to one of several prediction modes based on a mode decision, together with motion vector information, and transmitting them in a compressed bitstream to a decoder; and
  
  repeating the above prediction steps on the enhancement layers treating a lower of two enhancement layers as the base layer.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 26, 31, 32, 33, 34, 35)
- - 2. A prediction method for improving coding efficiency in multiple layer video coding according to claim 1 where obtaining the pixel classification criteria from the information in the current and previous locally decoded VOP of the base layer, further comprises:
    - up-sampling the base layer VOP by a filtering technique, if necessary, to form a scaled VOP of the same dimension as the enhancement layer; and
      
      classifying each pixel in the current VOP as changed or unchanged based on the magnitude of the difference between the current pixel and the pixel at a corresponding location in the previous VOP.
  - 3. A prediction method for improving coding efficiency in multiple layer video coding according to claim 2 where constructing a merged VOP, from the previous locally decoded VOP of the enhancement layer, the previous locally decoded VOP of the base layer, the current locally decoded VOP of the base layer and the pixel classification criteria, further comprises:
    - selecting a first group of pixels for the merged VOP from the previous locally decoded VOP of the enhancement layer VOP at pixel locations classified as unchanged;
      
      selecting a second group of pixels for the merged VOP from the up-sampled VOP of the same time reference, at pixel locations classified as changed; and
      
      constructing the merged VOP by merging the groups of first and second selected pixels together.
  - 4. A prediction method for improving coding efficiency in multiple layer video coding according to claim 3 where the prediction of the enhancement layer VOP further comprises:
    - generating motion vectors from the previous locally decoded VOP of the enhancement layer, to be used for a forward motion compensation prediction mode;
      
      generating the motion vectors from the up-sampled VOP with the same time reference as the current predicted VOP, to be used for a backward motion compensation prediction mode;
      
      searching for and obtaining a forward motion compensated macro block from previous locally decoded VOP of the enhancement layer, by using the forward motion vectors generated in the forward prediction mode for a current macro block;
      
      searching for and obtaining a backward motion compensated macro block from the up-sampled VOP, of the same time reference as the current VOP, by using the backward motion vectors generated in the backward prediction mode for the current macro block;
      
      averaging the forward motion compensated macro block and backward motion compensated macro block to obtain an interpolated macro block, for an interpolated prediction mode; and
      
      obtaining a macro block from the merged VOP for the merge prediction mode.
  - 5. A prediction method according to claim 4 where the prediction of the enhancement layer further comprises:
    - calculating the absolute difference between the macro block from the current VOP and the corresponding forward motion compensated macro block of the forward prediction mode;
      
      calculating the absolute difference between the macro block from the current VOP and the corresponding backward motion compensated macro block of the backward prediction mode;
      
      calculating the absolute difference between the macro block from the current VOP and the corresponding interpolated macro block of the interpolated prediction mode;
      
      calculating the absolute difference between the macro block from the current VOP and the corresponding merged macro block in the merged prediction mode;
      
      selecting the prediction mode which results in the minimum absolute difference; and
      
      predicting each of the macro blocks by using the selected prediction mode.
  - 6. A prediction method according to claim 5 where the prediction of the enhancement layer is biased towards selecting the merged mode.
  - 7. A prediction method for improving coding efficiency in multiple layer video coding according to claim 2 where classifying each pixel in the current VOP into changed and unchanged, further comprises:
    - comparing the magnitude of the difference to a predefined threshold;
      
      classifying the pixel as unchanged when the magnitude of the difference is less than or equal to the threshold; and
      
      classifying the pixel as changed when the magnitude of the difference is greater than the threshold.
  - 8. A prediction method according to claim 4 wherein merging of the groups of selected pixels employs a smoothing filter or weighting function for pixels at a boundary between the groups of selected pixels.
  - 9. A prediction method for improving coding efficiency in multiple layer video coding according to claim 5 where the prediction mode information is entropy coded, comprising:
    - obtaining statistics of the prediction mode selection;
      
      assigning less bits for the prediction mode with the higher possibility of being selected, and more bits for the prediction mode with the lower possibility of being selected; and
      
      assigning a default mode for macro-blocks that are not transmitted as the merged prediction mode with no prediction error coded.
  - 10. A prediction method for improving coding efficiency in multiple layer video coding according to claim 1 where obtaining the pixel classification criteria from the information in the current and previous locally decoded VOP of the base layer, further comprises:
    - up-sampling the base layer VOP by a filtering technique, if necessary, to form a scaled VOP of the same dimension as the enhancement layer; and
      
      classifying each pixel in the current VOP as changed or unchanged based on the magnitude of the difference between the weighted sum of the current and surrounding pixels and the weighted sum of the pixels at corresponding locations in the previous VOP.
  - 11. A prediction method for improving coding efficiency in multiple layer video coding according to claim 10 where constructing a merged VOP, from the previous locally decoded VOP of the enhancement layer, the previous locally decoded VOP of the base layer, the current locally decoded VOP of the base layer and the pixel classification criteria, further comprises:
    - selecting a first group of pixels for the merged VOP from the previous locally decoded VOP of the enhancement layer VOP at pixel locations classified as unchanged;
      
      selecting a second group of pixels for the merged VOP from the up-sampled VOP of the same time reference at pixel locations classified as changed; and
      
      constructing the merged VOP by merging the groups of first and second selected pixels together.
  - 12. A prediction method for improving coding efficiency in multiple layer video coding according to claim 11 where the prediction of the enhancement layer VOP further comprises:
    - generating motion vectors from the previous locally decoded VOP of the enhancement layer, to be used for a forward motion compensation prediction mode;
      
      generating the motion vectors from the up-sampled VOP with the same time reference as the current predicted VOP, to be used for a backward motion compensation prediction mode;
      
      searching for and obtaining a forward motion compensated macro block from the previous locally decoded VOP of the enhancement layer, by using the forward motion vectors generated in the forward prediction mode for a current macro block;
      
      searching for and obtaining a backward motion compensated macro block from the up-sampled VOP, of the same time reference as the current VOP, by using the backward motion vectors generated in the backward prediction mode for the current macro block;
      
      averaging the forward motion compensated macro block and backward motion compensated macro block to obtain an interpolated macro block, for an interpolated prediction mode; and
      
      obtaining a macro block from the merged VOP for the merge prediction mode.
  - 13. A prediction method according to claim 12 where the prediction of the enhancement layer further comprises:
    - calculating the absolute difference between the macro block from the current VOP and the corresponding forward motion compensated macro block of the forward prediction mode;
      
      calculating the absolute difference between the macro block from the current VOP and the corresponding backward motion compensated macro block of the backward prediction mode;
      
      calculating the absolute difference between the macro block from the current VOP and the corresponding interpolated macro block of the interpolated prediction mode;
      
      calculating the absolute difference between the macro block from the current VOP and the corresponding merged macro block in the merged prediction mode;
      
      selecting the prediction mode which results in the minimum absolute difference; and
      
      predicting each of the macro blocks by using the selected prediction mode.
  - 14. The prediction method according to claim 13 where the prediction of the enhancement layer is biased towards selecting the merged mode.
  - 15. A prediction method for improving coding efficiency in multiple layer video coding according to claim 10, where classifying each pixel in the current VOP into changed and unchanged, further comprises:
    - comparing the magnitude of the difference value to a predefined threshold;
      
      classifying the pixel as unchanged when the magnitude of the difference is less than or equal to the threshold; and
      
      classifying the pixel as changed when the magnitude of the difference is greater than the threshold.
  - 16. A prediction method according to claim 11 where merging of the groups of selected pixels further employs a smoothing filter or weighting function for pixels at a boundary between the groups of selected pixels.
  - 17. A prediction method for improving coding efficiency in multiple layer video encoding and decoding according to claim 1, where the constructing of the merged VOP is done at the macro block level.
  - 18. A prediction method for improving coding efficiency in multiple layer video encoding and decoding according to claim 1, where the prediction mode consists of a merge mode only.
  - 19. A prediction method for improving coding efficiency in multiple layer video encoding and decoding according to claim 1, where the prediction mode consists of a merge mode and at least one of a forward, backward and interpolated modes.
  - 20. A prediction method according to claim 1 where the VOP consists of arbitrarily shaped video objects.
  - 21. A prediction method according to claim 1 where the VOP consists of rectangular shaped video frames.
  - 26. A prediction method according to claim 13 where merging of the groups of selected pixels further employs a smoothing filter or weighting function for pixels at a boundary between the groups of selected pixels.
  - 31. A prediction method for improving coding efficiency in multiple layer video encoding and decoding according to claim 21, where the constructing of the merged VOP is done at the macro block level.
  - 32. A prediction method for improving coding efficiency in multiple layer video encoding and decoding according to claim 21, where the prediction mode consists of a merge mode only.
  - 33. A prediction method for improving coding efficiency in multiple layer video encoding and decoding according to claim 21, where the prediction mode consists of the merge mode and at least one of a forward, backward and interpolated modes.
  - 34. A prediction method according to claim 21 where the VOP consists of arbitrarily shaped video objects.
  - 35. A prediction method according to claim 21 where the VOP consists of rectangular shaped video frames.
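
The mode decision described in claims 4 to 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the "absolute difference" is interpreted as a sum of absolute differences (SAD) over the macro block, the interpolated block uses a rounded integer average, and the bias toward the merged mode is modeled as a hypothetical `merged_bias` subtracted from the merged cost. All function names are invented for the example.

```python
from typing import Dict, List

Block = List[List[int]]  # one macro block as rows of pixel values


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def interpolate(fwd: Block, bwd: Block) -> Block:
    """Average the forward and backward compensated blocks.

    Rounding half up is an assumption; the claims only say 'averaging'.
    """
    return [[(f + b + 1) // 2 for f, b in zip(rf, rb)]
            for rf, rb in zip(fwd, bwd)]


def choose_mode(current: Block, candidates: Dict[str, Block],
                merged_bias: int = 0) -> str:
    """Return the mode with the smallest SAD against the current block.

    merged_bias (hypothetical) lowers the merged mode's cost so that it
    wins near-ties, which is one way to bias the decision toward it.
    """
    costs = {mode: sad(current, block) for mode, block in candidates.items()}
    if "merged" in costs:
        costs["merged"] -= merged_bias
    return min(costs, key=costs.get)


# Made-up 2x2 blocks standing in for 16x16 macro blocks.
current = [[10, 20], [30, 40]]
forward = [[10, 20], [30, 44]]
backward = [[12, 22], [32, 42]]
candidates = {
    "forward": forward,
    "backward": backward,
    "interpolated": interpolate(forward, backward),
    "merged": [[11, 21], [31, 42]],
}

unbiased = choose_mode(current, candidates)
biased = choose_mode(current, candidates, merged_bias=2)
```

In this example the forward mode has the lowest raw cost, while a small bias is enough to select the merged mode instead. Since the merged mode needs no motion vectors, favoring it on near-ties plausibly saves bits, although the claims do not state how the bias is applied.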

22. A prediction method for improving coding efficiency by reducing temporal redundancy in an enhancement error signal of a multiple layer video decoding system, the method comprising:
- decoding a VOP of a base layer from a compressed bitstream to obtain a corresponding decoded VOP of the base layer;
  
  obtaining a pixel classification criteria from information in current and previous decoded VOP of the base layer;
  
  constructing a merged VOP, from a previous decoded VOP of an enhancement layer, the previous decoded VOP of the base layer, the current decoded VOP of the base layer and the pixel classification criteria;
  
  decoding information for prediction modes from header information transmitted by the encoder;
  
  decoding prediction errors of an enhancement layer VOP which was predicted according to a transmitted prediction mode, together with motion vector information received in the compressed bitstream;
  
  reconstructing the enhancement layer VOP from the merged VOP, the previous VOP in the same layer, and the VOP from the base layer with a same time reference; and
  
  repeating the above reconstruction steps on the enhancement layer treating a lower of two enhancement layers as the base layer.
- Dependent claims (23, 24, 25, 27, 28, 29, 30)
- - 23. A prediction method for improving coding efficiency in multiple layer video decoding according to claim 22 where obtaining the pixel classification criteria from the information in the current and previous decoded VOP of the base layer, further comprises:
    - up-sampling the base layer VOP by a filtering technique, if necessary, to form a scaled VOP of the same dimension as the enhancement layer; and
      
      classifying each pixel in the current VOP as changed or unchanged based on the magnitude of the difference between the current pixel and the pixel at a corresponding location in the previous VOP.
  - 24. A prediction method for improving coding efficiency in multiple layer video decoding according to claim 23 where constructing a merged VOP, from the previous decoded VOP of the enhancement layer, the previous decoded VOP of the base layer, the current decoded VOP of the base layer and the pixel classification criteria, further comprises:
    - selecting a first group of pixels for the merged VOP from the previous decoded VOP of the enhancement layer VOP at pixel locations classified as unchanged;
      
      selecting a second group of pixels for the merged VOP from the up-sampled VOP of the same time reference, at pixel locations classified as changed; and
      
      constructing the merged VOP by merging the groups of first and second selected pixels together.
  - 25. A prediction method for improving coding efficiency in multiple layer video coding according to claim 24 where the reconstruction of the enhancement layer VOP further comprises:
    - using the decoded information of the prediction mode and motion vectors to obtain one of the following prediction macro-blocks, a forward motion compensated macro block from the previous locally decoded VOP of the enhancement layer, a backward motion compensated macro block from the up-sampled VOP, of the same time reference as the current VOP, an interpolated macro block which is the average of the forward motion compensated macro block and backward motion compensated macro block, or a macro block from the merged VOP; and
      
      adding the obtained prediction macro-block to a decoded prediction difference to obtain a reconstructed macro-block.
  - 27. A prediction method for improving coding efficiency in multiple layer video decoding according to claim 22 where obtaining the pixel classification criteria from the information in the current and previous decoded VOP of the base layer, further comprises:
    - up-sampling the base layer VOP by a filtering technique, if necessary, to form a scaled VOP of the same dimension as the enhancement layer; and
      
      classifying each pixel in the current VOP as changed or unchanged based on the magnitude of the difference between the weighted sum of the current and surrounding pixels and the weighted sum of the pixels at corresponding locations in the previous VOP.
  - 28. A prediction method for improving coding efficiency in multiple layer video decoding according to claim 27 where constructing a merged VOP, from the previous decoded VOP of the enhancement layer, the previous decoded VOP of the base layer, the current decoded VOP of the base layer and the pixel classification criteria, further comprises:
    - selecting a first group of pixels for the merged VOP from the previous decoded VOP of the enhancement layer VOP at pixel locations classified as unchanged;
      
      selecting a second group of pixels for the merged VOP from the up-sampled VOP of the same time reference at pixel locations classified as changed; and
      
      constructing the merged VOP by merging the groups of first and second selected pixels together.
  - 29. A prediction method for improving coding efficiency in multiple layer video coding according to claim 28 where the reconstruction of the enhancement layer VOP, further comprises:
    - using the decoded information of the prediction mode and motion vectors to obtain one of the following prediction macro-blocks, a forward motion compensated macro block from the previous locally decoded VOP of the enhancement layer, a backward motion compensated macro block from the up-sampled VOP, of the same time reference as the current VOP, an interpolated macro-block which is the average of the forward motion compensated macro block and backward motion compensated macro block, or a macro block from the merged VOP; and
      
      adding the obtained prediction macro-block to a decoded prediction difference to obtain a reconstructed macro-block.
  - 30. A prediction method according to claim 28 where merging of the groups of selected pixels further employs a smoothing filter or weighting function for pixels at a boundary between the groups of selected pixels.
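
The decoder-side reconstruction in claims 25 and 29 amounts to selecting one prediction macro block according to the decoded mode and adding the decoded prediction difference. The sketch below assumes the four candidate blocks have already been formed and that samples are 8-bit, so the result is clipped to 0..255; the clipping range and the name `reconstruct_block` are assumptions for illustration only.

```python
from typing import Dict, List

Block = List[List[int]]  # one macro block as rows of pixel values


def reconstruct_block(mode: str, predictions: Dict[str, Block],
                      residual: Block) -> Block:
    """Add the decoded prediction difference to the block selected by mode.

    Clipping to the 8-bit range 0..255 is an assumption, not a claim element.
    """
    prediction = predictions[mode]
    return [
        [max(0, min(255, p + r)) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(prediction, residual)
    ]


# Made-up 1x2 blocks; a real decoder would hold all four candidate modes.
predictions = {
    "forward": [[90, 240]],
    "merged": [[100, 250]],
}
residual = [[-3, 10]]

rebuilt = reconstruct_block("merged", predictions, residual)
```

For a macro block decoded in the merged mode, the prediction comes from a merged VOP that the decoder builds itself from decoded base layer data, so only the mode and the residual need to be read from the bitstream.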

36. A prediction apparatus for improving coding efficiency by reducing the temporal redundancy in an enhancement error signal of a multiple layer video encoding system, the apparatus comprising:
- a divider which divides an input video sequence into a plurality of video object layers consisting of a base layer and several enhancement layers, and each video object layer further comprises a series of video object planes (VOP) comprising a two dimensional array of sampled pixel data at a particular time reference;
  
  an encoder which encodes a VOP of the base layer to obtain a compressed bitstream and a corresponding locally decoded VOP of the base layer;
  
  an obtaining part which obtains a pixel classification criteria from information in the current and previous locally decoded VOP of the base layer;
  
  a constructing part which constructs a merged VOP, from a previous locally decoded VOP of an enhancement layer, the previous locally decoded VOP of the base layer, the current locally decoded VOP of the base layer and the pixel classification criteria;
  
  a prediction part which predicts an enhancement layer VOP from the merged VOP, the previous VOP in the same layer, and the VOP from the base layer with the same time reference;
  
  a coder which entropy codes information for prediction modes as header information for the decoder; and
  
  a coder which codes prediction errors of the enhancement layer VOP which was predicted according to one of several prediction modes based on a mode decision, together with motion vector information, and transmitting them in a compressed bitstream to a decoder;
  
  wherein the prediction apparatus further operates by treating a lower of two enhancement layers as the base layer.
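
The coder that entropy codes the prediction mode information (see also claim 9) assigns shorter codes to modes that are selected more often. One common way to do this is Huffman coding, used here as a stand-in because the claims do not name a specific code. The sketch computes code lengths from selection counts; the counts and the name `mode_code_lengths` are hypothetical.

```python
import heapq
from typing import Dict


def mode_code_lengths(counts: Dict[str, int]) -> Dict[str, int]:
    """Return Huffman code lengths in bits for each prediction mode."""
    if len(counts) == 1:
        return {mode: 1 for mode in counts}
    # Heap entries: (total count, unique tie-breaker, {mode: depth so far}).
    heap = [(n, i, {mode: 0}) for i, (mode, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every mode in them one level deeper.
        deeper = {m: depth + 1 for m, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, deeper))
        tie += 1
    return heap[0][2]


# Made-up selection statistics gathered over some number of macro blocks.
counts = {"merged": 60, "forward": 25, "backward": 10, "interpolated": 5}
lengths = mode_code_lengths(counts)
```

With these made-up counts the most frequent mode receives a 1-bit code and the two rarest modes receive 3-bit codes, matching the rule that likelier modes get fewer bits.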

Current Assignee
Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Original Assignee
Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors
Shen, Sheng Mei; Tan, Thiow Keng
Primary Examiner(s)
Tung, Bryan
Assistant Examiner(s)
Vo, Tung T

Application Number

US08/749,849
Time in Patent Office

1,229 Days
Field of Search

348/397, 348/398, 348/399, 348/409, 348/415, 348/413, 348/416, 348/384, 348/390, 348/400, 348/401, 348/402, 348/699, 358/135, 358/136, 382/232, 382/236, 382/238
US Class Current

348/409.1
CPC Class Codes

H04N 19/29   involving scalability at th...

H04N 19/33   in the spatial domain

H04N 19/51   Motion estimation or motion...
