Perceptual optimization for model-based video encoding
First Claim
1. A method of encoding a plurality of video frames having non-overlapping target blocks, the method comprising:
- encoding the plurality of video frames using importance maps, such that the importance maps modify quantization affecting encoding quality of each target block being encoded in each video frame, the importance maps being formed by;
configuring the importance maps using temporal information and spatial information; and
computationally causing the importance maps to indicate which parts of a video frame in the plurality of video frames are most noticeable to human perception, wherein;
(i) in target blocks where the importance maps take on high values that are higher than an average value in a value range of the importance map based on perceptual statistics, reducing a block quantization parameter (QP) of each high-value target block relative to a frame quantization parameter (QPframe) of the video frame, resulting in increasing quality for the high-value target blocks, and
(ii) in target blocks where the importance maps take on low values that are lower than an average value in a value range of the importance map based on perceptual statistics, increasing a block quantization parameter (QP) of each low-value target block relative to the frame quantization parameter (QPframe), resulting in decreasing quality for the low-value target blocks, and
(iii) representing each reduction in block QP of high-value target blocks or increase in block QP of the low-value target blocks in the importance map as a QP offset;
wherein the spatial information for the importance maps is provided by a lookup table based on block variance, the lookup table indicating spatial QP offsets including negative spatial QP offsets for block variances lower than 200 and positive spatial QP offsets for block variances above 400;
wherein the temporal information for the importance maps is provided by an algorithm that determines encoding importance of each target block of the video frame for inter-prediction in future video frames, the algorithm assigning the target blocks spatial QP offsets, including assigning high-value target blocks negative temporal QP offsets; and
wherein total QP offset for a given target block is equal to spatial QP offset of the given target block plus temporal QP offset of the given target block, clipped to maximum and minimum allowable QP values in the video frame.
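The offset arithmetic recited in the claim can be illustrated with a minimal Python sketch. Only the sign of the spatial offset below a variance of 200 and above 400, and the sum-then-clip step, come from the claim text. The extra band edges (50 and 1000), the offset magnitudes, the 0 to 51 QP range, and the function names `spatial_qp_offset` and `block_qp` are hypothetical choices made for illustration. The temporal offset is taken as an input, since the claim does not spell out the algorithm that produces it, and clipping is read here as keeping the resulting block QP inside the allowable range.

```python
# Assumed allowable QP range (0..51, as in H.264-style codecs); the claim
# only says "maximum and minimum allowable QP values".
QP_MIN, QP_MAX = 0, 51


def spatial_qp_offset(block_variance):
    """Piecewise lookup from block variance to a spatial QP offset.

    Negative below 200 and positive above 400, as the claim recites.
    The inner band edges and the magnitudes are illustrative assumptions.
    """
    if block_variance < 50:
        return -4
    if block_variance < 200:
        return -2
    if block_variance <= 400:
        return 0
    if block_variance <= 1000:
        return 2
    return 4


def block_qp(qp_frame, block_variance, temporal_offset,
             qp_min=QP_MIN, qp_max=QP_MAX):
    """Frame QP plus (spatial + temporal) offset, clipped to the QP range."""
    total_offset = spatial_qp_offset(block_variance) + temporal_offset
    return max(qp_min, min(qp_max, qp_frame + total_offset))


if __name__ == "__main__":
    # Flat block that is also important for future inter-prediction:
    # both offsets are negative, so the block QP drops below the frame QP.
    print(block_qp(qp_frame=30, block_variance=100, temporal_offset=-3))
    # Busy block near the top of the range: the sum is clipped to QP_MAX.
    print(block_qp(qp_frame=50, block_variance=2000, temporal_offset=2))
```

A lower block QP means finer quantization, so blocks with low variance and a negative temporal offset receive more bits than the frame average.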
Abstract
Perceptual statistics are used to compute importance maps that indicate which regions of a video frame are important to the human visual system. Importance maps may be generated from encoders that produce motion vectors and employ motion estimation for inter-prediction. The temporal contrast sensitivity function (TCSF) may be computed from the encoder's motion vectors. Quality metrics may be used to construct a true motion vector map (TMVM), which refines the TCSF. Spatial complexity maps (SCMs) can be calculated from simple metrics (e.g., block variance, block luminance, SSIM, and edge detection). Importance maps with TCSF, TMVM, and SCM may be used to modify the standard rate-distortion optimization criterion for selecting the optimum encoding solution. Importance maps may modify encoder quantization. The spatial information for the importance maps may be provided by a lookup table based on block variance, where negative and positive spatial QP offsets for block variances are provided.
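Block variance, the first of the simple metrics named in the abstract, can be computed per non-overlapping block as in the following sketch. The function name `block_variances`, the use of population variance over luma samples, and the requirement that the frame dimensions be multiples of the block size are assumptions made for illustration; the abstract does not specify any of them.

```python
def block_variances(frame, block_size):
    """Return one variance value per non-overlapping block.

    frame: list of rows of luma samples (assumed input layout).
    block_size: side length of the square blocks; the frame height and
    width are assumed to be exact multiples of it.
    """
    rows, cols = len(frame), len(frame[0])
    variances = []
    for by in range(0, rows, block_size):
        row_out = []
        for bx in range(0, cols, block_size):
            # Gather the samples of the block whose top-left corner is (by, bx).
            samples = [frame[y][x]
                       for y in range(by, by + block_size)
                       for x in range(bx, bx + block_size)]
            mean = sum(samples) / len(samples)
            # Population variance: mean squared deviation from the block mean.
            row_out.append(sum((s - mean) ** 2 for s in samples) / len(samples))
        variances.append(row_out)
    return variances


if __name__ == "__main__":
    # One flat 2x2 block next to one block that alternates between 0 and 20.
    frame = [[10, 10, 0, 20],
             [10, 10, 0, 20]]
    print(block_variances(frame, 2))
```

The resulting grid of variances is the kind of input a variance-based lookup table for spatial QP offsets would consume.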
17 Claims
1. A method of encoding a plurality of video frames having non-overlapping target blocks (recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer system encoding a plurality of video frames having non-overlapping target blocks, the computer system comprising:
- at least one processor executing an encoder;
the encoder encoding the plurality of video frames using importance maps, such that the importance maps modify quantization affecting encoding quality of each target block being encoded in each video frame, the importance maps being formed by;
configuring the importance maps using temporal information and spatial information; and
computationally causing the importance maps to indicate which parts of a video frame in the plurality of video frames are most noticeable to human perception, wherein;
(i) in target blocks where the importance maps take on high values that are higher than an average value in a value range of the importance map based on perceptual statistics, reducing a block quantization parameter (QP) of each high-value target block relative to a frame quantization parameter (QPframe) of the video frame, resulting in increasing quality for the high-value target blocks,
(ii) in target blocks where the importance maps take on low values that are lower than an average value in a value range of the importance map based on perceptual statistics, increasing a block quantization parameter (QP) of each low-value target block relative to the frame quantization parameter (QPframe), resulting in decreasing quality for the low-value target blocks, and
(iii) representing each reduction in block QP of high-value target blocks or increase in block QP of the low-value target blocks in the importance map as a QP offset;
wherein the spatial information for the importance maps is provided by a lookup table based on block variance, the lookup table indicating spatial QP offsets, including negative spatial QP offsets for block variances lower than 200 and positive spatial QP offsets for block variances above 400;
wherein the temporal information for the importance maps is provided by an algorithm that determines encoding importance of each target block of the video frame for inter-prediction in future video frames, the algorithm assigning target blocks spatial QP offsets, including assigning high-value target blocks negative temporal QP offsets; and
wherein total QP offset for a given target block is equal to spatial QP offset of the given target block plus temporal QP offset of the given target block, clipped to the maximum and minimum allowable QP values in the video frame.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer program product having computer readable program code stored on a non-transitory storage medium, the computer readable program code causing a plurality of video frames having non-overlapping target blocks to be encoded, the computer comprising:
- the computer readable program code implementing an encoder encoding the plurality of video frames using importance maps, such that the importance maps modify quantization affecting encoding quality of each target block to be encoded in each video frame, the importance maps being formed by the encoder;
configuring the importance maps using temporal information and spatial information; and
computationally causing the importance maps to indicate which parts of a video frame in the plurality of video frames are most noticeable to human perception, wherein;
(i) in target blocks where the importance maps take on high values that are higher than an average value in a value range of the importance map based on perceptual statistics, reducing a block quantization parameter (QP) of each high-value target block relative to a frame quantization parameter (QPframe), resulting in increasing quality for the high-value target blocks, and
(ii) in target blocks where the importance maps take on low values that are lower than an average value in a value range of the importance map based on perceptual statistics, increasing a block quantization parameter (QP) of each low-value target block relative to the frame quantization parameter (QPframe), resulting in decreasing quality for the low-value target blocks, and
(iii) representing each reduction in block QP of high-value target blocks or increase in block QP of the low-value target blocks in the importance map as a QP offset;
wherein the spatial information for the importance maps is provided by a lookup table based on block variance, the lookup table indicating spatial QP offsets including negative spatial QP offsets for block variances lower than 200 and positive spatial QP offsets for block variances above 400;
wherein the temporal information for the importance maps is provided by an algorithm that determines encoding importance of each target block of the video frame for inter-prediction in future video frames, the algorithm assigning the target blocks spatial QP offsets, including assigning high-value target blocks negative temporal QP offsets; and
wherein total QP offset for a given target block is equal to spatial QP offset of the given target block plus temporal QP offset of the given target block, clipped to maximum and minimum allowable QP values in the video frame.
Specification