Perceptual optimization for model-based video encoding

US 10,091,507 B2
Filed: 09/03/2015
Issued: 10/02/2018
Est. Priority Date: 03/10/2014
Status: Active Grant

Abstract

Perceptual statistics may be used to compute importance maps that indicate which regions of a video frame are important to the human visual system. Importance maps may be applied to the video encoding process to enhance the quality of encoded bitstreams. The temporal contrast sensitivity function (TCSF) may be computed from the encoder's motion vectors. Motion vector quality metrics may be used to construct a true motion vector map (TMVM) that can be used to refine the TCSF. Spatial complexity maps (SCMs) can be calculated from metrics such as block variance, block luminance, SSIM, and edge strength, and the SCMs can be combined with the TCSF to obtain a unified importance map. Importance maps may be used to improve encoding by modifying the criterion for selecting optimum encoding solutions or by modifying the quantization for each target block to be encoded.
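
The quantization-modifying use of an importance map described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `block_qp_from_importance`, the importance range `[lo, hi]`, the use of the range midpoint as the "average value", the `max_offset` of 4, and the 0 to 51 QP clamp (the H.264 range for 8-bit video) are all assumptions made for illustration.

```python
def block_qp_from_importance(importance, qp_frame, max_offset=4,
                             lo=0.0, hi=1.0, qp_min=0, qp_limit=51):
    """Return a block QP derived from a per-block importance value.

    Assumption: importance lies in [lo, hi] and the midpoint of that range
    plays the role of the "average value". Blocks above the midpoint get a
    lower QP (higher quality); blocks below it get a higher QP.
    """
    mid = (lo + hi) / 2.0
    half_range = (hi - lo) / 2.0
    # Signed deviation from the midpoint, clamped to [-1, 1].
    deviation = max(-1.0, min(1.0, (importance - mid) / half_range))
    # Hypothetical linear mapping from deviation to a QP offset.
    qp = qp_frame - round(deviation * max_offset)
    # Keep the result inside the assumed codec QP range.
    return max(qp_min, min(qp_limit, qp))


if __name__ == "__main__":
    for imp in (0.0, 0.5, 1.0):
        print(imp, block_qp_from_importance(imp, qp_frame=30))
```

With a frame QP of 30, the most important blocks are encoded at QP 26 and the least important at QP 34 under these assumed parameters; a real encoder would tune the offset range empirically.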

30 Claims

1. A method of encoding a plurality of video frames having non-overlapping target blocks, the method comprising:
- encoding, via an encoder, the plurality of video frames using importance maps, such that the importance maps modify quantization affecting encoding quality of each target block being encoded in each video frame, the importance maps being formed by:
  
  configuring the importance maps using temporal information and spatial information; and
  
  computationally causing the importance maps to indicate which parts of a video frame in the plurality of video frames are most noticeable to human perception, including:
  
  (i) in target blocks where the importance maps take on high values that are higher than an average value in a value range of the importance map based on perceptual statistics, reducing a block quantization parameter (QP) of each high-value target block relative to the frame quantization parameter (QP_frame), resulting in increasing quality for the high-value target blocks, and
  
  (ii) in target blocks where the importance maps take on low values that are lower than an average value in a value range of the importance map based on perceptual statistics, increasing a block quantization parameter (QP) of each low-value target block relative to the frame quantization parameter (QP_frame), resulting in decreasing quality for the low-value target blocks;
  
  wherein the spatial information is provided by a rule-based spatial complexity map (SCM), an initial step of the SCM determines target blocks in a video frame that have a high-variance relative to an average block variance of the video frame (var_frame); and
  
  wherein, for each determined target block, the SCM:
  
  (a) assigns a higher value to a block quantization parameter (QP) of the determined target block than set for the frame quantization parameter (QP_frame) of the video frame,
  
  (b) scales the assigned value of the block QP (QP_block) of the determined target block linearly between QP_frame and a maximum quantization parameter (QP_max) based on difference in variance of the determined target block (var_block) and var_frame, and
  
  (c) refines the QP_block by a temporal contrast sensitivity function (TCSF) and a true motion vector map (TMVM), such that if the TMVM identifies the determined target block as foreground data and the TCSF has a log contrast sensitivity value less than 0.5, raising the QP_block by 2.
  - 2. The method as in claim 1, wherein the temporal information is provided by:
    - the temporal contrast sensitivity function (TCSF) identifying target blocks in the video frame most temporally noticeable to a human observer, and
      
      the true motion vector map (TMVM) identifying target blocks in the video frame corresponding to foreground data, the TCSF being valid for the target blocks in the video frame identified by the TMVM as foreground data.
  - 3. The method as in claim 1, wherein the SCM includes luminance masking that adjusts QP_block to QP_max for target blocks with luminance above 170 or luminance below 60.
  - 4. The method as in claim 1, wherein the SCM dynamically determines QP_max based on quality of the encoded video frames, the dynamic determination of QP_max comprising:
    - measuring the quality using an average structural similarity (SSIM) calculation of target blocks in Intra (I) frames, and average block variance (var_frame) of the I frames; and
      
      adjusting QP_max relative to QP_frame based on the measured quality.
  - 5. The method as in claim 1, further comprising:
    - assigning target blocks having very-low-variance that is lower than a variance threshold indicative of a flat data block fixed QP values (QP_block), such that lower the target block variance, lower the assigned fixed value of QP_block.
  - 6. The method as in claim 5, wherein the assignment of the fixed, QP values (QP_block) comprises:
    - fixing the assignment for I frames; and
      
      determining the assignment for P and B frames using ipratio and pbratio parameters.
  - 7. The method as in claim 5, further comprising:
    - examining target blocks that qualify as low-variance for having variance lower than the average block variance and not qualify as very-low-variance to determine whether quality enhancement is needed for the examined target blocks, for each of the examined target blocks;
      
      calculating an initial estimate of the block QP (QP_block) of the examined target block by averaging the QP values of encoded neighboring target blocks to the left, top-left, right, and top-right of the examined target block;
      
      calculating an estimate of the SSIM of the examined target block (SSIM_est) from the SSIM values of the encoded neighboring target blocks to the left, top-left, right, and top-right of the examined target block; and
      
      lowering the value of QP_block by 2, if SSIM_est is lower than 0.9.
  - 8. The method as in claim 7, wherein the quality enhancement is only applied to target blocks that are identified as foreground data by the TMVM and that the TCSF has a log contrast sensitivity value greater than 0.8.
  - 9. The method as in claim 2, further comprising:
    - computing temporal frequency of the TCSF by:
      
      (i) using SSIM in the colorspace domain between a given target block and a reference block of the given target block to approximate wavelength and (ii) using motion vector magnitudes and framerate to approximate velocity.
  - 10. The method as in claim 2, wherein the TCSF is calculated over multiple frames, such that the TCSF for a given video frame is a weighted average of the TCSF maps over a currently encoded video frame and previous encoded video frames, with weighting applied based on encoding order of video frames.
  - 11. The method as in claim 2, wherein the TMVM is set to 1 only for foreground data.
  - 12. The method as in claim 11, further comprising:
    - identifying foreground data by computing difference between an encoder motion vector for a given target block and a global motion vector for the given target block, and determining the given target block to be foreground data based on magnitude of the difference.
  - 13. The method as in claim 12, further comprising:
    - for target blocks that are identified as foreground data:
      
      subtracting the encoder motion vector from the global motion vector to obtain a differential motion vector, and
      
      using magnitude of the obtained differential motion vector in calculating the temporal frequency of the TCSF.
  - 14. The method as in claim 2, wherein the TCSF is computed from motion vectors from the encoder.
  - 15. The method as in claim 1, wherein the importance map configured with the temporal information and the spatial information being a unified importance map.
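
The high-variance rules recited in steps (a) through (c) of claim 1 can be sketched in Python as follows. This is an illustrative reading, not the patentee's code: the function name `scm_high_variance_qp`, the parameter `var_max` (the variance at which QP_max is reached), the `var_block > var_frame` test for "high-variance", and the final clamp to 51 are assumptions that the claim does not specify.

```python
def scm_high_variance_qp(var_block, var_frame, var_max, qp_frame, qp_max,
                         is_foreground, log_tcsf, qp_limit=51):
    """Sketch of the SCM rule for high-variance blocks.

    var_max and qp_limit are hypothetical parameters; the claim only says
    the scaling is linear between QP_frame and QP_max.
    """
    # Assumed test for "high-variance relative to var_frame".
    if var_block <= var_frame:
        return qp_frame  # not handled by this rule

    # (a)/(b): raise QP above qp_frame, scaled linearly toward qp_max
    # according to how far the block variance exceeds the frame average.
    span = max(var_max - var_frame, 1e-9)
    t = min((var_block - var_frame) / span, 1.0)
    qp_block = round(qp_frame + t * (qp_max - qp_frame))

    # (c): foreground block (per the TMVM) with log contrast sensitivity
    # below 0.5 gets its QP raised by 2.
    if is_foreground and log_tcsf < 0.5:
        qp_block += 2

    return min(qp_block, qp_limit)


if __name__ == "__main__":
    print(scm_high_variance_qp(300, 100, 500, 30, 40, True, 0.3))
```

In this sketch a block halfway between `var_frame` and `var_max` lands halfway between QP_frame and QP_max before the temporal refinement is applied.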

16. A system of encoding video data, the system comprising:
- An encoder using importance maps to encode a plurality of video frames having non-overlapping target blocks; and
  
  the importance maps configured to modify quantization affecting encoding quality of each target block being encoded in each video frame, the importance maps being formed by:
  
  configuring the importance maps using temporal information and spatial information; and
  
  the encoder computationally causing the importance maps to indicate which parts of a video frame in the plurality of video frames are most noticeable to human perception, including:
  
  (i) in target blocks where the importance maps take on high values that are higher than an average value in a value range of the importance map based on perceptual statistics, reducing a block quantization parameter (QP) of each high-value target block relative to the frame quantization parameter (QP_frame), resulting in increasing quality for the high-value target blocks, and
  
  (ii) in target blocks where the importance maps take on low values that are lower than an average value in a value range of the importance map based on perceptual statistics, increasing a block quantization parameter (QP) of each low-value target block relative to the frame quantization parameter (QP_frame), resulting in decreasing quality for the low-value target blocks;
  
  wherein the spatial information is provided by a rule-based spatial complexity map (SCM), an initial step of the SCM determines target blocks in a video frame that have a high-variance relative to an average block variance of the video frame (var_frame); and
  
  wherein, for each determined target block, the SCM:
  
  (a) assigns a higher value to a block quantization parameter (QP) of the determined target block than set for the frame quantization parameter (QP_frame) of the video frame,
  
  (b) scales the assigned value of the block QP (QP_block) of the determined target block linearly between QP_frame and a maximum quantization parameter (QP_max) based on difference in variance of the determined target block (var_block) and var_frame, and
  
  (c) refines the QP_block by a temporal contrast sensitivity function (TCSF) and a true motion vector map (TMVM), such that if the TMVM identifies the determined target block as foreground data and the TCSF has a log contrast sensitivity value less than 0.5, raising the QP_block by 2.
  - 17. The system as in claim 16, wherein the temporal information is provided by:
    - the temporal contrast sensitivity function (TCSF) identifying target blocks in the video frame most temporally noticeable to a human observer, and
      
      the true motion vector map (TMVM) identifying target blocks in the video frame corresponding to foreground data, the TCSF being valid for the target blocks identified by the TMVM as foreground data.
  - 18. The system as in claim 16, wherein the SCM includes luminance masking that adjusts QP_block to QP_max for target blocks with luminance above 170 or luminance below 60.
  - 19. The system as in claim 16, wherein the SCM dynamically determines QP_max based on quality of the encoded video frames, the dynamic determination of QP_max comprising:
    - measuring the quality using an average structural similarity (SSIM) calculation of target blocks in Intra (I) frames, and average block variance (var_frame) of the I frames; and
      
      adjusting QP_max relative to QP_frame based on the measured quality.
  - 20. The system as in claim 16, wherein the encoder further:
    - assigning target blocks having very-low-variance that is lower than a variance threshold indicative of a flat data block fixed QP values (QP_block), such that lower the target block variance, lower the assigned fixed value of QP_block.
  - 21. The system as in claim 20, wherein the encoder assigns the fixed, low QP values (QP_block) by:
    - fixing the assignment for I frames; and
      
      determining the assignment for P and B frames using ipratio and pbratio parameters.
  - 22. The system as in claim 16, wherein further the encoder further:
    - examining target blocks that qualify as low-variance for having variance lower than the average block variance and not qualify as very-low-variance to determine whether quality enhancement is needed for the examined target blocks, for each of the examined target blocks;
      
      calculating an initial estimate of the block QP (QP_block) of the examined target block by averaging the QP values of encoded neighboring target blocks to the left, top-left, right, and top-right of the examined target block;
      
      calculating an estimate of the SSIM of the examined target block (SSIM_est) from the SSIM values of the encoded neighboring target blocks to the left, top-left, right, and top-right of the examined target block; and
      
      lowering the value of QP_block by 2, if SSIM_est is lower than 0.9.
  - 23. The system as in claim 22, wherein the quality enhancement is only applied to target blocks that are identified as foreground data by the TMVM and that the TCSF has a log contrast sensitivity value greater than 0.8.
  - 24. The system as in claim 17, wherein the encoder further:
    - computing temporal frequency of the TCSF by:
      
      (i) using SSIM in the colorspace domain between a given target block and a reference block of the given target block to approximate wavelength and (ii) using motion vector magnitudes and framerate to approximate velocity.
  - 25. The system as in claim 17, wherein the encoder further:
    - calculating the TCSF over multiple frames, such that the TCSF for a given video frame is a weighted average of the TCSF maps over a currently encoded video frame and previous encoded video frames, with weighting applied based on encoding order of the video frames.
  - 26. The system as in claim 17, wherein the TMVM is set to 1 only for foreground data.
  - 27. The system as in claim 26, wherein the encoder further:
    - identifying foreground data by computing difference between an encoder motion vector for a given target block and a global motion vector for the given target block, and determining the given target block to be foreground data based on magnitude of the difference.
  - 28. The system as in claim 17, wherein the encoder further:
    - for target blocks that are identified as foreground data:
      
      subtracting the encoder motion vector from the global motion vector to obtain a differential motion vector, and
      
      using magnitude of the obtained differential motion vector in calculating the temporal frequency of the TCSF.
  - 29. The system as in claim 17, wherein the encoder computing the TCSF from motion vectors.
  - 30. The system as in claim 16, wherein the importance map configured with the temporal information and spatial information being a unified importance map.
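
The foreground test recited in claims 12 and 27, together with the differential motion vector of claims 13 and 28, might look like the following Python sketch. The claims do not say how the global motion vector is obtained or what magnitude counts as foreground, so the per-component median used as the global motion vector, the `threshold` of 1.0, and the function name `true_motion_vector_map` are all hypothetical choices.

```python
import math
from statistics import median


def true_motion_vector_map(motion_vectors, threshold=1.0):
    """Return (tmvm, diff_magnitudes) for a list of per-block (dx, dy) vectors.

    Assumption: the global motion vector is approximated by the median of
    the encoder motion vectors, component by component.
    """
    global_dx = median(dx for dx, _ in motion_vectors)
    global_dy = median(dy for _, dy in motion_vectors)

    tmvm, diff_magnitudes = [], []
    for dx, dy in motion_vectors:
        # Magnitude of the differential motion vector (encoder MV vs. global MV).
        magnitude = math.hypot(dx - global_dx, dy - global_dy)
        diff_magnitudes.append(magnitude)
        # TMVM is set to 1 only for blocks judged to be foreground.
        tmvm.append(1 if magnitude > threshold else 0)
    return tmvm, diff_magnitudes


if __name__ == "__main__":
    vectors = [(2, 0), (2, 0), (2, 0), (2, 0), (8, 3)]
    print(true_motion_vector_map(vectors))
```

Here four blocks follow the assumed global motion and are marked background, while the fifth moves differently and is marked foreground; its differential magnitude is the value that would feed the temporal frequency calculation of the TCSF.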

Current Assignee: Euclid Discoveries LLC
Original Assignee: Euclid Discoveries LLC
Inventors: Lee, Nigel; Park, Sangseok; Tun, Myo; Kottke, Dane P.; Lee, Jeyun; Weed, Christopher
Primary Examiner(s): Vaughn, Jr., William C
Assistant Examiner(s): Uhl, Lindsay
Application Number: US14/845,067
Publication Number: US 20160073111A1
Time in Patent Office: 1,125 Days
CPC Class Codes:
H04N 19/124   Quantisation
H04N 19/139   Analysis of motion vectors,...
H04N 19/14   Coding unit complexity, e.g...
H04N 19/176   the region being a block, e...
H04N 19/527   Global motion vector estima...
