Skin tone and feature detection for video conferencing compression

US 8,588,309 B2
Filed: 04/07/2010
Issued: 11/19/2013
Est. Priority Date: 04/07/2010
Status: Active Grant

Abstract

In many videoconferencing applications, bandwidth is at a premium, and thus, it is important to encode a given video frame intelligently. It is often desirable that a larger amount of information be spent encoding the more important parts of the video frame, e.g., human facial features, whereas the less important parts of the video frame can be compressed at higher rates. Thus, there is need for an apparatus, computer readable medium, processor, and method for intelligent skin tone and facial feature aware videoconferencing compression that can “suggest” intelligent macroblock compression ratios to a video encoder. The suggestion of compression rates can be based at least in part on a determination of which macroblocks in a given video frame are likely to contain skin tones, likely to contain features (e.g., edges), likely to contain features in or near skin tone regions, or likely to contain neither skin tones nor features.
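
The two cues the abstract relies on, skin tone and features such as edges, can be illustrated with a minimal pure-Python sketch. It assumes the frame is already available as separate luma, Cb, and Cr planes stored as 2-D lists. The polygon vertices, the gradient threshold, and the function names are hypothetical values chosen for illustration; they are not taken from the patent.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical polygon in (Cb, Cr) space; an assumption for illustration only.
SKIN_POLYGON = [(80, 135), (125, 135), (125, 170), (100, 175), (80, 165)]


def skin_mask(cb, cr, polygon=SKIN_POLYGON):
    """Return a 2-D list of bools: True where (Cb, Cr) falls in the polygon."""
    return [[point_in_polygon(b, r, polygon) for b, r in zip(cb_row, cr_row)]
            for cb_row, cr_row in zip(cb, cr)]


def edge_mask(luma, threshold=40):
    """Return a 2-D list of bools: True where the luma gradient is strong.

    Uses clamped central differences and an L1 gradient magnitude, which is
    one simple edge detector among many; the threshold is an assumption.
    """
    h, w = len(luma), len(luma[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = luma[y][min(x + 1, w - 1)] - luma[y][max(x - 1, 0)]
            gy = luma[min(y + 1, h - 1)][x] - luma[max(y - 1, 0)][x]
            mask[y][x] = abs(gx) + abs(gy) > threshold
    return mask
```

A production encoder would likely use a tuned tone region and a more robust feature detector; the sketch only shows how each pixel ends up with one skin-tone flag and one feature flag.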

25 Claims

1. A processor programmed to perform a video compression method, the method comprising:
- determining with the processor any first pixels in a frame having color in a predetermined tone region;
- determining with the processor any second pixels in the frame that are part of at least one facial feature;
- for each pixel in the frame:
  - assigning a first value to the respective pixel if a location of the pixel is within a specified threshold distance to any of the first pixels, but the pixel is not one of the second pixels;
  - assigning a second value to the respective pixel if the pixel is one of the second pixels but is not located within a specified threshold distance to any of the first pixels;
  - assigning a third value to the respective pixel if the location of the pixel is within a specified threshold distance to any first pixel and the pixel is a second pixel;
  - assigning a fourth value to the respective pixel if the pixel is not located within a specified threshold distance to any first pixel and not one of the second pixels;
- averaging the assigned pixel values on a block by block basis within the frame to generate respective block scores;
- compressing with the processor the video frame using coding parameters selected based on the block scores.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 25):
  - 2. The processor of claim 1, wherein determining the first pixels comprises comparing each pixel's color to a polygonal tone region.
  - 3. The processor of claim 2, wherein the predetermined tone region comprises CbCr values indicative of human skin tones.
  - 4. The processor of claim 2, wherein the predetermined tone region comprises RGB values indicative of human skin tones.
  - 5. The processor of claim 1, wherein determining the second pixels comprises carrying out an edge detection process on the pixels in the frame.
  - 6. The processor of claim 1, wherein determining the second pixels comprises carrying out a feature detection process on the pixels in the luma space.
  - 7. The processor of claim 1, wherein the first, second, third and fourth values are all different values from each other.
  - 8. The processor of claim 1, wherein the second value is higher than the first value.
  - 9. The processor of claim 8, wherein the third value is higher than the second value.
  - 10. The processor of claim 9, wherein the first value is higher than the fourth value.
  - 11. The processor of claim 1, wherein compressing the blocks comprises:
    - comparing the score of a block to the scores of one or more neighboring blocks; and
    - adjusting the score of the given block based on a difference between the scores of the respective block and its neighbor(s).
  - 12. The processor of claim 10, wherein compressing the frame comprises compressing the blocks having a higher score with less compression than the blocks having a lower score.
  - 13. The processor of claim 1, wherein compressing the blocks comprises:
    - comparing the score of a given one of the blocks to the scores of one or more neighboring blocks; and
    - adjusting the score of the given block based on a discrepancy.
  - 14. The processor of claim 13, wherein the discrepancy is indicative of the given block having a lower or higher score compared to the one or more neighboring blocks.
  - 15. The processor of claim 1, wherein each of the blocks comprises a macroblock.
  - 16. The processor of claim 1, wherein compressing the regions comprises:
    - adjusting the score for each block having a score that differs from the score of one or more neighboring blocks by more than a threshold value to be equal to the average score of the neighboring blocks.
  - 17. A non-transitory storage medium having a computer readable program code embodied therein, wherein the computer readable program code is adapted to be executed to implement the method performed by the programmed processor of claim 1.
  - 25. The processor of claim 1, wherein the processor is further configured to:
    - determine if the pixel is within a specified threshold distance to any first pixel by comparing the location of that respective pixel with the location of any first pixel.
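
The per-pixel value assignment and block averaging recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the four numeric values are hypothetical (chosen only to respect the ordering in claims 8 to 10), the threshold distance is measured as Chebyshev distance in pixels (the claims do not fix a metric), and the function names are invented for this example.

```python
# Hypothetical values; the ordering follows claims 8-10:
# third (both) > second (feature only) > first (near skin only) > fourth.
V_SKIN_ONLY, V_FEATURE_ONLY, V_BOTH, V_NEITHER = 4, 8, 16, 1


def near_any(mask, radius):
    """True for every pixel within `radius` of a True pixel in `mask`.

    Distance is Chebyshev (a square neighborhood), which is an assumption.
    """
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = True
    return out


def pixel_values(skin, feature, radius=1):
    """Assign one of the four values to each pixel, as in claim 1."""
    near = near_any(skin, radius)
    values = []
    for near_row, feat_row in zip(near, feature):
        row = []
        for is_near, is_feat in zip(near_row, feat_row):
            if is_near and is_feat:
                row.append(V_BOTH)
            elif is_near:
                row.append(V_SKIN_ONLY)
            elif is_feat:
                row.append(V_FEATURE_ONLY)
            else:
                row.append(V_NEITHER)
        values.append(row)
    return values


def block_scores(values, block=16):
    """Average the pixel values over block x block tiles (16 for macroblocks)."""
    h, w = len(values), len(values[0])
    scores = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            cells = [values[y][x]
                     for y in range(by, min(by + block, h))
                     for x in range(bx, min(bx + block, w))]
            row.append(sum(cells) / len(cells))
        scores.append(row)
    return scores
```

For a 4x4 frame with one skin pixel in the top-left corner and feature pixels at (row 0, column 1) and (row 3, column 3), `block_scores(pixel_values(skin, feature), block=2)` gives the top-left block the highest score, because it contains a feature pixel next to skin.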

18. An apparatus, comprising:
- an image sensor for obtaining video data;
- memory operatively coupled to the image sensor; and
- a processor operatively coupled to the memory and the image sensor and programmed to encode the video data, the processor configured to:
  - determine any first pixels in a frame having color in a predetermined tone region;
  - determine any second pixels in the frame that are part of at least one facial feature;
  - for each pixel in the frame:
    - assign a first value to the respective pixel if a location of the pixel is within a specified threshold distance to any one of the first pixels, but the pixel is not one of the second pixels;
    - assign a second value to the respective pixel if the pixel is one of the second pixels but is not located within a specified threshold distance to any of the first pixels;
    - assign a third value to the respective pixel if the location of the pixel is within a specified threshold distance to any first pixel and the pixel is a second pixel;
    - assign a fourth value to the respective pixel if the pixel is not located within a specified threshold distance to any first pixel and not one of the second pixels;
  - average the assigned pixel values on a block by block basis within the frame to generate respective block scores; and
  - compress the video frame using coding parameters based at least in part on the block scores.
- Dependent claims (19, 20, 21, 22, 23, 24):
  - 19. The apparatus of claim 18, wherein the apparatus comprises at least one of the following:
    - a digital camera, digital video camera, mobile phone, personal data assistant, portable music player, and computer.
  - 20. The apparatus of claim 18, wherein the second value is higher than the first value.
  - 21. The apparatus of claim 20, wherein the third value is higher than the second value.
  - 22. The apparatus of claim 21, wherein the first value is higher than the fourth value.
  - 23. The apparatus of claim 21, wherein the processor is further configured to:
    - compare the score of a given one of the blocks to the scores of one or more neighboring blocks;
    - adjust the score of the given block based on a discrepancy between the scores; and
    - adjust the coding parameters based on the new scores.
  - 24. The apparatus of claim 22, wherein the processor is further configured to:
    - compress the blocks having a higher score with a higher bit rate than the blocks having a lower score.
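
The neighbor comparison of claim 23 and the score-dependent bit rate of claim 24 can be sketched as below. Several choices here are assumptions rather than anything the claims specify: the outlier rule compares each block to the mean of its up-to-eight neighbors (a simplified reading of the rule in claim 16), the threshold is arbitrary, and the linear mapping from score to a quantization parameter (QP) between 22 and 38 is hypothetical. The only property carried over from the claims is that higher-scoring blocks receive more bits, which in QP-based encoders corresponds to a lower QP.

```python
def smooth_scores(scores, threshold=6.0):
    """Replace outlier block scores with the mean of their neighbors.

    A block is treated as an outlier when its score differs from the mean
    of its neighbors by more than `threshold` (an assumed rule and value).
    """
    h, w = len(scores), len(scores[0])
    out = [row[:] for row in scores]
    for y in range(h):
        for x in range(w):
            neighbors = [scores[ny][nx]
                         for ny in range(max(0, y - 1), min(h, y + 2))
                         for nx in range(max(0, x - 1), min(w, x + 2))
                         if (ny, nx) != (y, x)]
            if not neighbors:
                continue  # a 1x1 score grid has nothing to compare against
            mean = sum(neighbors) / len(neighbors)
            if abs(scores[y][x] - mean) > threshold:
                out[y][x] = mean
    return out


def score_to_qp(score, min_score=1.0, max_score=16.0, qp_low=22, qp_high=38):
    """Map a block score to a QP: higher score -> lower QP -> more bits.

    The score range and QP range are hypothetical defaults.
    """
    t = (score - min_score) / (max_score - min_score)
    t = min(1.0, max(0.0, t))  # clamp scores outside the expected range
    return round(qp_high - t * (qp_high - qp_low))
```

With these defaults, an isolated high-scoring block surrounded by background blocks is pulled back to the background score, and `score_to_qp` returns a lower QP for a block scored 16 than for a block scored 1.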

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Doepke, Frank
Primary Examiner(s): Dastouri, Mehrdad
Assistant Examiner(s): Hallenbeck-Huber, Jeremiah Charles
Application Number: US12/755,551
Publication Number: US 20110249756A1
Time in Patent Office: 1,322 Days
Field of Search: 375/240.24
US Class Current: 375/240.24
CPC Class Codes

G06V 40/162   using pixel segmentation or...

H04N 19/115   Selection of the code volum...

H04N 19/14   Coding unit complexity, e.g...

H04N 19/167   Position within a video ima...

H04N 19/176   the region being a block, e...

H04N 7/147   Communication arrangements,...
