Skin tone and feature detection for video conferencing compression
First Claim
1. A processor programmed to perform a video compression method, the method comprising:
determining with the processor any first pixels in a frame having color in a predetermined tone region;
determining with the processor any second pixels in the frame that are part of at least one facial feature;
for each pixel in the frame:
assigning a first value to the respective pixel if a location of the pixel is within a specified threshold distance to any of the first pixels, but the pixel is not one of the second pixels;
assigning a second value to the respective pixel if the pixel is one of the second pixels but is not located within a specified threshold distance to any of the first pixels;
assigning a third value to the respective pixel if the location of the pixel is within a specified threshold distance to any first pixel and the pixel is a second pixel;
assigning a fourth value to the respective pixel if the pixel is not located within a specified threshold distance to any first pixel and not one of the second pixels;
averaging the assigned pixel values on a block by block basis within the frame to generate respective block scores;
compressing with the processor the video frame using coding parameters selected based on the block scores.
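The per-pixel assignment and block averaging recited above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the four numeric values, the use of Chebyshev (square-window) distance for the "threshold distance", and the names `near_any`, `score_pixels`, and `block_scores` are all assumptions chosen for the example. The claim itself only requires four values and some specified threshold distance.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)

# Hypothetical values; the claim only requires four distinguishable values.
NEAR_TONE_ONLY = 2.0         # "first value": near a tone pixel, not a feature
FEATURE_ONLY = 1.0           # "second value": feature, not near a tone pixel
NEAR_TONE_AND_FEATURE = 3.0  # "third value": feature near a tone pixel
NEITHER = 0.0                # "fourth value": neither condition holds


def near_any(pixel: Pixel, tone_pixels: Set[Pixel], threshold: int) -> bool:
    """True if pixel lies within `threshold` of any tone pixel.

    Assumes Chebyshev distance; the claim does not name a metric.
    """
    r, c = pixel
    return any(max(abs(r - tr), abs(c - tc)) <= threshold
               for tr, tc in tone_pixels)


def score_pixels(height: int, width: int, tone_pixels: Set[Pixel],
                 feature_pixels: Set[Pixel], threshold: int) -> List[List[float]]:
    """Assign one of the four values to every pixel in the frame."""
    values = []
    for r in range(height):
        row = []
        for c in range(width):
            near = near_any((r, c), tone_pixels, threshold)
            feature = (r, c) in feature_pixels
            if near and feature:
                row.append(NEAR_TONE_AND_FEATURE)
            elif near:
                row.append(NEAR_TONE_ONLY)
            elif feature:
                row.append(FEATURE_ONLY)
            else:
                row.append(NEITHER)
        values.append(row)
    return values


def block_scores(values: List[List[float]], block: int) -> List[List[float]]:
    """Average the per-pixel values over each block-by-block tile."""
    height, width = len(values), len(values[0])
    scores = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            cells = [values[r][c]
                     for r in range(top, min(top + block, height))
                     for c in range(left, min(left + block, width))]
            row.append(sum(cells) / len(cells))
        scores.append(row)
    return scores


# Toy 4x4 frame: one tone pixel, two feature pixels, 2x2 blocks.
frame_values = score_pixels(4, 4, {(0, 0)}, {(0, 1), (3, 3)}, threshold=1)
frame_scores = block_scores(frame_values, block=2)
```

In this toy frame the top-left block scores highest because it contains tone-adjacent pixels and a feature pixel inside the tone neighbourhood, while the bottom-right block gets a small score from an isolated feature pixel.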
Abstract
In many videoconferencing applications, bandwidth is at a premium, so it is important to encode a given video frame intelligently. It is often desirable to spend more information encoding the more important parts of the video frame, e.g., human facial features, while compressing the less important parts at higher rates. Thus, there is a need for an apparatus, computer readable medium, processor, and method for intelligent skin tone and facial feature aware videoconferencing compression that can “suggest” intelligent macroblock compression ratios to a video encoder. The suggested compression rates can be based at least in part on a determination of which macroblocks in a given video frame are likely to contain skin tones, likely to contain features (e.g., edges), likely to contain features in or near skin tone regions, or likely to contain neither skin tones nor features.
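One way a per-macroblock score might be turned into a "suggestion" for the encoder is sketched below. The abstract does not say how scores map to compression ratios, so everything here is an assumption for illustration: the linear mapping, the use of a quantization parameter (QP) as the coding parameter, the QP bounds, and the name `suggest_qp`.

```python
def suggest_qp(score: float, max_score: float,
               qp_min: int = 20, qp_max: int = 40) -> int:
    """Map a block score to a suggested quantization parameter.

    Hypothetical linear mapping: a higher score (more skin tone and
    feature content) yields a lower QP, i.e. lighter compression.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    # Clamp so out-of-range scores still give a QP inside [qp_min, qp_max].
    fraction = min(max(score / max_score, 0.0), 1.0)
    return round(qp_max - fraction * (qp_max - qp_min))


# Blocks with neither skin tones nor features get the coarsest setting;
# blocks with features inside skin-tone regions get the finest.
background_qp = suggest_qp(0.0, max_score=3.0)
face_qp = suggest_qp(3.0, max_score=3.0)
mixed_qp = suggest_qp(1.5, max_score=3.0)
```

A real encoder integration would pass these values through whatever per-macroblock rate-control hook the encoder exposes; that interface is outside what the text describes.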
25 Claims
1. A processor programmed to perform a video compression method, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 25
18. An apparatus, comprising:
an image sensor for obtaining video data;
memory operatively coupled to the image sensor; and
a processor operatively coupled to the memory and the image sensor and programmed to encode the video data, the processor configured to:
determine any first pixels in a frame having color in a predetermined tone region;
determine any second pixels in the frame that are part of at least one facial feature;
for each pixel in the frame:
assign a first value to the respective pixel if a location of the pixel is within a specified threshold distance to any one of the first pixels, but the pixel is not one of the second pixels;
assign a second value to the respective pixel if the pixel is one of the second pixels but is not located within a specified threshold distance to any of the first pixels;
assign a third value to the respective pixel if the location of the pixel is within a specified threshold distance to any first pixel and the pixel is a second pixel;
assign a fourth value to the respective pixel if the pixel is not located within a specified threshold distance to any first pixel and not one of the second pixels;
average the assigned pixel values on a block by block basis within the frame to generate respective block scores; and
compress the video frame using coding parameters based at least in part on the block scores.
Dependent claims: 19, 20, 21, 22, 23, 24
Specification