Automatic face and facial feature location detection for low bit rate model-assisted H.261 compatible coding of video
Abstract
An apparatus responds to a video signal representing a succession of frames, where at least one of the frames corresponds to an image of an object, to detect at least a region of the object. The apparatus includes a processor for processing the video signal to detect at least the region of the object characterized by at least a portion of a closed curve and to generate a plurality of parameters associated with the closed curve for use in coding the video signal.
21 Claims
1. An apparatus for coding a video signal representing a succession of frames, at least one of the frames including an image of a human subject, the apparatus comprising:
- a processor for processing the video signal to detect at least a region of a head outline of the human subject characterized by at least a portion of an ellipse, and to generate a plurality of parameters associated with the ellipse for use in coding the video signal;
wherein said processor is adapted to define a rectangular window encompassing an eyes-nose-mouth region of the subject as a function of the ellipse, and to determine an amount of slant of a symmetrical axis of the window with respect to a vertical axis of the image to enhance detection and image quality of the eyes-nose-mouth region of the subject.
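As a rough illustration of the geometry recited in claim 1, the following Python sketch derives an upright window from ellipse parameters and measures the slant of a given symmetry axis against the image vertical. It is only a sketch under stated assumptions: the `Ellipse` fields, the `enm_window` and `slant_degrees` names, the window fractions, and the sign convention are all hypothetical and are not taken from the claims. The sketch also does not locate the symmetry axis itself; it only measures the slant once two points on the axis are known.

```python
import math
from dataclasses import dataclass


@dataclass
class Ellipse:
    cx: float  # center x, in pixels
    cy: float  # center y, in pixels (image y grows downward)
    a: float   # horizontal semi-axis
    b: float   # vertical semi-axis


def enm_window(e, width_frac=0.75, up_frac=0.25, down_frac=0.5):
    """Upright window (x_min, y_min, x_max, y_max) derived from the ellipse.

    The three fractions are illustrative assumptions, not values from the claims.
    """
    return (e.cx - width_frac * e.a, e.cy - up_frac * e.b,
            e.cx + width_frac * e.a, e.cy + down_frac * e.b)


def slant_degrees(axis_top, axis_bottom):
    """Angle between a symmetry axis and the image vertical, in degrees.

    Assumed sign convention: positive when the bottom point of the axis
    lies to the right of the top point.
    """
    dx = axis_bottom[0] - axis_top[0]
    dy = axis_bottom[1] - axis_top[1]
    return math.degrees(math.atan2(dx, dy))


head = Ellipse(cx=100.0, cy=120.0, a=40.0, b=60.0)
window = enm_window(head)
slant = slant_degrees((100.0, 105.0), (110.0, 150.0))
```

Measuring the slant with `math.atan2(dx, dy)` rather than `atan(dx / dy)` avoids a division by zero when the axis is horizontal, which is one possible design choice among several.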
10. A method for coding a video signal representing a succession of frames, at least one of the frames including an image of a human subject, the method comprising the steps of:
- detecting at least a region of a head outline of the human subject characterized by at least a portion of an ellipse;
- generating a plurality of parameters associated with the ellipse;
- coding the video signal using the plurality of parameters;
- defining a rectangular window encompassing an eyes-nose-mouth region of the subject as a function of the ellipse; and
- determining an amount of slant of a symmetrical axis of the window with respect to a vertical axis of the image to enhance detection and image quality of the eyes-nose-mouth region.
20. A coding controller responsive to an object detection signal for controlling the coding of a video signal representing a succession of frames, at least one of the frames including an image of an object, the object detection signal indicating a detected region of the object, the coding controller comprising:
- a processor for performing buffer size modulation and responsive to the object detection signal for adjusting a quantizer step size used in the coding of the video signal to increase a buffer size for coding of the detected region of the object, the coding being based on quantization of discrete cosine transform (DCT) coefficients, wherein said buffer size modulation is performed in accordance with the following equation:
##EQU15##
where $Qp_i$ is an updated quantizer step size for a particular macroblock $i$, $B_i$ is an output buffer occupancy prior to coding macroblock $i$, $f$ is a function of the input video signal, $\mu_i$ is a modulation factor for macroblock $i$, and $\zeta(i)$ is a region index function that associates a position of macroblock $i$ with a region in which macroblock $i$ belongs, wherein a macroblock is defined to belong to a region if at least one of its pixels is inside that region.
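The membership rule for the region index function (a macroblock belongs to a region if at least one of its pixels is inside that region) can be sketched in Python as follows. This is a minimal sketch under assumptions: the `region_index` name, the binary `mask` representation, the two-valued index (1 for the detected region, 0 otherwise), and the simple raster ordering of macroblocks are all hypothetical simplifications. The 16x16 size reflects the luminance extent of an H.261 macroblock.

```python
MB_SIZE = 16  # an H.261 macroblock covers 16x16 luminance pixels


def region_index(mask, i, mb_size=MB_SIZE):
    """Return 1 if macroblock i has at least one pixel inside the region, else 0.

    mask is a list of rows of 0/1 values (1 = pixel inside the detected region).
    Macroblocks are numbered in plain raster order, a simplifying assumption.
    """
    mbs_per_row = len(mask[0]) // mb_size
    row, col = divmod(i, mbs_per_row)
    y0, x0 = row * mb_size, col * mb_size
    return int(any(mask[y][x]
                   for y in range(y0, y0 + mb_size)
                   for x in range(x0, x0 + mb_size)))


# A 32x32 image (2x2 macroblocks) with a single region pixel at row 20, column 3.
mask = [[0] * 32 for _ in range(32)]
mask[20][3] = 1
indices = [region_index(mask, i) for i in range(4)]
```

Under this rule a single region pixel is enough to place a whole macroblock in the region, so the set of region macroblocks always covers the detected region.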
21. A coding controller responsive to an object detection signal for controlling the coding of a video signal representing a succession of frames, at least one of the frames including an image of an object, the object detection signal indicating a detected region of the object, the coding controller comprising:
- a processor for performing buffer rate modulation and responsive to the object detection signal for adjusting a quantizer step size used in the coding of the video signal to increase a rate of coding of the detected region of the object, the coding being based on quantization of discrete cosine transform (DCT) coefficients, wherein the buffer rate modulation is based on the following equation:
$$B_i = B_{i-1} + C_{\zeta(i-1)}(i-1) - \gamma_{\zeta(i)}\, t,$$
where $B_i$ and $B_{i-1}$ are output buffer occupancies prior to coding macroblocks $i$ and $i-1$, respectively, $t$ is an average target rate in bits per macroblock, $\zeta(i)$ is a region index function associating a position of macroblock $i$ with a region in which macroblock $i$ belongs, where a macroblock is defined to belong to a region if at least one of its pixels is inside that region, the function $C_{\zeta(i-1)}(i-1)$ is the number of bits spent to code the $(i-1)$st macroblock in a particular region associated with $\zeta(i-1)$ and immediately preceding overhead information thereof, and $\gamma$ is a parameter which is greater than one in object regions of the image associated with the region index function.
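The buffer occupancy recurrence of claim 21 can be traced with a short Python sketch. The function name `buffer_occupancies`, the list and dictionary inputs, the initial occupancy `b0`, and the numeric values are hypothetical and chosen only for illustration. The sketch evaluates the recurrence alone and makes no assumption about how the quantizer step size is then derived from the occupancy.

```python
def buffer_occupancies(bits_spent, regions, gamma, t, b0=0.0):
    """Evaluate B_i = B_{i-1} + C(i-1) - gamma[zeta(i)] * t for i = 1..n-1.

    bits_spent[k] -- bits spent coding macroblock k, including its overhead
    regions[k]    -- region index zeta(k) of macroblock k
    gamma         -- mapping from region index to the gamma parameter
    t             -- average target rate, in bits per macroblock
    b0            -- occupancy prior to coding macroblock 0 (assumed given)
    """
    occupancies = [b0]
    for i in range(1, len(bits_spent)):
        previous = occupancies[i - 1]
        occupancies.append(previous + bits_spent[i - 1] - gamma[regions[i]] * t)
    return occupancies


# Region 1 is the object region, with gamma greater than one as the claim requires.
gamma = {0: 1.0, 1: 2.0}
occupancies = buffer_occupancies(
    bits_spent=[100, 100, 100], regions=[0, 1, 0], gamma=gamma, t=50.0
)
```

Because the drain term is scaled by a factor greater than one for object-region macroblocks, the computed occupancy prior to coding such a macroblock is lower than it would be with a factor of one.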
Specification