Automatic face and facial feature location detection for low bit rate model-assisted H.261 compatible coding of video

US 5,852,669 A
Filed: 07/10/1995
Issued: 12/22/1998
Est. Priority Date: 04/06/1994
Status: Expired due to Term

Abstract

An apparatus responds to a video signal representing a succession of frames, where at least one of the frames corresponds to an image of an object, to detect at least a region of the object. The apparatus includes a processor for processing the video signal to detect at least the region of the object characterized by at least a portion of a closed curve and to generate a plurality of parameters associated with the closed curve for use in coding the video signal.

312 Citations

21 Claims

1. An apparatus for coding a video signal representing a succession of frames, at least one of the frames including an image of a human subject, the apparatus comprising:
- a processor for processing the video signal to detect at least a region of a head outline of the human subject characterized by at least a portion of an ellipse, and to generate a plurality of parameters associated with the ellipse for use in coding the video signal;
  
  wherein said processor is adapted to define a rectangular window encompassing an eyes-nose-mouth region of the subject as a function of the ellipse, and to determine an amount of slant of a symmetrical axis of the window with respect to a vertical axis of the image to enhance detection and image quality of the eyes-nose-mouth region of the subject.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9):
  - 2. The apparatus of claim 1 wherein the processor further includes:
    - a preprocessor for preprocessing the video signal to generate an edge data signal corresponding to an edge of the head outline; and
      
      an object detector for processing the edge data signal to generate the plurality of parameters.
  - 3. The apparatus of claim 1 further including:
    - a coding controller, responsive to an object detection signal relating to the parameters, for performing buffer size modulation to adjust a quantizer step size used in the coding of the video signal to increase a buffer size for controlling the coding of the detected region of the object.
  - 4. The apparatus of claim 1 further including:
    - a coding controller, responsive to an object detection signal relating to the parameters, for performing buffer rate modulation to adjust a quantizer step size used in the coding of the video signal to increase a rate of coding for controlling the coding of the detected region of the object.
  - 5. The apparatus of claim 1 wherein said processor further defines a trapezoidal region corresponding substantially to the eyes-nose-mouth region of the human subject, and further including a coding controller to refine quantization of image data in the trapezoidal region.
  - 6. The apparatus of claim 1 wherein said window is defined as having a common center with said ellipse.
  - 7. The apparatus of claim 1 wherein said processor includes means for detecting edge data of the eyes of the subject and said processor determines the amount of slant based upon said edge data.
  - 8. The apparatus of claim 7 wherein the processor defines an upper rectangular region, within the rectangular window, corresponding to at least the eyes of the subject to facilitate determination of a symmetry value for the window and of said amount of slant.
  - 9. The apparatus of claim 1 wherein the processor determines the amount of slant as one of a predetermined number of discrete slant angles.

10. A method for coding a video signal representing a succession of frames, at least one of the frames including an image of a human subject, the method comprising the steps of:
- detecting at least a region of a head outline of the human subject characterized by at least a portion of an ellipse;
  
  generating a plurality of parameters associated with the ellipse;
  
  coding the video signal using the plurality of parameters;
  
  defining a rectangular window encompassing an eyes-nose-mouth region of the subject as a function of the ellipse; and
  
  determining an amount of slant of a symmetrical axis of the window with respect to a vertical axis of the image to enhance detection and image quality of the eyes-nose-mouth region.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 18, 19):
  - 11. The method of claim 10 wherein the step of generating the plurality of parameters includes the step of:
    - preprocessing the video signal to generate an edge data signal corresponding to an edge of the region of the human subject; and
      
      wherein the step of detecting includes the step of processing the edge data signal to detect at least the region of the human subject characterized by at least a portion of a closed curve.
  - 12. The method of claim 10 further including the step of:
    - adjusting a quantizer step size in response to a human subject detection signal; and
      
      coding the video signal using the adjusted quantizer step size.
  - 13. The method of claim 12 further including the step of:
    - increasing a buffer size for coding of the detected region of the human subject.
  - 14. The method of claim 12 further including the step of:
    - increasing a rate of coding of the detected region of the human subject.
  - 15. The method of claim 10 further comprising the steps of:
    - defining a trapezoidal region corresponding substantially to the eyes-nose-mouth region of the human subject; and
      
      refining quantization of image data in the trapezoidal region.
  - 16. The method of claim 10 wherein said step of defining a rectangular window includes defining said window as having a common center with said ellipse.
  - 17. The method of claim 10, further comprising the step of detecting edge data of the eyes of the subject, and wherein the amount of slant is determined based on the detected edge data.
  - 18. The method of claim 17, further including the step of defining an upper rectangular region, within the rectangular window, corresponding to at least the eyes of the subject to facilitate determination of said amount of slant.
  - 19. The method of claim 10 further including determining the amount of slant as one of a predetermined number of discrete slant angles.
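
The window and slant steps recited in claims 10, 16, 18 and 19 can be illustrated with a minimal Python sketch. It is not the patented procedure: the claims only say the window is defined "as a function of the ellipse" and that the slant is one of a predetermined number of discrete angles. The names `window_from_ellipse`, `symmetry_score` and `pick_discrete_slant`, the size fractions, the candidate angle set and the mirror-counting symmetry measure are all assumptions made for illustration.

```python
import math

def window_from_ellipse(cx, cy, semi_minor, semi_major,
                        width_frac=0.8, height_frac=0.9):
    """Return (center_x, center_y, width, height) of a rectangular window
    sharing its center with the ellipse (as in claim 16).
    The two fractions are hypothetical; the claims do not specify them."""
    return (cx, cy, 2 * semi_minor * width_frac, 2 * semi_major * height_frac)

def symmetry_score(edge_pixels, cx, cy, angle_deg):
    """Hypothetical symmetry value: count edge pixels whose mirror image
    across an axis through (cx, cy), slanted by angle_deg from the vertical,
    is also an edge pixel."""
    theta = math.radians(angle_deg)
    ax, ay = math.sin(theta), math.cos(theta)  # unit vector along the axis
    pts = set(edge_pixels)
    count = 0
    for (x, y) in pts:
        dx, dy = x - cx, y - cy
        dot = dx * ax + dy * ay
        # Reflection of d across the axis direction a: r = 2*(d.a)*a - d
        rx, ry = 2 * dot * ax - dx, 2 * dot * ay - dy
        if (round(cx + rx), round(cy + ry)) in pts:
            count += 1
    return count

def pick_discrete_slant(edge_pixels, cx, cy,
                        candidate_angles_deg=(-10, -5, 0, 5, 10)):
    """Choose the slant from a fixed, discrete set of angles (claim 19)
    by maximizing the symmetry value; the angle set is an assumption."""
    return max(candidate_angles_deg,
               key=lambda a: symmetry_score(edge_pixels, cx, cy, a))

# Usage: four edge pixels placed symmetrically about the vertical line x = 5.
edges = [(3, 2), (7, 2), (4, 6), (6, 6)]
window = window_from_ellipse(50, 60, 20, 30)
slant = pick_discrete_slant(edges, 5, 5)
```

Restricting the search to a few candidate angles keeps the cost at one symmetry evaluation per angle, which is one plausible reason to quantize the slant; the source does not state its motivation.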

20. A coding controller responsive to an object detection signal for controlling the coding of a video signal representing a succession of frames, at least one of the frames including an image of an object, the object detection signal indicating a detected region of the object, the coding controller comprising:
- a processor for performing buffer size modulation and responsive to the object detection signal for adjusting a quantizer step size used in the coding of the video signal to increase a buffer size for coding of the detected region of the object, the coding being based on quantization of discrete cosine transform (DCT) coefficients, wherein said buffer size modulation is performed in accordance with the following equation;
  
  ##EQU15## where $Qp_i$ is an updated quantizer step size for a particular macroblock $i$, $B_i$ is an output buffer occupancy prior to coding macroblock $i$, $f$ is a function of the input video signal, $\mu_i$ is a modulation factor for macroblock $i$, and $\zeta(i)$ is a region index function that associates a position of macroblock $i$ with a region in which macroblock $i$ belongs, wherein a macroblock is defined to belong to a region if at least one of its pixels is inside that region.
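
The membership rule in claim 20 (a macroblock belongs to a region if at least one of its pixels is inside it) can be sketched as follows. This is an illustrative assumption rather than the patent's implementation: the region is taken to be an axis-aligned ellipse, the index values 0 (background) and 1 (object) are hypothetical, and `region_index` and `pixel_in_ellipse` are made-up names. The 16x16 macroblock size matches H.261 luminance macroblocks.

```python
def pixel_in_ellipse(x, y, cx, cy, a, b):
    """True if pixel (x, y) lies inside or on an axis-aligned ellipse
    with center (cx, cy) and semi-axes a (horizontal) and b (vertical)."""
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0

def region_index(mb_col, mb_row, ellipse, mb_size=16):
    """Return 1 if at least one pixel of the macroblock is inside the
    ellipse, else 0. The labels 0 and 1 are hypothetical."""
    cx, cy, a, b = ellipse
    x0, y0 = mb_col * mb_size, mb_row * mb_size
    for y in range(y0, y0 + mb_size):
        for x in range(x0, x0 + mb_size):
            if pixel_in_ellipse(x, y, cx, cy, a, b):
                return 1
    return 0

# Usage: an ellipse centered at (40, 40) with semi-axes 10 and 15.
face = (40, 40, 10, 15)
inside = region_index(2, 2, face)   # macroblock containing the center
border = region_index(1, 2, face)   # macroblock touching the left edge
outside = region_index(0, 0, face)  # macroblock far from the ellipse
```

Under the "at least one pixel" rule, macroblocks that straddle the boundary are assigned to the object region, so the region can only grow when it is snapped to the macroblock grid.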

21. A coding controller responsive to an object detection signal for controlling the coding of a video signal representing a succession of frames, at least one of the frames including an image of an object, the object detection signal indicating a detected region of the object, the coding controller comprising:
- a processor for performing buffer rate modulation and responsive to the object detection signal for adjusting a quantizer step size used in the coding of the video signal to increase a rate of coding of the detected region of the object, the coding being based on quantization of discrete cosine transform (DCT) coefficients, wherein the buffer rate modulation is based on the following equation;
  
  $$B_i = B_{i-1} + C_{\zeta(i-1)}(i-1) - \gamma_{\zeta(i)}\, t,$$
  
  where $B_i$ and $B_{i-1}$ are output buffer occupancies prior to coding macroblocks $i$ and $i-1$, respectively, $t$ is an average target rate in bits per macroblock, $\zeta(i)$ is a region index function associating a position of macroblock $i$ with a region in which macroblock $i$ belongs, where a macroblock is defined to belong to a region if at least one of its pixels is inside that region, the function $C_{\zeta(i-1)}(i-1)$ is the number of bits spent to code the $(i-1)$st macroblock in a particular region associated with $\zeta(i-1)$ and immediately preceding overhead information thereof, and $\gamma$ is a parameter which is greater than one in object regions of the image associated with the region index function.
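
The buffer occupancy recursion recited in claim 21 can be traced with a short Python sketch. The recursion itself follows the claim; the function name `occupancy_trace`, the region labels 0 and 1, and the numeric values of gamma, the target rate and the bit counts are assumptions chosen only to make the example concrete.

```python
def occupancy_trace(b0, bits_spent, regions, gamma_by_region, target_bits):
    """Apply B_i = B_{i-1} + C(i-1) - gamma[zeta(i)] * t for i = 1..n-1.

    bits_spent[k]   -- bits spent coding macroblock k (C in the claim)
    regions[k]      -- region index zeta(k) of macroblock k
    gamma_by_region -- gamma per region; greater than one in object regions
    target_bits     -- average target rate t in bits per macroblock
    Returns the list [B_0, B_1, ..., B_{n-1}].
    """
    trace = [b0]
    for i in range(1, len(regions)):
        drain = gamma_by_region[regions[i]] * target_bits
        trace.append(trace[-1] + bits_spent[i - 1] - drain)
    return trace

# Usage with hypothetical numbers: region 1 is the object region.
gamma = {0: 1.0, 1: 1.5}
trace = occupancy_trace(0, [100, 100, 100], [0, 1, 1], gamma, 100)
```

With gamma greater than one, the modeled drain is larger for object macroblocks, so the tracked occupancy falls even when each macroblock spends exactly the target number of bits. In a controller whose quantizer step size grows with occupancy (an assumption here, since the claim does not spell out that mapping), this would lead to finer quantization in the object region.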

Current Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Eleftheriadis, Alexandros; Jacquin, Arnaud Eric
Primary Examiner(s): Moore, David K.
Assistant Examiner(s): Werner, Brian P.
Application Number: US08/500,672
Time in Patent Office: 1,261 Days
Field of Search: 382/232, 382/239, 382/251, 382/118, 382/173, 382/236, 382/250, 382/291, 348/169, 348/172, 348/405, 348/419, 348/13, 348/14, 348/15, 348/16, 348/17, 348/18, 348/19, 348/26, 348/27, 348/390, 348/397, 348/404, 358/430, 395/200.34
US Class Current: 382/118
CPC Class Codes

G06V 40/161   Detection; Localisation; No...

G10L 19/012   Comfort noise or silence co...

H04N 19/10   using adaptive coding

H04N 19/107   between spatial and tempora...

H04N 19/115   Selection of the code volum...

H04N 19/126   Details of normalisation or...

H04N 19/132   Sampling, masking or trunca...

H04N 19/136   Incoming video signal chara...

H04N 19/146   Data rate or code amount at...

H04N 19/149   by estimating the code amou...

H04N 19/15   by monitoring actual compre...

H04N 19/152   by measuring the fullness o...

H04N 19/154   Measured or subjectively es...

H04N 19/17   the unit being an image reg...

H04N 19/176   the region being a block, e...

H04N 19/20   using video object coding

H04N 19/61   in combination with predict...

H04N 19/63   using sub-band based transf...

H04N 19/85   using pre-processing or pos...
