Automatic detection and tracking of multiple individuals using multiple cues
Abstract
Automatic detection and tracking of multiple individuals includes receiving a frame of video and/or audio content and identifying a candidate area for a new face region in the frame. One or more hierarchical verification levels are used to verify whether a human face is in the candidate area, and an indication is made that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area. A plurality of audio and/or video cues are used to track each verified face in the video content from frame to frame.
71 Claims
1. A method comprising:
receiving a frame of content;
automatically detecting a candidate area for a new face region in the frame;
using one or more hierarchical verification levels to verify whether a human face is in the candidate area;
indicating that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area; and
using a plurality of cues to track each verified face in the content from frame to frame.
Dependent claims: 2-30

31. A system to track multiple individuals in video content, the system comprising:
an auto-initialization module to detect a candidate region for a new face in a frame of the video content;
a hierarchical verification module to generate a confidence level for the candidate region; and
a multi-cue tracking module to use a plurality of visual cues to track previous candidate regions with confidence levels, generated by the hierarchical verification module, that exceeded a threshold value.
Dependent claims: 32-36
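
The hand-off from the hierarchical verification module to the multi-cue tracking module in claim 31 can be illustrated with a minimal Python sketch. It only shows the threshold test on confidence levels. The names `Candidate` and `select_for_tracking`, the box layout, and the score range are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A candidate face region plus the confidence level from verification."""
    box: Tuple[int, int, int, int]  # (x, y, width, height); layout is an assumption
    confidence: float               # score range [0, 1] is an assumption


def select_for_tracking(candidates: List[Candidate], threshold: float) -> List[Candidate]:
    """Keep only the candidates whose confidence level exceeded the threshold."""
    # "Exceeded" is read here as strictly greater than the threshold.
    return [c for c in candidates if c.confidence > threshold]
```

For example, with a threshold of 0.5, a candidate scored 0.9 would be passed on to tracking, while candidates scored 0.5 or 0.2 would not.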

37. One or more computer readable media having stored thereon a plurality of instructions that, when executed by one or more processors, causes the one or more processors to:
receive an indication of an area of a frame of video content;
use a first verification process to determine whether a human head is in the area; and
if the first verification process verifies that the human head is in the area, then indicate the area includes a face, and otherwise use a second verification process to determine whether the human head is in the area.
Dependent claims: 38-43
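
The control flow of claim 37 can be sketched in Python as a two-level fallback. The verification processes themselves are passed in as callables, because the claim does not say how either one works. The name `verify_area`, the `Area` tuple layout, and the choice to treat the second process's answer as final are assumptions made for illustration.

```python
from typing import Callable, Tuple

Area = Tuple[int, int, int, int]   # (x, y, width, height); layout is an assumption
Verifier = Callable[[Area], bool]  # returns True if a human head is found in the area


def verify_area(area: Area, first: Verifier, second: Verifier) -> Tuple[bool, str]:
    """Run the first verification process; fall back to the second only if it fails.

    Returns (includes_face, level), where level names the process that decided.
    """
    if first(area):
        # The first process verified the head, so the area is indicated as a face.
        return True, "first"
    # Otherwise the second process is consulted. Treating its result as the
    # final answer is an assumption; the claim only says it is used.
    return second(area), "second"
```

A caller might pass a cheap check as `first` and a costlier one as `second`, so that the second process runs only for areas the first could not verify.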

44. One or more computer readable media having stored thereon a plurality of instructions to detect a candidate region for an untracked face in a frame of content, wherein the plurality of instructions, when executed by one or more processors, causes the one or more processors to:
detect whether there is motion in the frame;
if there is motion in the frame, then perform motion-based initialization to identify the candidate region;
detect whether there is audio in the frame;
if there is audio in the frame, then perform audio-based initialization to identify the candidate region; and
if there is neither motion in the frame nor audio in the frame, then use a fast face detector to identify the candidate region.
Dependent claims: 45, 46
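
The branching in claim 44 can be sketched as a small dispatch function in Python. The motion and audio tests are independent, so both initializers may run on the same frame, and the fast face detector is used only when neither cue is present. The name `find_candidates`, the callable parameters, and the decision to pool the motion-based and audio-based results into one list are hypothetical; the claim does not state how results from the two initializers are combined.

```python
from typing import Callable, List, Tuple

Region = Tuple[int, int, int, int]       # (x, y, width, height); layout is an assumption
Initializer = Callable[[], List[Region]]  # returns candidate regions for the current frame


def find_candidates(
    has_motion: bool,
    has_audio: bool,
    motion_init: Initializer,
    audio_init: Initializer,
    fast_face_detector: Initializer,
) -> List[Region]:
    """Choose the initialization path(s) for a frame based on the available cues."""
    candidates: List[Region] = []
    if has_motion:
        candidates.extend(motion_init())
    if has_audio:
        # Pooling with the motion-based results is an assumption of this sketch.
        candidates.extend(audio_init())
    if not has_motion and not has_audio:
        # Neither cue is present, so fall back to the fast face detector.
        candidates.extend(fast_face_detector())
    return candidates
```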

47. One or more computer readable media having stored thereon a plurality of instructions to track faces from frame to frame of content, wherein the plurality of instructions, when executed by one or more processors, causes the one or more processors to:
predict, using a plurality of cues, where a contour of a face will be in a frame;
encode a smoothness constraint that penalizes roughness;
apply the smoothness constraint to a plurality of possible contour locations; and
select the contour location having the smoothest contour as the location of the face in the frame.
Dependent claims: 48-57
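
The last three steps of claim 47 can be illustrated with a minimal Python sketch. The claim does not say how roughness is measured, so this sketch swaps in one simple choice, the sum of squared second differences around a closed contour, purely as an assumption. The cue-based prediction step is not modeled; the candidate contours are taken as given. The names `roughness` and `select_smoothest` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def roughness(contour: Sequence[Point]) -> float:
    """Roughness penalty: sum of squared second differences around a closed contour.

    This particular penalty is an assumption of the sketch, not the patent's definition.
    """
    n = len(contour)
    total = 0.0
    for i in range(n):
        px, py = contour[i - 1]          # previous point (wraps around at i == 0)
        cx, cy = contour[i]              # current point
        nx, ny = contour[(i + 1) % n]    # next point (wraps around at the end)
        ddx = px - 2 * cx + nx
        ddy = py - 2 * cy + ny
        total += ddx * ddx + ddy * ddy
    return total


def select_smoothest(candidates: Sequence[Sequence[Point]]) -> int:
    """Return the index of the candidate contour with the lowest roughness penalty."""
    if not candidates:
        raise ValueError("no candidate contours")
    return min(range(len(candidates)), key=lambda i: roughness(candidates[i]))
```

Under this penalty, a unit square scores lower than the same square with one corner pulled outward, so the undistorted square would be selected.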

58. A method for tracking an object along frames of content, the method comprising:
using a plurality of cues to track the object.
Dependent claims: 59, 60

61. A method for tracking an object along frames of content, the method comprising:
predicting where the object will be in a frame;
encoding a smoothness constraint that penalizes roughness;
applying the smoothness constraint to a plurality of possible object locations; and
selecting the object location having the smoothest contour as the location of the object in the frame.
Dependent claims: 62-71
Specification