Method and apparatus for real-time face-tracking and face-pose-selection on embedded vision systems
First Claim
1. A method for performing real-time face-pose-estimation and best-pose selection for a detected person captured in a video, the method comprising:
receiving a video image among a sequence of video frames of a video;
performing a face detection operation on the video image to detect a set of faces in the video image;
detecting a new person appears in the video based on the set of detected faces;
tracking the new person through subsequent video images in the video by detecting a sequence of face images of the new person in the subsequent video images;
for each of the subsequent video images which contains a detected face of the new person being tracked:
estimating a pose associated with the detected face; and
updating a best pose for the new person based on the estimated pose; and
upon detecting that the new person has disappeared from the video, transmitting a detected face of the new person corresponding to the current best pose to a server, wherein transmitting the detected face having the best pose among the sequence of detected face images reduces network bandwidth and improves storage efficiency.
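The per-person bookkeeping recited in the claim (estimate a pose, update the best pose, transmit once on disappearance) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `FaceObservation`, `BestPoseSelector`, the `send` callback standing in for the server upload, and in particular the scoring rule, which assumes that "best" means closest to frontal, measured as the sum of absolute yaw and pitch angles.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FaceObservation:
    """One detected face of a tracked person in one video image (hypothetical type)."""
    person_id: int
    yaw: float    # estimated head yaw in degrees, 0 = facing the camera
    pitch: float  # estimated head pitch in degrees, 0 = facing the camera
    crop: bytes   # encoded face image


def pose_score(obs: FaceObservation) -> float:
    """Assumed criterion: smaller total deviation from frontal is better."""
    return abs(obs.yaw) + abs(obs.pitch)


class BestPoseSelector:
    """Keeps one best face per tracked person and sends it when the person leaves."""

    def __init__(self, send: Callable[[FaceObservation], None]) -> None:
        self._send = send  # stand-in for the transmission to the server
        self._best: Dict[int, FaceObservation] = {}

    def update(self, obs: FaceObservation) -> None:
        """Replace the stored best pose if this observation scores better."""
        current = self._best.get(obs.person_id)
        if current is None or pose_score(obs) < pose_score(current):
            self._best[obs.person_id] = obs

    def person_disappeared(self, person_id: int) -> None:
        """Transmit the single best face, then forget the person."""
        best = self._best.pop(person_id, None)
        if best is not None:
            self._send(best)


def demo() -> List[FaceObservation]:
    """Feed three poses of one person, then signal disappearance."""
    sent: List[FaceObservation] = []
    selector = BestPoseSelector(sent.append)
    for yaw, pitch in [(40.0, 10.0), (5.0, -3.0), (25.0, 0.0)]:
        selector.update(FaceObservation(1, yaw, pitch, b"face-bytes"))
    selector.person_disappeared(1)
    return sent
```

In this sketch only one face image per person ever reaches `send`, regardless of how many frames the person appeared in, which is the mechanism the claim credits for the bandwidth and storage savings.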
Abstract
Embodiments described herein provide various examples of a real-time face-detection, face-tracking, and face-pose-selection subsystem within an embedded video system. In one aspect, a process for performing real-time face-pose-estimation and best-pose selection for a detected person captured in a video is disclosed. This process includes the steps of: receiving a video image among a sequence of video frames of a video; performing a face detection operation on the video image to detect a set of faces in the video image; detecting a new person appears in the video based on the set of detected faces; tracking the new person through subsequent video images in the video by detecting a sequence of face images of the new person in the subsequent video images; and for each of the subsequent video images which contains a detected face of the new person being tracked: estimating a pose associated with the detected face and updating a best pose for the new person based on the estimated pose. Upon detecting that the new person has disappeared from the video, the process then transmits a detected face of the new person corresponding to the current best pose to a server, wherein transmitting the detected face having the best pose among the sequence of detected face images reduces network bandwidth and improves storage efficiency.
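The abstract does not say how a newly appeared person or a disappeared person is recognized from the per-frame detections. One common way to do this, shown here purely as an assumption and not as the disclosed method, is to associate each detected face box with an existing track by intersection-over-union (IoU) and to declare a track gone after it has been unmatched for several consecutive frames. The names `SimpleTracker`, `iou_threshold`, and `max_missed` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class SimpleTracker:
    """Greedy IoU association (an assumed technique, not the patent's tracker)."""

    def __init__(self, iou_threshold: float = 0.3, max_missed: int = 5) -> None:
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed
        self.tracks: Dict[int, Box] = {}   # track id -> last known face box
        self.missed: Dict[int, int] = {}   # track id -> consecutive unmatched frames
        self._next_id = 0

    def step(self, detections: List[Box]) -> Tuple[List[int], List[int]]:
        """Process one frame; return (ids of new persons, ids of disappeared persons)."""
        unmatched = set(self.tracks)
        new_ids: List[int] = []
        for box in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid in unmatched:
                score = iou(self.tracks[tid], box)
                if score >= best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:
                # No existing track overlaps enough: treat as a new person.
                best_id = self._next_id
                self._next_id += 1
                new_ids.append(best_id)
            else:
                unmatched.discard(best_id)
            self.tracks[best_id] = box
            self.missed[best_id] = 0
        disappeared: List[int] = []
        for tid in unmatched:
            self.missed[tid] += 1
            if self.missed[tid] > self.max_missed:
                disappeared.append(tid)
                del self.tracks[tid]
                del self.missed[tid]
        return new_ids, disappeared
```

Waiting for more than `max_missed` unmatched frames before reporting a disappearance is a design choice in this sketch: it tolerates brief detection dropouts at the cost of a short delay before the best-pose face would be transmitted.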
20 Claims
1. A method for performing real-time face-pose-estimation and best-pose selection for a detected person captured in a video, the method comprising:
receiving a video image among a sequence of video frames of a video;
performing a face detection operation on the video image to detect a set of faces in the video image;
detecting a new person appears in the video based on the set of detected faces;
tracking the new person through subsequent video images in the video by detecting a sequence of face images of the new person in the subsequent video images;
for each of the subsequent video images which contains a detected face of the new person being tracked:
estimating a pose associated with the detected face; and
updating a best pose for the new person based on the estimated pose; and
upon detecting that the new person has disappeared from the video, transmitting a detected face of the new person corresponding to the current best pose to a server, wherein transmitting the detected face having the best pose among the sequence of detected face images reduces network bandwidth and improves storage efficiency.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A system for performing real-time face-pose-estimation and best-pose selection for a detected person captured in a video, the system comprising:
a receiving module configured to receive a video image among a sequence of video frames of a video;
a face detection module configured to:
perform a face detection operation on the video image to detect a set of faces in the video image; and
detect a new person appears in the video based on the set of detected faces;
a face tracking module configured to track the new person through subsequent video images in the video by detecting a sequence of face images of the new person in the subsequent video images; and
a face-pose-selection module configured to, for each of the subsequent video images which contains a detected face of the new person being tracked:
estimate a pose associated with the detected face;
update a best pose for the new person based on the estimated pose; and
upon detecting that the new person has disappeared from the video, transmit a detected face of the new person corresponding to the current best pose to a server, wherein transmitting the detected face having the best pose among the sequence of detected face images reduces network bandwidth and improves storage efficiency.
19. An embedded system capable of performing real-time face-pose-estimation and best-pose selection for a detected person captured in a video, the embedded system comprising:
a processor;
a memory coupled to the processor;
an image capturing device coupled to the processor and the memory and configured to capture a video;
a receiving module configured to receive a video image among a sequence of video frames of a video;
a face detection module configured to:
perform a face detection operation on the video image to detect a set of faces in the video image; and
detect a new person appears in the video based on the set of detected faces;
a face tracking module configured to track the new person through subsequent video images in the video by detecting a sequence of face images of the new person in the subsequent video images; and
a face-pose-selection module configured to, for each of the subsequent video images which contains a detected face of the new person being tracked:
estimate a pose associated with the detected face;
update a best pose for the new person based on the estimated pose; and
upon detecting that the new person has disappeared from the video, transmit a detected face of the new person corresponding to the current best pose to a server, wherein transmitting the detected face having the best pose among the sequence of detected face images reduces network bandwidth and improves storage efficiency.
Dependent claims: 20
Specification