Face recognition in video content

US 8,494,231 B2
Filed: 11/01/2010
Issued: 07/23/2013
Est. Priority Date: 11/01/2010
Status: Active Grant

4 Assignments

0 Petitions

Abstract

The subject disclosure relates to face recognition in video. Face detection data in frames of input data are used to generate face galleries, which are labeled and used in recognizing faces throughout the video. Metadata that associates the video frame and the face are generated and maintained for subsequent identification. Faces other than those found by face detection may be found by face tracking, in which facial landmarks found by the face detection are used to track a face over previous and/or subsequent video frames. Once generated, the maintained metadata may be accessed to efficiently determine the identity of a person corresponding to a viewer-selected face.

17 Claims

1. In a computing environment, a method performed at least in part on at least one processor, comprising, receiving face detection data corresponding to a face detected in an input video frame, building face galleries, including grouping faces detected in input video frames into candidate groups based upon similarity data, filtering at least some faces from a candidate group, adding remaining faces to one of the face galleries, and labeling each face gallery with the face identification data, matching the face detection data against face identification data maintained in a face gallery among a plurality of face galleries to recognize the face in the input video frame, and generating metadata that associates the video frame and the face with the face identification data.
- Dependent claims (2, 3, 4, 5, 6, 7)
  - 2. The method of claim 1 further comprising, tracking the face that is detected in the input video frame over one or more subsequent frames, and generating metadata that associates each of the one or more subsequent frames with the face identification data.
  - 3. The method of claim 1 further comprising, tracking the face that is detected in the input video frame over one or more previous frames, and generating metadata that associates each of the one or more previous frames with the face identification data.
  - 4. The method of claim 1 further comprising, dividing a candidate group into at least two candidate groups based upon the similarity data.
  - 5. The method of claim 1 further comprising, combining two or more candidate groups into a single candidate group based upon the similarity data.
  - 6. The method of claim 1 wherein filtering at least some of the faces from the candidate group comprises discarding a candidate group if any face in that group appears to not be of the same person.
  - 7. The method of claim 1 further comprising, receiving a request to identify a viewer-selected face, the request associated with a video frame number, accessing the metadata to determine whether face identification data exists for that viewer-selected face, and if so, returning information corresponding to the face identification data in response to the request.
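
The steps recited in claims 1 and 7 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Face` record, the Euclidean distance on a made-up feature vector, the greedy grouping rule, the mean-based outlier filter, and the `labeler` callback are all assumptions chosen only to make the flow concrete (group, filter, label, match, generate metadata, then look up a viewer-selected face by frame number and position).

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Face:
    frame: int      # video frame number
    box: tuple      # (x, y, width, height) of the face in the frame
    feature: tuple  # hypothetical feature vector from a face detector


def group_faces(faces, threshold):
    """Greedy grouping: a face joins the first group whose seed face is close enough."""
    groups = []
    for face in faces:
        for group in groups:
            if dist(face.feature, group[0].feature) <= threshold:
                group.append(face)
                break
        else:
            groups.append([face])
    return groups


def filter_group(group, threshold):
    """Drop faces that lie far from the group's mean feature (one possible filter)."""
    size = len(group[0].feature)
    mean = tuple(sum(f.feature[i] for f in group) / len(group) for i in range(size))
    return [f for f in group if dist(f.feature, mean) <= threshold]


def build_galleries(faces, labeler, threshold):
    """Group, filter, and label faces; `labeler` stands in for an external naming step."""
    galleries = {}
    for group in group_faces(faces, threshold):
        kept = filter_group(group, threshold)
        if kept:
            galleries.setdefault(labeler(kept), []).extend(kept)
    return galleries


def recognize(face, galleries, threshold):
    """Return the label of the nearest gallery face within threshold, else None."""
    best_label, best_distance = None, threshold
    for label, members in galleries.items():
        for member in members:
            d = dist(face.feature, member.feature)
            if d <= best_distance:
                best_label, best_distance = label, d
    return best_label


def generate_metadata(faces, galleries, threshold):
    """Map frame number -> list of (box, label) for every recognized face."""
    metadata = {}
    for face in faces:
        label = recognize(face, galleries, threshold)
        if label is not None:
            metadata.setdefault(face.frame, []).append((face.box, label))
    return metadata


def identify(metadata, frame, x, y):
    """Answer a viewer request: who is at point (x, y) in the given frame?"""
    for (bx, by, bw, bh), label in metadata.get(frame, []):
        if bx <= x < bx + bw and by <= y < by + bh:
            return label
    return None
```

Keying the metadata by frame number keeps the viewer lookup to a single dictionary access followed by a scan of the few faces in that frame, which is one way to read the "efficiently determine the identity" language of the abstract.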

8. In a computing environment, a system comprising, a face recognition pipeline that recognizes faces from input video, including a face grouping module configured to group faces into groups by similarity based upon face detection data provided by a face detection module data, the grouping module further configured to provide face galleries corresponding to the groups, including information that identifies each person associated with a face in a face gallery, the face recognition pipeline further comprising a face recognition mechanism that matches faces in the input video with faces in the face galleries to output information corresponding to recognized faces in the input video, wherein the input video comprises a full set of episodes, wherein the face grouping module groups faces using a lesser subset of the episodes, and wherein the face recognition mechanism matches faces for the full set of episodes.
- Dependent claims (9, 10, 11, 12, 13, 14)
  - 9. The system of claim 8 further comprising a face tracking module configured to track a face in one or more frames adjacent a frame for which the face detection module data provided face detection data.
  - 10. The system of claim 9 wherein the face tracking module is configured to estimate and track facial landmarks in the one or more adjacent frames to track the face.
  - 11. The system of claim 9 wherein the face recognition mechanism matches faces tracked by the face tracking module to output at least some of the information corresponding to the recognized faces.
  - 12. The system of claim 8 wherein the information corresponding to recognized faces in the input video comprises metadata from which a person in a show or movie at a given frame and location in that frame is identifiable.
  - 13. The system of claim 12 further comprising a mechanism configured to access the metadata to identify a person given a show or movie, a given frame and a location in that frame.
  - 14. The system of claim 8 wherein the face detection module provides the data to the face grouping module for a sampling of less than the set of available frames of the input video.
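
Claims 8 and 14 describe building galleries from a smaller subset of episodes (and a sampling of frames) while matching over the full set. A rough Python sketch of that control flow follows. The function names `sample_frames` and `run_pipeline`, the fixed-step sampling rule, and the `build_galleries` and `match` callbacks are hypothetical placeholders, not anything named in the patent.

```python
from typing import Callable, Dict, List, Sequence


def sample_frames(frames: Sequence, step: int) -> List:
    """Keep every step-th frame so grouping sees fewer than all available frames."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(frames[::step])


def run_pipeline(
    episodes: Dict[str, Sequence],
    grouping_episodes: Sequence[str],
    step: int,
    build_galleries: Callable,
    match: Callable,
) -> Dict[str, list]:
    """Build galleries from a subset of episodes, then match every episode."""
    # Only the chosen episodes, and only sampled frames, feed the grouping stage.
    grouping_input = [
        frame
        for name in grouping_episodes
        for frame in sample_frames(episodes[name], step)
    ]
    galleries = build_galleries(grouping_input)
    # Recognition runs over the full set of episodes and all of their frames.
    return {name: match(frames, galleries) for name, frames in episodes.items()}
```

Passing the grouping and matching stages in as callables keeps the sketch independent of any particular detector or similarity measure.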

15. One or more computer-readable storage media having computer-executable instructions, which when executed perform steps, comprising:
  receiving face detection data corresponding to a face detected in an input video frame, wherein the face detection data corresponds to similarity data;

  tracking the face in one or more adjacent video frames based on at least some of the face detection data to acquire a tracked face; and

  utilizing the similarity data to determine whether the tracked face matches a threshold level of similarity; and

  maintaining the tracked face in a face gallery in a single candidate group, wherein the tracked face from each video frame among the one or more adjacent video frames is maintained in the face gallery.
- Dependent claims (16, 17)
  - 16. The one or more computer-readable storage media of claim 15 wherein upon detecting an undetected face in the one or more adjacent video frames, a reverse order tracking of the one or more adjacent video frames is conducted, wherein the reverse order tracking comprises analyzing all video frames that appear prior to the video frame in which the undetected face is located.
  - 17. The one or more computer-readable storage media of claim 15 having further computer-executable instructions comprising, generating metadata that associates face identification data with the input video frame and each adjacent video frame.
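
The tracking language in claims 2, 3, 15 and 16 can be pictured as following a face forward and backward from the frame where it was detected, keeping it only while it stays within a similarity threshold. The Python sketch below makes several assumptions that are not in the patent: each frame is a list of candidate points (for example, a landmark position), similarity is Euclidean distance, and `track`, `track_both_ways` and `max_move` are illustrative names. Unlike the wording of claim 16, which analyzes all prior frames, this simplified version stops in each direction as soon as the track is lost.

```python
from math import dist


def track(frames, start, seed, max_move, step):
    """Follow a face from frame `start` in direction `step` (+1 forward, -1 backward)."""
    tracked = {}
    current = seed
    i = start + step
    while 0 <= i < len(frames):
        candidates = frames[i]
        if not candidates:
            break  # nothing to follow in this frame
        nearest = min(candidates, key=lambda c: dist(c, current))
        if dist(nearest, current) > max_move:
            break  # below the similarity threshold: treat the track as lost
        tracked[i] = nearest
        current = nearest
        i += step
    return tracked


def track_both_ways(frames, start, seed, max_move):
    """Track into subsequent frames, then into previous frames (reverse order)."""
    result = {start: seed}
    result.update(track(frames, start, seed, max_move, +1))
    result.update(track(frames, start, seed, max_move, -1))
    return dict(sorted(result.items()))
```

Every frame index in the returned mapping would correspond to one tracked face that could be kept in the same candidate group as the originally detected face.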

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Corporation
Inventors
Folta, Florin O.; He, Yaming; Hor, King Wei; Shilotri, Minesh G.; Spears, Stacey; Gu, Chuang
Primary Examiner(s)
Mehta, Bhavesh
Assistant Examiner(s)
Harandi, Siamak

Application Number
US12/916,895
Publication Number
US 20120106806A1
Time in Patent Office
995 Days
Field of Search
None
US Class Current
382/118
CPC Class Codes
G06V 40/173 face re-identification, e.g...
G06V 40/179 metadata assisted face reco...
