Video retrieval system for human face content
Abstract
A method and apparatus for video retrieval and cueing that automatically detects human faces in the video and identifies face-specific video frames so as to allow retrieval and viewing of person-specific video segments. In one embodiment, the method locates human faces in the video, stores the time stamps associated with each face, displays a single image associated with each face, matches each face against a database, computes face locations with respect to a common 3D coordinate system, and provides a means of displaying: 1) information retrieved from the database associated with a selected person or people, 2) the path of travel associated with a selected person or people, 3) an interaction graph of the people in the video, and 4) video segments associated with each person and/or face. The method may also provide the ability to input and store text annotations associated with each person, face, and video segment, and the ability to enroll people in and remove people from the database. Videos of non-human objects may be processed in a similar manner. Because of the rules governing abstracts, this abstract should not be used to construe the claims.
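The abstract mentions an interaction graph of the people in the video but does not say how it is built. The following is a minimal sketch under one stated assumption: two people are counted as interacting whenever their faces are detected in the same frame. The function name `interaction_graph`, the labels, and the time stamps are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def interaction_graph(frame_faces):
    """Count how often each pair of face labels shares a frame.

    frame_faces maps a frame time stamp (seconds) to the set of face
    labels detected in that frame. Co-occurrence in a frame is an
    ASSUMED definition of "interaction"; the source does not define it.
    """
    edges = Counter()
    for labels in frame_faces.values():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(labels), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical detections: time stamp -> labels of faces seen in that frame.
frames = {
    0.0: {"A"},
    0.5: {"A", "B"},
    1.0: {"A", "B"},
    1.5: {"B", "C"},
}
graph = interaction_graph(frames)
```

Edge weights here are simple frame counts; a real system might weight by duration or by proximity in the common 3D coordinate system instead.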
39 Claims
1. A method for processing video data, comprising:
detecting human faces in a plurality of video frames in said video data;
for at least one detected human face, identifying a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
enabling a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
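To illustrate how a face-specific set of frames that is not temporally continuous could still yield viewable segments, here is a minimal sketch. It assumes each detected face already has a list of frame indices in which it appears; the names `frames_to_segments` and `max_gap`, and the sample data, are hypothetical and not taken from the patent.

```python
def frames_to_segments(frame_indices, max_gap=1):
    """Group frame indices into (first_frame, last_frame) segments.

    Frames whose indices differ by at most max_gap are merged into the
    same segment; larger gaps start a new segment. The grouping rule is
    an illustrative assumption, not the claimed method.
    """
    segments = []
    for idx in sorted(set(frame_indices)):
        if segments and idx - segments[-1][1] <= max_gap:
            segments[-1][1] = idx          # extend the current segment
        else:
            segments.append([idx, idx])    # start a new segment
    return [tuple(s) for s in segments]


# Hypothetical face-specific frame sets; face_1 appears in three
# separate stretches of the video.
face_frames = {"face_1": [3, 4, 5, 40, 41, 90], "face_2": [4, 5, 6]}
segments = {face: frames_to_segments(idxs) for face, idxs in face_frames.items()}
```

Raising `max_gap` would tolerate brief detection dropouts (for example, a face turning away for a few frames) at the cost of merging appearances that are genuinely separate.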
19. A method for processing video data, comprising:
detecting human faces in a plurality of video frames in said video data;
indicating one or more unmatched human faces in said detected human faces based on a comparison of said detected human faces against a plurality of human face images stored in a database; and
tracking at least one unmatched human face across said video data by locating a face-specific set of video frames therefor irrespective of whether said unmatched human face is present in said face-specific set of video frames in a substantially temporally continuous manner.
Dependent claims: 20, 21, 22
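The comparison step in claim 19 could be sketched as follows. This assumes, purely for illustration, that each face is represented by a numeric feature vector and that a face counts as unmatched when its nearest database entry lies farther away than a distance threshold. The claim does not specify a representation or a matching rule; `find_unmatched`, the threshold value, and the sample vectors are hypothetical.

```python
import math


def find_unmatched(detected, database, threshold=0.5):
    """Return ids of detected faces with no database entry within threshold.

    detected and database map an id to a feature vector (tuple of floats).
    Euclidean distance and the threshold are ASSUMPTIONS for this sketch.
    """
    unmatched = []
    for face_id, vec in detected.items():
        # Distance to the closest enrolled face; infinity if the database is empty.
        best = min((math.dist(vec, ref) for ref in database.values()),
                   default=math.inf)
        if best > threshold:
            unmatched.append(face_id)
    return unmatched


# Hypothetical enrolled faces and detections.
database = {"alice": (0.0, 0.0), "bob": (1.0, 1.0)}
detected = {"f1": (0.1, 0.0), "f2": (3.0, 3.0), "f3": (0.9, 1.1)}
unmatched = find_unmatched(detected, database)
```

The unmatched ids could then be tracked across the video by collecting the frames in which each one appears, whether or not those frames are contiguous.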
23. A method for processing video data, comprising:
detecting objects in a plurality of video frames in said video data;
for at least one detected object, identifying an object-specific set of video frames irrespective of whether said detected object is present in said object-specific set of video frames in a substantially temporally continuous manner; and
enabling a user to view object-specific video segments in said video data based on said object-specific set of video frames identified.
24. A method, comprising:
receiving video data from a user over a data communication network;
detecting human faces in a plurality of video frames in said video data;
for at least one detected human face, identifying a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner;
identifying those portions of said video data wherein said at least one detected human face is present; and
sending cueing information for said portions of said video data to said user over said data communication network so as to enable said user to selectively view face-specific video segments in said video data associated with said at least one detected human face without a need to search said video data for said video segments.
Dependent claims: 25, 26, 27, 28
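Claim 24 does not define a format for the cueing information. One possible shape, sketched below, is a JSON message listing start and end times for each portion of the video in which the face is present. The function `build_cue_message`, the field names, and the frame rate are all hypothetical assumptions made for illustration.

```python
import json


def build_cue_message(face_id, segments, fps):
    """Serialize face-specific segments as cue points in seconds.

    segments is a list of (first_frame, last_frame) pairs, inclusive.
    The JSON layout is an assumed format, not one given in the claims.
    """
    cues = [
        # The end time is the end of the last frame, hence the +1.
        {"start_s": start / fps, "end_s": (end + 1) / fps}
        for start, end in segments
    ]
    return json.dumps({"face_id": face_id, "cues": cues})


# Hypothetical segments for one face in a 30 frames-per-second video.
message = build_cue_message("face_1", [(30, 59), (300, 329)], fps=30.0)
decoded = json.loads(message)
```

A client receiving such a message could seek directly to each `start_s` value, which matches the stated goal of viewing the segments without searching the video.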
29. A data storage medium containing a program code, which, when executed by a processor, causes said processor to perform the following:
receive video data;
detect human faces in a plurality of video frames in said video data;
for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
enable a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
Dependent claims: 30, 31, 32
33. A system for processing video data, comprising:
means for detecting human faces in a plurality of video frames in said video data;
for at least one detected human face, means for identifying a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
means for displaying face-specific video segments in said video data based on said face-specific set of video frames identified.
Dependent claims: 34
35. A computer system, which, upon being programmed, is configured to perform the following:
receive video data;
detect human faces in a plurality of video frames in said video data;
for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
enable a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
36. A system for processing video data, comprising:
a computing unit; and
a data storage medium containing a program code, which, when executed by said computing unit, causes said computing unit to perform the following;
receive video data;
detect human faces in a plurality of video frames in said video data;
for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
enable a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
Dependent claims: 37
38. A system for processing video data, comprising:
a video data source connected to a communication network, wherein said video data source is configured to transmit video data over said communication network; and
a computing unit in communication with said video data source and connected to said communication network, wherein said computing unit is configured to perform the following;
receive said video data from said video data source transmitted over said communication network;
detect human faces in a plurality of video frames in said video data;
for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
send cueing information for said face-specific set of video frames to said user over said data communication network so as to enable said user to selectively view face-specific video segments in said video data associated with said at least one detected human face without a need to search said video data for said video segments.
Dependent claims: 39
Specification