Video retrieval system for human face content

US 20080080743A1
Filed: 09/29/2006
Published: 04/03/2008
Est. Priority Date: 09/29/2006
Status: Active Grant

3 Assignments

0 Petitions

Abstract

A method and apparatus for video retrieval and cueing that automatically detects human faces in the video and identifies face-specific video frames so as to allow retrieval and viewing of person-specific video segments. In one embodiment, the method locates human faces in the video, stores the time stamps associated with each face, displays a single image associated with each face, matches each face against a database, computes face locations with respect to a common 3D coordinate system, and provides a means of displaying: 1) information retrieved from the database associated with a selected person or people, 2) path of travel associated with a selected person or people, 3) interaction graph of people in video, 4) video segments associated with each person and/or face. The method may also provide the ability to input and store text annotations associated with each person, face, and video segment, and the ability to enroll and remove people from database. The videos of non-human objects may be processed in a similar manner. Because of the rules governing abstracts, this abstract should not be used to construe the claims.
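As an illustration only, the step of storing time stamps per face and turning them into viewable segments might be sketched as follows. The record does not specify any data structures; the function names, the `(timestamp, face_id)` input format, and the `max_gap` threshold are all assumptions made for this sketch.

```python
from collections import defaultdict

def face_specific_frames(detections):
    """Map each face label to the sorted time stamps at which it was detected."""
    frames = defaultdict(list)
    for timestamp, face_id in detections:
        frames[face_id].append(timestamp)
    return {face: sorted(set(ts)) for face, ts in frames.items()}

def to_segments(timestamps, max_gap=1.0):
    """Merge sorted time stamps into (start, end) segments.

    A gap larger than max_gap (a hypothetical tuning value, in seconds)
    starts a new segment, so one face may own several disjoint segments.
    """
    segments = []
    for t in timestamps:
        if segments and t - segments[-1][1] <= max_gap:
            segments[-1] = (segments[-1][0], t)
        else:
            segments.append((t, t))
    return segments

# Hypothetical detector output: (time stamp in seconds, face label).
detections = [(0.0, "A"), (0.5, "A"), (0.5, "B"), (1.0, "A"),
              (7.0, "A"), (7.5, "A"), (3.0, "B")]
frames = face_specific_frames(detections)
segments = {face: to_segments(ts) for face, ts in frames.items()}
```

Face "A" here appears in two separate stretches of the video, which is the non-continuous case the frame set is meant to cover.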

180 Citations

39 Claims

1. A method for processing video data, comprising:
- detecting human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, identifying a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
  
  enabling a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
  - 2. The method of claim 1, further comprising grouping all video frames in each said face-specific set of video frames to facilitate viewing of said face-specific video segments.
  - 3. The method of claim 2, wherein said grouping is carried out in a temporally sequential manner based on respective time stamps associated with said video frames in each said face-specific set of video frames.
  - 4. The method of claim 2, further comprising:
    - displaying a representative image for said grouped video frames.
  - 5. The method of claim 2, further comprising:
    - allowing said user to manually associate respective grouped video frames in said face-specific set of video frames with an image entry stored in a database.
  - 6. The method of claim 2, further comprising:
    - allowing said user to manually override a match between respective grouped video frames in said face-specific set of video frames and an image entry stored in a database.
  - 7. The method of claim 2, further comprising:
    - matching all grouped video frames with image entries stored in a database; and
      
      enrolling unmatched grouped video frames into said database through corresponding image entries.
  - 8. The method of claim 1, further comprising:
    - indicating one or more unmatched human faces in said detected human faces based on a comparison of said detected human faces against a plurality of human face images stored in a database; and
      
      enabling a user to view those face-specific video segments wherein said one or more unmatched human faces are present.
  - 9. The method of claim 1, further comprising:
    - displaying a representative image for at least one video frame in said face-specific set of video frames for said at least one detected human face.
  - 10. The method of claim 9, further comprising:
    - enabling said user to view said face-specific video segments using said representative image as a link therefor.
  - 11. The method of claim 9, further comprising:
    - retrieving a textual description for said face-specific video segments from a database; and
      
      displaying said textual description along with said representative image.
  - 12. The method of claim 1, further comprising:
    - enabling said user to input a textual description of said face-specific video segments associated with said at least one detected human face.
  - 13. The method of claim 1, wherein said identifying includes using face recognition to identify said face-specific set of video frames for said at least one detected human face.
  - 14. The method of claim 1, further comprising:
    - automatically displaying said face-specific video segments upon identification of said face-specific set of video frames for said at least one detected human face.
  - 15. The method of claim 1, further comprising:
    - determining movement of said at least one detected human face in said face-specific video segments associated therewith using a three-dimensional coordinate system.
  - 16. The method of claim 15, further comprising:
    - displaying said movement of said at least one detected human face with respect to a map.
  - 17. The method of claim 1, further comprising:
    - displaying a co-occurrence of two human faces in said plurality of video frames as a link graph.
  - 18. The method of claim 17, wherein said link graph includes a plurality of dimensionally-weighted links.
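The co-occurrence link graph of claims 17 and 18 could be approximated by counting the frames in which two faces appear together. This is a rough sketch, not the patented method: the claims do not define "dimensionally-weighted" in this record, so using the shared-frame count as the link weight is an assumption, as are the function name and input format.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_graph(frame_faces):
    """Return {(face_a, face_b): weight} for faces seen in the same frame.

    The weight is the number of frames shared by the pair (an assumed
    weighting; a viewer might draw it as link thickness).
    """
    links = Counter()
    for faces in frame_faces:
        # Sort and de-duplicate so each pair has one canonical key per frame.
        for pair in combinations(sorted(set(faces)), 2):
            links[pair] += 1
    return dict(links)

# Hypothetical per-frame face labels.
frame_faces = [["A", "B"], ["A", "B", "C"], ["C"], ["B", "A"]]
graph = cooccurrence_graph(frame_faces)
```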

19. A method for processing video data, comprising:
- detecting human faces in a plurality of video frames in said video data;
  
  indicating one or more unmatched human faces in said detected human faces based on a comparison of said detected human faces against a plurality of human face images stored in a database; and
  
  tracking at least one unmatched human face across said video data by locating a face-specific set of video frames therefor irrespective of whether said unmatched human face is present in said face-specific set of video frames in a substantially temporally continuous manner.
  - 20. The method of claim 19, wherein said tracking is performed in real time.
  - 21. The method of claim 19, further comprising:
    - automatically displaying face-specific video segments associated with said at least one unmatched human face based on said face-specific set of video frames located therefor.
  - 22. The method of claim 19, wherein said face-specific set of video frames is located using face recognition, and wherein said method further comprises:
    - grouping all video frames in said face-specific set of video frames located for said at least one unmatched human face; and
      
      displaying a representative image for at least one video frame in said face-specific set of video frames.
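One way to picture the "unmatched" determination in claim 19 is a nearest-neighbor comparison against stored reference faces. The claims do not say how faces are compared, so the numeric embeddings, the Euclidean distance, the threshold value, and every name below are assumptions chosen for illustration.

```python
import math

def match_face(embedding, database, threshold=0.6):
    """Return the closest database name, or None if none is within threshold."""
    best_name, best_dist = None, float("inf")
    for name, reference in database.items():
        dist = math.dist(embedding, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

def unmatched_faces(detected, database, threshold=0.6):
    """List detected face ids that have no database match."""
    return [face_id for face_id, emb in detected.items()
            if match_face(emb, database, threshold) is None]

# Hypothetical 2-D embeddings; real face descriptors would be much larger.
database = {"alice": (0.0, 0.0), "bob": (1.0, 1.0)}
detected = {"face_1": (0.1, 0.0), "face_2": (5.0, 5.0)}
unmatched = unmatched_faces(detected, database)
```

The unmatched ids could then be tracked across the video using the same per-face frame sets as matched faces.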

23. A method for processing video data, comprising:
- detecting objects in a plurality of video frames in said video data;
  
  for at least one detected object, identifying an object-specific set of video frames irrespective of whether said detected object is present in said object-specific set of video frames in a substantially temporally continuous manner; and
  
  enabling a user to view object-specific video segments in said video data based on said object-specific set of video frames identified.

24. A method, comprising:
- receiving video data from a user over a data communication network;
  
  detecting human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, identifying a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner;
  
  identifying those portions of said video data wherein said at least one detected human face is present; and
  
  sending cueing information for said portions of said video data to said user over said data communication network so as to enable said user to selectively view face-specific video segments in said video data associated with said at least one detected human face without a need to search said video data for said video segments.
  - 25. The method of claim 24, further comprising:
    - indicating one or more unmatched human faces in said detected human faces;
      
      identifying only those portions of said video data wherein said one or more unmatched human faces are present; and
      
      sending cueing information for only said video portions associated with said one or more unmatched human faces to said user over said data communication network.
  - 26. The method of claim 24, wherein said cueing information includes said face-specific video segments associated with only those of said detected human faces that are unmatched based on a database query.
  - 27. The method of claim 24, further comprising:
    - charging a fee to said user for sending said cueing information.
  - 28. The method of claim 24, wherein said data communication network is the Internet.
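The cueing information of claim 24 might, for example, be serialized as a list of per-face start and end times that a client uses to seek directly to the relevant segments. The payload layout, field names, and video identifier below are hypothetical; the record does not describe a wire format.

```python
import json

def build_cueing_info(video_id, segments_by_face):
    """Serialize per-face (start, end) segments into a JSON cue list."""
    cues = [
        {"face": face, "start": start, "end": end}
        for face, segments in sorted(segments_by_face.items())
        for start, end in segments
    ]
    return json.dumps({"video": video_id, "cues": cues})

# Hypothetical segments in seconds, keyed by face label.
payload = build_cueing_info(
    "clip-001", {"A": [(0.0, 1.0), (7.0, 7.5)], "B": [(3.0, 3.0)]}
)
decoded = json.loads(payload)
```

Sending only cue points rather than the video itself keeps the response small, which suits the networked arrangement the claim describes.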

29. A data storage medium containing a program code, which, when executed by a processor, causes said processor to perform the following:
- receive video data;
  
  detect human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
  
  enable a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
  - 30. The data storage medium of claim 29, wherein said program code, upon execution by said processor, causes said processor to further perform the following:
    - indicate one or more unmatched human faces in said detected human faces based on a comparison of said detected human faces against a plurality of human face images stored in a database; and
      
      track at least one unmatched human face across said video data in substantially real time through said face-specific set of video frames therefor.
  - 31. The data storage medium of claim 30, wherein said program code, upon execution by said processor, causes said processor to further perform the following:
    - automatically display face-specific video segments associated with said at least one unmatched human face based on said face-specific set of video frames therefor.
  - 32. The data storage medium of claim 30, wherein said program code, upon execution by said processor, causes said processor to further perform the following:
    - display a cueing link for said face-specific set of video frames associated with said at least one unmatched human face so as to enable said user to view only those face-specific video segments in said video data wherein said at least one unmatched human face appears without requiring said user to search said video data for said video segments of said at least one unmatched human face.

33. A system for processing video data, comprising:
- means for detecting human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, means for identifying a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
  
  means for displaying face-specific video segments in said video data based on said face-specific set of video frames identified.
  - 34. The system of claim 33, further comprising:
    - means for indicating one or more unmatched human faces in said detected human faces;
      
      means for identifying those portions of said video data wherein said one or more unmatched human faces are present; and
      
      means for automatically displaying face-specific video segments in said video data associated with said one or more unmatched human faces based on said video data portions identified for said one or more unmatched human faces.

35. A computer system, which, upon being programmed, is configured to perform the following:
- receive video data;
  
  detect human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
  
  enable a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.

36. A system for processing video data, comprising:
- a computing unit; and
  
  a data storage medium containing a program code, which, when executed by said computing unit, causes said computing unit to perform the following;
  
  receive video data;
  
  detect human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
  
  enable a user to view face-specific video segments in said video data based on said face-specific set of video frames identified.
  - 37. The system of claim 36, further comprising:
    - a video data source to provide said video data, wherein said video data source is one of the following;
      
      a portion of said computing unit configured to record said video data; and
      
      a video camera coupled to said computing unit.

38. A system for processing video data, comprising:
- a video data source connected to a communication network, wherein said video data source is configured to transmit video data over said communication network; and
  
  a computing unit in communication with said video data source and connected to said communication network, wherein said computing unit is configured to perform the following;
  
  receive said video data from said video data source transmitted over said communication network;
  
  detect human faces in a plurality of video frames in said video data;
  
  for at least one detected human face, identify a face-specific set of video frames irrespective of whether said detected human face is present in said face-specific set of video frames in a substantially temporally continuous manner; and
  
  send cueing information for said face-specific set of video frames to said user over said data communication network so as to enable said user to selectively view face-specific video segments in said video data associated with said at least one detected human face without a need to search said video data for said video segments.
  - 39. The system of claim 38, wherein said video data source is at least one of the following:
    - a computing unit having a built-in means to record said video data;
      
      a video camera; and
      
      a computing unit having said video data stored therein prior to transmission over said communication network.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Pittsburgh Pattern Recognition, Inc. (Alphabet Inc.)
Inventors: Rodriguez, Uriel G.; Brandy, Louis D.; Nechyba, Michael C.; Schneiderman, Henry

Granted Patent: US 7,881,505 B2
US Class Current: 382/118
CPC Class Codes
- G06F 16/784: the detected or recognised ...
- G06V 40/173: face re-identification, e.g...
- G08B 13/196: using television cameras
- G11B 27/105: of operating discs
- G11B 27/28: by using information signal...
- G11B 27/3027: used signal is digitally coded
- G11B 27/34: Indicating arrangements in...
