Keyframe selection to represent a video

US 6,711,587 B1
Filed: 09/05/2000
Issued: 03/23/2004
Est. Priority Date: 09/05/2000
Status: Expired due to Term

Abstract

A key frame representative of a sequence of frames in a video file is selected by applying face detection to the video, so that the selected key frame may include people. The technique has particular application to indexing video files located by a search engine web crawler. A key frame, one frame representative of a video file, is extracted from the sequence of frames. The sequence of frames may include multiple scenes or shots, for example, continuous motions relative to a camera separated by transitions, cuts, fades and dissolves. To extract a key frame, face detection is performed in each frame, and a key frame is selected from the sequence of frames based on a sum of detected faces in the frame.

29 Claims

1. A method of extracting a single representative key frame from a sequence of frames, the sequence of frames including a plurality of shots, comprising the steps of:
- performing face detection in the sequence of frames comprising the steps of;
  
  creating a set of images for each frame in the sequence of frames with each image in the set of images smaller than the previous image; and
  
  searching for faces having at least a minimum size in a selected portion of the set of images;
  
  detecting shot boundaries in the sequence of frames to identify shots within the detected shot boundaries;
  
  selecting a most interesting shot from the identified shots based on a number of detected faces in the shot; and
  
  selecting the single representative key frame representative of the sequence of frames from the selected shot based on a number of detected faces in the frame.
2. The method of claim 1 wherein the selected portion of the set of images is based on the minimum size face to be detected.
3. The method as claimed in claim 1 wherein the images are smaller by the same scale factor.
4. The method as claimed in claim 3 further comprising the step of:
5. The method as claimed in claim 1 further comprising the step of:
- tracking overlap of a detected face in consecutive frames in order to filter detected faces which are not likely to be valid.
6. The method as claimed in claim 1 wherein the step of selecting a most interesting shot includes providing a shot score based on a set of measures selected from the group consisting of motion between frames, amount of skin color pixels, shot length and detected faces.
7. The method as claimed in claim 6 wherein each measure includes a respective weighting factor.
8. The method as claimed in claim 7 wherein the weighting factor is dependent on the level of confidence of the measure.
9. The method as claimed in claim 1 wherein the step of performing face detection uses a neural network-based algorithm.
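
The image set described in claims 1 to 3 can be illustrated with a minimal sketch. It assumes a fixed-size detector window, so a face found at a smaller image corresponds to a larger face in the original frame, and only the images whose window maps to at least the minimum face size need to be searched. The scale factor of 1.2, the 20-pixel window, and the function names are assumptions made for illustration; the claims do not state them.

```python
def pyramid_sizes(width, height, scale=1.2, window=20):
    """Return (level, width, height) for each image in the set.

    Level 0 is the original frame; every later image is smaller than the
    previous one by the same factor `scale`. The set ends once a
    `window` x `window` detector no longer fits in the image.
    """
    sizes = []
    level = 0
    w, h = width, height
    while w >= window and h >= window:
        sizes.append((level, w, h))
        level += 1
        w = int(width / scale ** level)
        h = int(height / scale ** level)
    return sizes


def levels_to_search(sizes, min_face, scale=1.2, window=20):
    """Select the portion of the image set worth searching.

    At level k the fixed detector window covers window * scale**k pixels
    of the original frame, so levels below the minimum face size are skipped.
    """
    return [lvl for (lvl, _w, _h) in sizes if window * scale ** lvl >= min_face]


if __name__ == "__main__":
    sizes = pyramid_sizes(320, 240)
    print(len(sizes), "images in the set")
    print("levels searched for faces of 40+ pixels:",
          levels_to_search(sizes, min_face=40))
```

Raising the minimum face size shrinks the searched portion toward the smallest images, which is where most of the savings come from, since those images contain the fewest pixels.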

10. An apparatus for extracting a single representative key frame from a sequence of frames comprising:
- means for performing face detection in the sequence of frames, the means for performing comprising;
  
  means for creating a set of images for the frame with each image in the set of images smaller than the previous image; and
  
  means for searching for faces having at least a minimum size in a selected portion of the set of images;
  
  means for detecting shot boundaries in the sequence of frames to identify shots within shot boundaries;
  
  means for selecting a most interesting shot from the identified shots based on a number of detected faces in the shot; and
  
  means for selecting the single representative key frame representative of the sequence of frames from the selected shot based on a number of detected faces in the frame.
11. The apparatus as claimed in claim 10 wherein the selected portion of the set of images is based on the minimum size face to be detected.
12. The apparatus as claimed in claim 10 wherein the images are smaller by the same scale factor.
13. The apparatus as claimed in claim 12 further comprising:
14. The apparatus as claimed in claim 10 further comprising:
- means for tracking overlap of a detected face in consecutive frames to filter detected faces which are not likely to be valid.
15. The apparatus as claimed in claim 10 wherein the means for selecting a most interesting shot comprises:
- means for providing a shot score based on a set of measures selected from the group consisting of motion between frames, amount of skin color pixels, shot length and detected faces.
16. The apparatus as claimed in claim 15 wherein each measure includes a respective weighting factor.
17. The apparatus as claimed in claim 16 wherein the weighting factor is dependent on the level of confidence of the measure.
18. The apparatus as claimed in claim 10 wherein the means for performing face detection uses a neural network-based algorithm.
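
The overlap tracking of claims 5 and 14 can be sketched as a filter that keeps a detected face only when it overlaps a detection in a neighbouring frame. The overlap measure (intersection over union), the 0.5 threshold, the `(x, y, w, h)` box layout, and the function names are assumptions for illustration; the claims only require that overlap in consecutive frames be tracked.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def filter_by_overlap(detections, threshold=0.5):
    """Drop detections that have no overlapping detection in an adjacent frame.

    `detections` holds one list of boxes per frame, in frame order.
    A face that appears in a single frame only is treated as unlikely
    to be valid and is removed.
    """
    kept = []
    for i, boxes in enumerate(detections):
        neighbours = []
        if i > 0:
            neighbours += detections[i - 1]
        if i + 1 < len(detections):
            neighbours += detections[i + 1]
        kept.append([b for b in boxes
                     if any(iou(b, n) >= threshold for n in neighbours)])
    return kept


if __name__ == "__main__":
    frames = [
        [(10, 10, 20, 20)],
        [(12, 10, 20, 20), (100, 100, 20, 20)],  # second box is isolated
        [(11, 11, 20, 20)],
    ]
    print(filter_by_overlap(frames))
```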

19. An apparatus for extracting a single representative key frame from a sequence of frames comprising:
- a face detector which performs face detection in the sequence of frames the face detector including;
  
  an image creator which creates a set of images for the frame with each image in the set of images smaller than the previous image; and
  
  a face searcher which searches for faces having at least a minimum size in a selected portion of the set of images; and
  
  a key frame selector which selects a key frame representative of the sequence of frames from the sequence of frames based on a number of detected faces in the frame.
20. The apparatus as claimed in claim 19 wherein the selected portion of the set of images is based on the size of the face to be detected.
21. The apparatus as claimed in claim 19 wherein the images are smaller by the same scale factor.
22. The apparatus as claimed in claim 21 further comprising:
23. The apparatus as claimed in claim 19 further comprising:
- a face tracker which tracks a detected face through consecutive frames to filter detected faces which are not likely to be valid.
24. The apparatus as claimed in claim 19 wherein the key shot detector comprises:
- a shot score generator which generates a shot score for based on a set of measures selected from the group consisting of motion between frames, amount of skin color pixels, shot length and detected faces.
25. The apparatus as claimed in claim 24 wherein each measure includes a respective weighting factor.
26. The apparatus as claimed in claim 25 wherein the weighting factor is dependent on the level of confidence of the measure.
27. The apparatus as claimed in claim 19 wherein the face detector uses a neural network-based algorithm.
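
The shot score of claims 24 to 26 (and the parallel claims 6 to 8 and 15 to 17) can be sketched as a weighted sum over the four named measures. The linear combination, the normalisation of each measure to the range 0 to 1, the example weights, and the function names are assumptions for illustration; the claims state only that each measure has a weighting factor that depends on the confidence of the measure.

```python
MEASURES = ("motion", "skin_pixels", "shot_length", "faces")


def shot_score(measures, weights):
    """Weighted sum of the per-shot measures.

    `measures` and `weights` are dicts keyed by the names in MEASURES.
    A measure judged more reliable gets a larger weight.
    """
    return sum(weights[name] * measures[name] for name in MEASURES)


def most_interesting_shot(shots, weights):
    """Return the index of the shot with the highest score (earliest wins ties)."""
    return max(range(len(shots)), key=lambda i: shot_score(shots[i], weights))


if __name__ == "__main__":
    # Hypothetical weights: face detection is trusted most.
    weights = {"motion": 0.5, "skin_pixels": 1.0, "shot_length": 0.5, "faces": 2.0}
    shots = [
        {"motion": 0.8, "skin_pixels": 0.1, "shot_length": 0.5, "faces": 0.0},
        {"motion": 0.2, "skin_pixels": 0.6, "shot_length": 0.4, "faces": 1.0},
    ]
    print("most interesting shot:", most_interesting_shot(shots, weights))
```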

28. A computer system comprising:
- a memory system storing a sequence of frames; and
  
  a face detector which performs face detection in the sequence of frames, the face detector comprising;
  
  an image creator which creates a set of images for the frame with each image in the set of images smaller than the previous image; and
  
  a face searcher which searches for faces having at least a minimum size in a selected portion of the set of images;
  
  a shot boundary detector which detects shot boundaries to identify shots within the detected shot boundaries; and
  
  a key shot selector which selects a most interesting shot from the identified shots based on a number of detected faces in the shot; and
  
  a key frame selector which selects the single representative key frame representative of the sequence of frames from the selected shot based on a number of detected faces in the frame.

29. An article of manufacture comprising:
- a computer-readable medium for use in a computer having a memory;
  
  a computer-implementable software program recorded on the medium for extracting a single representative key frame from a sequence of frames, the sequence of frames including a plurality of shots, the computer implemented software program comprising instructions for;
  
  performing face detection in the sequence of frames comprising the steps of;
  
  creating a set of images for each frame in the sequence of frames with each image in the set of images smaller than the previous image; and
  
  searching for faces having at least a minimum size in a selected portion of the set of images;
  
  detecting shot boundaries in the sequence of frames to identify shots within the detected shot boundaries;
  
  selecting a most interesting shot from the identified shots based on a number of detected faces in the shot; and
  
  selecting the single representative key frame representative of the sequence of frames from the selected shot based on a number of detected faces in the frame.
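
The last two steps shared by the independent claims, choosing a shot and then a frame within it, can be sketched as follows. The sketch takes face detection and shot boundary detection as already done and works on their outputs. Representing boundaries as the index of the first frame of each new shot, scoring a shot by its total face count, breaking ties in favour of the earliest candidate, and the function names are all assumptions for illustration.

```python
def split_into_shots(num_frames, boundaries):
    """Convert boundary indices into (start, end) frame ranges, end exclusive.

    `boundaries` lists, in ascending order, the index of the first frame
    of each new shot.
    """
    starts = [0] + [b for b in boundaries if 0 < b < num_frames]
    ends = starts[1:] + [num_frames]
    return list(zip(starts, ends))


def select_key_frame(face_counts, boundaries):
    """Pick one frame index to represent the whole sequence.

    `face_counts[i]` is the number of faces detected in frame i and must
    be non-empty. The shot with the most detected faces is chosen first,
    then the frame with the most faces inside that shot.
    """
    shots = split_into_shots(len(face_counts), boundaries)
    start, end = max(shots, key=lambda s: sum(face_counts[s[0]:s[1]]))
    return max(range(start, end), key=lambda i: face_counts[i])


if __name__ == "__main__":
    counts = [0, 1, 0, 2, 3, 2, 0, 0]   # faces detected per frame
    print("key frame index:", select_key_frame(counts, boundaries=[3, 6]))
```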

Current Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Original Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Inventors: Dufaux, Frederic
Primary Examiner(s): Mirzahi, Diane D.
Assistant Examiner(s): MOFIZ, APU M
Application Number: US09/654,302
Time in Patent Office: 1,295 Days
Field of Search: 707/6, 707/10, 707/104.1, 382/103, 382/118
US Class Current: 1/1
CPC Class Codes:
G06F 16/71   Indexing; Data structures t...
G06F 16/739   in form of a video summary,...
G06F 16/7834   using audio features
G06V 20/40   in video content extracting...
Y10S 707/99936   Pattern matching access
Y10S 707/99945   Object-oriented database st...
Y10S 707/99948   Application of database or ...
