Method and system for video segmentation

US 20080124042A1
Filed: 11/07/2006
Published: 05/29/2008
Est. Priority Date: 11/07/2006
Status: Active Grant

First Claim

1. A computer implemented method for segmenting a video, in which the video includes video content and audio content, and the video content and the audio content are synchronized, comprising the steps of:

classifying each frame of audio content of a video with a label to generate a sequence of consecutive labels;

assigning a dominant label to each successive time interval of consecutive labels, in which a length of the time interval is substantially longer than a length of the frame;

constructing a semantic description for sliding time windows of the successive time intervals, in which the sliding time windows overlap in time and a length of each time window is substantially longer than the length of the time interval, and the semantic description for each time window is a transition matrix determined from transitions between the successive dominant labels of the time intervals;

determining a marker from the transition matrices, in which a frequency of occurrence of the marker is between a low frequency threshold and a high frequency threshold; and

segmenting the video at the locations of the markers in the audio content.
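
The interval and window steps of the claim can be illustrated with a minimal sketch. It assumes that frame labels are already available as small integers, that the dominant label of an interval is its most frequent frame label, and that a transition matrix simply counts label-to-label transitions. The function names and the `frames_per_interval`, `window`, and `step` parameters are hypothetical and do not come from the patent.

```python
from collections import Counter

import numpy as np


def dominant_labels(frame_labels, frames_per_interval):
    """Return the most frequent label of each successive, non-overlapping interval."""
    dominant = []
    for start in range(0, len(frame_labels), frames_per_interval):
        chunk = frame_labels[start:start + frames_per_interval]
        dominant.append(Counter(chunk).most_common(1)[0][0])
    return dominant


def transition_matrix(labels, num_labels):
    """Count transitions (self-transitions included) between successive labels."""
    matrix = np.zeros((num_labels, num_labels), dtype=float)
    for src, dst in zip(labels[:-1], labels[1:]):
        matrix[src, dst] += 1
    return matrix


def sliding_transition_matrices(dominant, num_labels, window, step):
    """Build one transition matrix per window; step < window makes windows overlap."""
    return [
        transition_matrix(dominant[start:start + window], num_labels)
        for start in range(0, len(dominant) - window + 1, step)
    ]


# Hypothetical input: 12 frame labels, 3 frames per interval, 3 label classes.
frame_labels = [0, 0, 1, 1, 1, 0, 2, 2, 2, 0, 0, 0]
dominant = dominant_labels(frame_labels, frames_per_interval=3)
windows = sliding_transition_matrices(dominant, num_labels=3, window=3, step=1)
```

With `step` smaller than `window`, neighbouring windows share intervals, which is the overlap the claim requires. Real interval and window lengths would be much longer than in this toy input.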

Abstract

A method segments a video. Audio frames of the video are classified with labels. Dominant labels are assigned to successive time intervals of consecutive labels. A semantic description is constructed for sliding time windows of the successive time intervals, in which the sliding time windows overlap in time, and the semantic description for each time window is a transition matrix determined from the dominant labels of the time intervals. A marker is determined from the transition matrices, in which a frequency of occurrence of the marker is between a low frequency threshold and a high frequency threshold. Then, the video is segmented at the locations of the markers.

15 Citations

13 Claims

1. A computer implemented method for segmenting a video, in which the video includes video content and audio content, and the video content and the audio content are synchronized, comprising the steps of:
- classifying each frame of audio content of a video with a label to generate a sequence of consecutive labels;
  
  assigning a dominant label to each successive time interval of consecutive labels, in which a length of the time interval is substantially longer than a length of the frame;
  
  constructing a semantic description for sliding time windows of the successive time intervals, in which the sliding time windows overlap in time and a length of each time window is substantially longer than the length of the time interval, and the semantic description for each time window is a transition matrix determined from transitions between the successive dominant labels of the time intervals;
  
  determining a marker from the transition matrices, in which a frequency of occurrence of the marker is between a low frequency threshold and a high frequency threshold; and
  
  segmenting the video at the locations of the markers in the audio content.
- - 2. The method of claim 1, further comprising:
    - constructing the transition matrix for the audio content of the entire video.
  - 3. The method of claim 1, further comprising:
    - constructing a histogram from each transition matrix.
  - 4. The method of claim 3, further comprising:
    - constructing the histogram for the audio content of the entire video.
  - 5. The method of claim 1, in which a number of transitions in the transition histogram of the entire video is equal to a number of labels in the audio content of the entire video.
  - 6. The method of claim 5, in which the transitions include self-transitions between the successive dominant labels.
  - 7. The method of claim 1, further comprising:
    - determining a transition difference at every time instant where the marker occurs in the video.
  - 8. The method of claim 7, further comprising:
    - comparing the transition difference to a first threshold Th₁ to indicate a segmentation boundary in the video.
  - 9. The method of claim 1, further comprising:
    - normalizing the transition matrices.
  - 10. The method of claim 1, further comprising:
    - determining a semantic difference Diff_semantic for each time window at a time t_b before a current time t₀ of a current time window and each time window at a time t_a after the current time t₀ as ${Diff}_{semantic} = \frac{1}{2} \sum_{i = 1}^{M} \frac{{[T_{a} (i, j) - T_{b} (i, j)]}^{2}}{T_{a} (i, j) + T_{b} (i, j)},$ where T_a(i,j) and T_b(i,j) are the transition matrices for the time windows at time t_a and time t_b;
      
      comparing Diff_semantic to a second threshold Th₂ to indicate a segmentation boundary in the video.
  - 12. The method of claim 1, in which the segmenting further comprises:
    - determining a transition difference Diff_transition for a time span t_b before a current time instant t₀ and for a time span t_a after the time t₀, for a time span t_m associated with the marker, according to ${Diff}_{transition} = \frac{T_{m} (i_{m}, j) \times t_{m}}{T_{b} (i_{m}, j) \times t_{b} + T_{a} (i_{m}, j) \times t_{a}},$ where i_m is the marker, and T_a, T_b, and T_m are the transition matrices for the time periods t_a, t_b, and t_m, respectively;
      
      comparing the transition difference with a threshold; and
      
      selecting a current time as a boundary when the transition difference is greater than the threshold and is a local maximum.
  - 13. The method of claim 1, in which the marker is moderately dispersed throughout the video.
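
The two difference measures of claims 10 and 12 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the semantic difference is read as a chi-square-style distance whose denominator is T_a + T_b and whose sum runs over every matrix entry (the claim writes the sum over i only), zero-denominator entries are skipped, and the boundary test ignores the first and last positions. All function and parameter names are hypothetical.

```python
import numpy as np


def semantic_difference(t_a, t_b):
    """Chi-square-style distance between two transition matrices (assumed form)."""
    t_a = np.asarray(t_a, dtype=float)
    t_b = np.asarray(t_b, dtype=float)
    denom = t_a + t_b
    mask = denom > 0  # skip entries where both matrices are zero
    return 0.5 * np.sum((t_a[mask] - t_b[mask]) ** 2 / denom[mask])


def transition_difference(t_m, t_a, t_b, span_m, span_a, span_b, marker, j):
    """Diff_transition for marker row `marker` and column `j`, following claim 12."""
    numerator = t_m[marker][j] * span_m
    denominator = t_b[marker][j] * span_b + t_a[marker][j] * span_a
    if denominator == 0:
        # Assumption: treat an undefined ratio as infinite if anything was observed.
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def select_boundaries(diffs, threshold):
    """Indices where the difference exceeds the threshold and is a local maximum."""
    return [
        k
        for k in range(1, len(diffs) - 1)
        if diffs[k] > threshold and diffs[k] > diffs[k - 1] and diffs[k] >= diffs[k + 1]
    ]
```

Comparing `semantic_difference` against Th₂, or passing a sequence of `transition_difference` values to `select_boundaries`, mirrors the thresholding steps recited in the claims.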

11. The method of claim 1, in which the low frequency threshold is about one in three, and the high frequency threshold is about one in a hundred.
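
A frequency filter of this kind might be sketched as below. The patent text here does not say how the frequency of occurrence is measured, so the sketch assumes it is the share of all transitions in a whole-video transition matrix that leave a given label. It also treats the two stated values, about one in three and about one in a hundred, simply as the two ends of an accepted range. The function name and parameters are hypothetical.

```python
import numpy as np


def candidate_markers(global_matrix, bound_a=1 / 3, bound_b=1 / 100):
    """Return labels whose relative frequency lies strictly between the two bounds."""
    counts = np.asarray(global_matrix, dtype=float).sum(axis=1)  # transitions leaving each label
    total = counts.sum()
    if total == 0:
        return []
    freq = counts / total
    lo, hi = min(bound_a, bound_b), max(bound_a, bound_b)
    return [int(i) for i in np.flatnonzero((freq > lo) & (freq < hi))]


# Hypothetical whole-video matrix with label frequencies 0.7, 0.2, 0.095, 0.005.
markers = candidate_markers(np.diag([700, 200, 95, 5]))
```

Labels that dominate the audio and labels that almost never occur are both rejected, leaving labels that are moderately dispersed through the video.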

Current Assignee
Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Original Assignee
Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Inventors
Divakaran, Ajay; Goela, Naveen; Niu, Feng

Granted Patent

US 8,107,541 B2
US Class Current

386/52
CPC Class Codes

G06F 16/7834   using audio features

G06V 20/40   in video content extracting...

G10L 25/48   specially adapted for parti...
