Method for segmenting 3D objects from compressed videos

US 7,142,602 B2
Filed: 05/21/2003
Issued: 11/28/2006
Est. Priority Date: 05/21/2003
Status: Expired due to Fees

First Claim

1. A method for segmenting a three dimensional object from a compressed video, the compressed video including a plurality of frames separated in time, and each frame including a plurality of macro-blocks separated in space, comprising:

parsing transformed coefficients for each macro-block;

determining a spatial/temporal gradient for each macro-block based on the transformed coefficients;

selecting a particular macro-block with a minimum spatial/temporal gradient magnitude as a seed macro-block;

measuring distances between the seed macro-block and spatially and temporally adjacent macro-blocks based on the transformed coefficients; and

growing a volume around the seed macro-block using the adjacent macro-blocks having distances less than a predetermined threshold.
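
The seed selection and volume growing steps of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `grow_volume`, the 6-neighbour adjacency, the Euclidean distance, and the toy gradient used in the usage lines are all assumptions made for the example.

```python
from collections import deque

import numpy as np


def grow_volume(features, grad_mag, threshold):
    """Grow one volume around the minimum-gradient seed macro-block.

    features: array (M, N, T, K) of per-macro-block coefficients.
    grad_mag: array (M, N, T) of spatial/temporal gradient magnitudes.
    Returns (seed_index, boolean mask of the grown volume).
    """
    # Seed: the macro-block with the minimum gradient magnitude.
    seed = np.unravel_index(np.argmin(grad_mag), grad_mag.shape)
    seed_vec = features[seed]

    mask = np.zeros(grad_mag.shape, dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    # Assumed adjacency: left/right, up/down, previous/next frame.
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

    while queue:
        m, n, t = queue.popleft()
        for dm, dn, dt in offsets:
            q = (m + dm, n + dn, t + dt)
            if not all(0 <= q[i] < grad_mag.shape[i] for i in range(3)):
                continue  # outside the video
            if mask[q]:
                continue  # already in the volume
            # Assumed distance: Euclidean norm between coefficient vectors.
            if np.linalg.norm(features[q] - seed_vec) < threshold:
                mask[q] = True
                queue.append(q)
    return seed, mask


# Toy data: 4x4 macro-blocks, 3 frames, 1 coefficient; a flat 2x2 region in every frame.
features = np.full((4, 4, 3, 1), 10.0)
features[:2, :2, :, 0] = 1.0
# Toy gradient for the example only: sum of absolute finite differences.
grad_mag = sum(np.abs(g) for g in np.gradient(features[..., 0]))
seed, volume = grow_volume(features, grad_mag, threshold=0.5)
```

In this toy input the grown volume covers the flat 2x2 region across all three frames, because only those macro-blocks lie within the threshold of the seed.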

Abstract

A method segments a video into objects without user assistance. An MPEG compressed video is converted to a structure called pseudo spatial/temporal data using DCT coefficients and motion vectors. The compressed video is first parsed and the pseudo spatial/temporal data are formed. Seed macro-blocks are identified using, e.g., the DCT coefficients and changes in the motion vectors of the macro-blocks.

A video volume is “grown” around each seed macro-block using the DCT coefficients and motion distance criteria. Self-descriptors are assigned to the volume, and mutual descriptors are assigned to pairs of similar volumes. These descriptors capture motion and spatial information of the volumes. Similarity scores are determined for each possible pair-wise combination of volumes. The pair of volumes that gives the largest score is combined iteratively. In the combining stage, volumes are classified and represented in a multi-resolution coarse-to-fine hierarchy of video objects.
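
The iterative pair-wise combining described above can be sketched as a greedy merge that records its history as a tree. This is only an illustration under stated assumptions: `merge_volumes` and `neg_distance` are hypothetical names, and the similarity score here uses just the mean feature vector of each volume, whereas the abstract describes scores built from self and mutual descriptors.

```python
import itertools

import numpy as np


def merge_volumes(mean_features, similarity):
    """Greedy pair-wise merging of volumes.

    Returns the merge history as (id_a, id_b, new_id) tuples, which
    encodes a binary coarse-to-fine tree over the input volumes.
    """
    # Each cluster keeps (mean feature vector, number of merged volumes).
    clusters = {
        i: (np.asarray(f, dtype=float), 1) for i, f in enumerate(mean_features)
    }
    next_id = len(clusters)
    history = []
    while len(clusters) > 1:
        # Pick the pair of current volumes with the largest similarity score.
        a, b = max(
            itertools.combinations(clusters, 2),
            key=lambda pair: similarity(clusters[pair[0]][0], clusters[pair[1]][0]),
        )
        (fa, na), (fb, nb) = clusters.pop(a), clusters.pop(b)
        # The merged volume takes the size-weighted mean of its parts.
        clusters[next_id] = ((na * fa + nb * fb) / (na + nb), na + nb)
        history.append((a, b, next_id))
        next_id += 1
    return history


def neg_distance(f1, f2):
    # Hypothetical score: closer mean features give a larger similarity.
    return -float(np.linalg.norm(f1 - f2))


history = merge_volumes([[0.0], [0.1], [5.0]], neg_distance)
```

Reading `history` from the last entry to the first gives the coarse-to-fine view: the final merge is the root, and earlier merges are progressively finer objects.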

37 Citations

23 Claims

1. A method for segmenting a three dimensional object from a compressed video, the compressed video including a plurality of frames separated in time, and each frame including a plurality of macro-blocks separated in space, comprising:
- parsing transformed coefficients for each macro-block;
  
  determining a spatial/temporal gradient for each macro-block based on the transformed coefficients;
  
  selecting a particular macro-block with a minimum spatial/temporal gradient magnitude as a seed macro-block;
  
  measuring distances between the seed macro-block and spatially and temporally adjacent macro-blocks based on the transformed coefficients; and
  
  growing a volume around the seed macro-block using the adjacent macro-blocks having distances less than a predetermined threshold.
- Dependent claims (2 to 23):
  - 2. The method of claim 1 wherein the plurality of frames are a single shot.
  - 3. The method of claim 1 wherein the plurality of frames include I-frames having DCT coefficients.
  - 4. The method of claim 3 wherein the plurality of frames include P-frames including motion vectors.
  - 5. The method of claim 1 wherein the transformed coefficients are wavelets.
  - 6. The method of claim 1 wherein the transformed coefficients are fast Fourier transform coefficients.
  - 7. The method of claim 1 wherein there is a set of transformed coefficients for each color channel of the compressed video.
  - 8. The method of claim 2 wherein the single shot is detected from the transformed coefficients.
  - 9. The method of claim 4 wherein the transformed coefficients of each macro-block are represented as spatial/temporal data P(m, n, t, k), where (m, n) represents a macro-block index within a particular frame t, and k represents a particular set of transformed coefficients within the macro-block.
  - 10. The method of claim 9 wherein the spatial/temporal gradient magnitude is determined as $|\nabla P(m, n, t, k)| = \sum_{k} w(k) \left[ \alpha_m |P(m+h, n, t, k) - P(m-h, n, t, k)| + \alpha_n |P(m, n+h, t, k) - P(m, n-h, t, k)| + \alpha_t |P(m, n, t+h, k) - P(m, n, t-h, k)| \right]$, where w(k) is a weight of a corresponding set of transformed coefficients, $\alpha_m$ and $\alpha_n$ are weights of the DCT coefficients, $\alpha_t$ is a weight of the motion vector, and h is a derivative step size.
  - 11. The method of claim 10 wherein the minimum spatial/temporal gradient magnitude is $\min |\nabla P(m, n, t, k)|$.
  - 12. The method of claim 1 wherein the selecting, measuring, and growing are repeated until no macro-blocks remain to generate a plurality of volumes.
  - 13. The method of claim 9 wherein the distance between the seed macro-block v and a particular adjacent macro-block q is $d(v, q) = \| P(q) - v \| = \| P(m, n, t) - v \|$, where $\| \cdot \|$ is a particular distance function.
  - 14. The method of claim 13 further comprising:
    - updating a feature vector v for the seed macro-block while growing the volume as $d \leq \lambda \Rightarrow \left\{ \begin{matrix} \text{true} & v = \frac{N v + P(m, n, t)}{N + 1} \\ \text{false} & N = N + 1 \end{matrix} \right.$, where d is the measured distance, λ is the threshold, and N is a next adjacent macro-block.
  - 15. The method of claim 12 further comprising:
    - subsuming individual macro-blocks of a particular volume smaller than a predetermined size into larger similar ones of the plurality of volumes.
  - 16. The method of claim 1 further comprising:
    - assigning a set of self descriptors to the volume.
  - 17. The method of claim 16 wherein the self descriptors include an average of the transformed coefficients of the macro-blocks in the volume, a number of macro-blocks in the volume, a number of macro-blocks on a surface of the volume, a compactness ratio of the volume, a trajectory of the volume, a length of the trajectory, and averaged coordinates of the macro-blocks of the volume.
  - 18. The method of claim 12 further comprising:
    - assigning a set of mutual descriptors to each possible pair of volumes.
  - 19. The method of claim 18 wherein the mutual descriptors include an average distance between trajectories of the pair of volumes, a variance of the distance of the trajectories, a maximum distance between the trajectories, an average change in the distance between the trajectories, an accumulated distance change of the trajectories, a compactness of the pair of volumes, a color difference between the pair of volumes, a number of frames where the pair of volumes coexists.
  - 20. The method of claim 12 further comprising:
    - assigning a set of self descriptors to the volume; and
      
      assigning a set of mutual descriptors to each possible pair of volumes.
  - 21. The method of claim 20 further comprising:
    - merging the plurality of volumes according to the set of self descriptors and the set of mutual descriptors to segment the compressed video into a multi-resolution 3D video objects.
  - 22. The method of claim 21 wherein the merging is pair-wise.
  - 23. The method of claim 21 wherein the merged volumes are maintained in a video object tree.
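
The gradient magnitude of claim 10 and the feature update of claim 14 can be sketched in a few lines. This is a hedged illustration rather than the claimed method itself: the function names are hypothetical, taking the absolute value of each central difference is an assumption suggested by the word "magnitude", border macro-blocks are replicated so that the differences are defined at the edges (also an assumption), and N is read here as the count of macro-blocks already averaged into v.

```python
import numpy as np


def gradient_magnitude(P, w, alpha_m=1.0, alpha_n=1.0, alpha_t=1.0, h=1):
    """Weighted sum of absolute central differences along m, n and t.

    P: array (M, N, T, K) of spatial/temporal data.
    w: length-K weights, one per coefficient set k.
    Returns an array of shape (M, N, T).
    """
    # Assumption: replicate border macro-blocks so P(m +/- h) always exists.
    pad = np.pad(P, ((h, h), (h, h), (h, h), (0, 0)), mode="edge")
    centre = [slice(h, h + size) for size in P.shape[:3]]

    def abs_diff(axis):
        # |P(.. + h ..) - P(.. - h ..)| along one of the axes m, n, t.
        size = P.shape[axis]
        fwd, bwd = list(centre), list(centre)
        fwd[axis] = slice(2 * h, 2 * h + size)
        bwd[axis] = slice(0, size)
        return np.abs(pad[tuple(fwd)] - pad[tuple(bwd)])

    total = alpha_m * abs_diff(0) + alpha_n * abs_diff(1) + alpha_t * abs_diff(2)
    # Sum over k with the weights w(k).
    return total @ np.asarray(w, dtype=float)


def update_feature(v, count, p, d, lam):
    """Running-mean update of the volume feature vector v when d <= lam."""
    if d <= lam:
        v = (count * v + p) / (count + 1)
        count += 1
    return v, count


# Toy data: one coefficient set, a step edge along the m axis.
P = np.zeros((3, 3, 3, 1))
P[2, :, :, 0] = 4.0
g = gradient_magnitude(P, w=[1.0])
seed_index = np.unravel_index(np.argmin(g), g.shape)  # candidate seed macro-block

v, count = update_feature(np.array([2.0]), 1, np.array([4.0]), d=0.5, lam=1.0)
```

The step size `h` and the weights `alpha_m`, `alpha_n`, `alpha_t` are left as parameters because the claims do not fix their values.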

Current Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Original Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Inventors: Porikli, Fatih M.; Sun, Huifang; Divakaran, Ajay
Primary Examiner(s): VO, TUNG T
Application Number: US10/442,417
Publication Number: US 20040233987A1
Time in Patent Office: 1,287 Days
Field of Search: 375/240.01, 375/240.12, 375/240.16, 375/240.19, 382/173, 382/291, 345/419
US Class Current: 375/240.16
CPC Class Codes

G06T 2207/10016   Video; Image sequence

G06T 2207/20048   Transform domain processing

G06T 2207/20101   Interactive definition of p...

G06T 7/11   Region-based segmentation

G06T 7/187   involving region growing; i...

G06V 10/26   Segmentation of patterns in...

H04N 19/48   using compressed domain pro...

H04N 19/87   involving scene cut or scen...
