Methods of scene change detection and fade detection for indexing of video sequences
Abstract
This invention relates to methods of abrupt scene change detection and fade detection for indexing of MPEG-2 and MPEG-4 compressed video sequences. Abrupt scene change and fade-detection techniques applied to signals in compressed form have reasonable accuracy and the advantage of high simplicity since they are based on entropy decoding and do not require computationally expensive inverse Discrete Cosine Transformation (DCT).
21 Claims
-
1. A method of processing digital video data in compressed form, comprising:
-
processing a sequence of digital video data in compressed form, including at least I-frames and P-frames temporally disposed between the I-frames, to determine corresponding DC image values for each of the I-frames;
detecting instances of differences in the DC image values for pairs of temporally adjacent ones of the I-frames exceeding a first threshold value, to identify a potential scene change between a pair of the I-frames;
determining bit allocation distribution differences for the P-frames temporally disposed between the pair of I-frames; and
detecting the occurrence of a scene change by comparing the determined bit allocation distribution differences over a predetermined number of neighboring P-frames temporally disposed between the pair of I-frames with each other to identify a normalized bit rate difference for one of the neighboring P-frames which is greater than the normalized bit rate difference associated with any of its neighboring P-frames.
-
2. The method of claim 1, wherein:
the digital video data in compressed form includes at least a first discrete cosine transform (DCT) coefficient associated with each block of each macroblock of each of the I-frames; and
the processing to determine the DC image values for each of the I-frames comprises averaging the first DCT coefficients for each block of each macroblock to form a set of DC image values for each I-frame; and
the detecting of instances of differences in the DC image values includes comparing sets of corresponding ones of the DC image values for the pairs of I-frames.
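As an illustration only, the averaging step recited above might be sketched as follows. The function name `dc_image` and the input layout (one list of first DCT coefficients per macroblock) are assumptions made for this sketch and do not come from the claims; the claim language could also be read as producing some other grouping of averages.

```python
from statistics import fmean


def dc_image(macroblocks):
    """Return one DC image value per macroblock of an I-frame.

    `macroblocks` is a hypothetical layout: a list of macroblocks, each
    given as a list of the first (DC) DCT coefficient of its blocks.
    The averaging per macroblock is one reading of the claim.
    """
    return [fmean(block_dcs) for block_dcs in macroblocks]


# Two macroblocks with four luminance blocks each.
frame = [[8, 8, 8, 8], [0, 4, 8, 12]]
print(dc_image(frame))
```

Because only the first coefficient of each block is read, a sketch like this needs entropy decoding of the bitstream but no inverse DCT.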
-
-
3. The method of claim 1, wherein:
-
the digital video data in compressed form includes at least a first discrete cosine transform (DCT) coefficient associated with each block of each macroblock of each object in each of the I-frames; and
the processing to determine the DC image values for each of the I-frames includes averaging the first DCT coefficients for each block of each macroblock of each object to form a set of DC image values for that object for that I-frame; and
the detecting of instances of differences in the DC image values includes comparing sets of the DC image values for corresponding objects in the pairs of I-frames.
-
-
4. The method of claim 3, wherein:
the corresponding objects are those objects occupying the closest corresponding space in the pairs of I-frames.
-
5. The method of claim 3, wherein:
the first threshold value is approximately 0.25.
-
6. The method of claim 1, wherein:
-
the determining of bit allocation distribution differences for the P-frames temporally disposed between the pair of I-frames includes:
for each object in each P-frame temporally disposed between the pair of I-frames, constructing a bit rate matrix including the number of bits required to encode each macroblock in its respective frame;
comparing the bit rate matrix for each object in each temporally adjacent pair of the P-frames temporally disposed between the pair of I-frames to determine a rate difference; and
normalizing the rate difference for each object in each such pair of P-frames.
-
-
7. The method of claim 6, wherein the bit rate difference represents the sum of the absolute values of macroblock by macroblock bit differences.
-
8. The method of claim 6, wherein the normalizing is performed by dividing by the total number of bits used to encode the object.
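The rate difference of claims 6 to 8 can be illustrated with a short sketch. The name `normalized_rate_difference` and the row-by-row matrix layout are hypothetical, and the choice to normalize by the bit total of the later frame is an assumption, since the claims do not say which frame's total is used.

```python
def normalized_rate_difference(bits_prev, bits_curr):
    """Compare two bit rate matrices of the same object in adjacent P-frames.

    Each matrix holds the number of bits spent on each macroblock.
    The difference is the sum of absolute macroblock-by-macroblock bit
    differences, divided by the total bits of the later frame
    (assumed normalizer).
    """
    diff = sum(
        abs(a - b)
        for row_prev, row_curr in zip(bits_prev, bits_curr)
        for a, b in zip(row_prev, row_curr)
    )
    total = sum(sum(row) for row in bits_curr)
    return diff / total if total else 0.0


prev = [[10, 20], [30, 40]]
curr = [[10, 25], [20, 45]]
print(normalized_rate_difference(prev, curr))
```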
-
9. The method of claim 1, wherein the scene change is an abrupt scene change and the difference for the one P-frame is greater by a factor substantially equal to 2.
-
10. The method of claim 1, wherein:
the predetermined number is at least five.
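One way to picture the neighbor comparison of claims 1, 9 and 10 is the sketch below. The function name, the centered window, and the use of `>=` for "substantially equal to 2" are assumptions of this sketch rather than details taken from the claims.

```python
def find_abrupt_change(diffs, window=5, factor=2.0):
    """Return the index of the first P-frame whose normalized bit rate
    difference is at least `factor` times that of every neighbor inside
    a window of `window` frames centered on it, or None if there is none.
    """
    half = window // 2
    for i, d in enumerate(diffs):
        lo = max(0, i - half)
        hi = min(len(diffs), i + half + 1)
        neighbors = diffs[lo:i] + diffs[i + 1:hi]
        if neighbors and all(d >= factor * n for n in neighbors):
            return i
    return None


# Hypothetical normalized differences for P-frames between two I-frames.
diffs = [0.10, 0.12, 0.09, 0.60, 0.11, 0.10, 0.08]
print(find_abrupt_change(diffs))
```

With the sample values, only the fourth entry dominates all of its neighbors by the required factor, so the sketch reports index 3.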
-
11. The method of claim 1 wherein the detecting of instances of differences in the DC image values includes:
-
comparing DC values for corresponding ones of macroblocks in the pairs of I-frames to determine differences between the DC image values;
summing absolute values of all such differences for all macroblocks in the pair of I-frames;
dividing the sum of absolute values by the sum of absolute values of the DC values for all the macroblocks to determine an average difference per macroblock; and
comparing the average difference to the first threshold value and identifying the potential scene change when the average difference is greater than the first threshold.
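The steps of claims 11 and 12 can be sketched as follows. The function names are hypothetical, and the sketch assumes the denominator is the sum of absolute DC values of the earlier I-frame, which the claim leaves open.

```python
def normalized_dc_difference(dc_prev, dc_next):
    """Sum of absolute DC differences between corresponding macroblocks,
    divided by the sum of absolute DC values of the earlier frame
    (assumed choice of normalizer)."""
    num = sum(abs(a - b) for a, b in zip(dc_prev, dc_next))
    den = sum(abs(a) for a in dc_prev)
    return num / den if den else 0.0


def is_potential_scene_change(dc_prev, dc_next, threshold=0.25):
    """Flag a potential scene change when the difference exceeds the threshold."""
    return normalized_dc_difference(dc_prev, dc_next) > threshold


prev = [100, 100, 100, 100]
print(is_potential_scene_change(prev, [100, 100, 40, 100]))  # difference 0.15
print(is_potential_scene_change(prev, [10, 100, 40, 100]))   # difference 0.375
```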
-
-
12. The method of claim 11 wherein:
the first threshold value is approximately 0.25.
-
13. The method of claim 1, further comprising:
-
detecting the scene change by also determining a number of positive and a number of negative DC residual coefficients in the P-frames temporally disposed between the pair of I-frames by:
determining DC residual coefficients, including sign information for each of the P-frames temporally disposed between the pair of I-frames; and
determining the number of positive and the number of negative DC residual coefficients in the P-frames temporally disposed between the pair of I-frames in excess of a second threshold value to locate fade-in and fade-out scene changes, respectively.
-
-
14. The method of claim 13, wherein the determining of the number of positive and the number of negative DC residual coefficients includes:
-
counting the number of blocks having positive DC components and the number of blocks having negative DC components in each of the P-frames temporally disposed between the pair of I-frames;
determining which count is greater and the sign of the greater for each such P-frame;
identifying each P-frame where the greater count is equal to or more than a predetermined majority of the non-zero DC components in such frame;
identifying each group of pictures in which DC components of a particular sign consistently exceed those of opposite sign;
designating a fade-in scene change where the greater count in each such identified group of pictures is associated with a positive sign; and
designating a fade-out scene change where the greater count in each such identified group of pictures is associated with a negative sign.
-
-
15. The method of claim 14 wherein:
the identifying of each P-frame further comprises identifying each P-frame where the greater count is equal to or more than 60 percent of the non-zero DC components in such frame.
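A rough sketch of the sign-counting test of claims 13 to 15 is given below. The names `dominant_sign` and `classify_fade` are hypothetical, and reading "consistently exceed" as "in every P-frame of the group of pictures" is an assumption of the sketch.

```python
def dominant_sign(dc_residuals, majority=0.6):
    """Return +1 or -1 when that sign accounts for at least `majority`
    of the non-zero DC residuals of a P-frame, else 0."""
    pos = sum(1 for v in dc_residuals if v > 0)
    neg = sum(1 for v in dc_residuals if v < 0)
    nonzero = pos + neg
    if nonzero == 0:
        return 0
    if pos > neg and pos >= majority * nonzero:
        return 1
    if neg > pos and neg >= majority * nonzero:
        return -1
    return 0


def classify_fade(gop_p_frames, majority=0.6):
    """Label a group of pictures by the sign that dominates every P-frame."""
    signs = [dominant_sign(frame, majority) for frame in gop_p_frames]
    if signs and all(s == 1 for s in signs):
        return "fade-in"
    if signs and all(s == -1 for s in signs):
        return "fade-out"
    return None


# Hypothetical DC residuals for two P-frames of one group of pictures.
print(classify_fade([[3, 2, 1, -1, 0], [5, 4, -2, 1, 1]]))
```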
-
16. A method of processing digital video data in compressed form, comprising:
-
processing a sequence of digital video data in compressed form, including at least I-frames and P-frames temporally disposed between the I-frames, to determine corresponding DC image values for each of the I-frames; and
detecting instances of differences in the DC image values for pairs of temporally adjacent ones of the I-frames exceeding a threshold value, by comparing sets of the DC image values for corresponding objects in the pairs of I-frames, to identify a potential scene change between a pair of the I-frames;
wherein the corresponding objects in temporally adjacent I-frames have different areas, and wherein the comparing of sets of DC image values for corresponding objects in the pairs of I-frames includes summation of the differences in the DC image values for each of the objects weighted by its respective area.
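The area weighting of claim 16 might look like the following sketch. The function name is hypothetical, and two details are assumptions: each object contributes a single representative area (for example its area in the later I-frame), and the weighted sum is divided by the total area so the result stays comparable to a threshold.

```python
def area_weighted_difference(objects):
    """Combine per-object DC differences into one frame-level value.

    `objects` is a list of (dc_difference, area) pairs, one per pair of
    corresponding objects. Dividing by total area is an assumption of
    this sketch; the claim only recites an area-weighted summation.
    """
    total_area = sum(area for _, area in objects)
    if total_area == 0:
        return 0.0
    return sum(diff * area for diff, area in objects) / total_area


# A large, stable background object and a small, changing foreground object.
print(area_weighted_difference([(0.1, 300), (0.5, 100)]))
```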
-
-
17. A method of processing digital video data in compressed form, comprising:
-
processing a sequence of digital video data in compressed form, including at least I-frames and P-frames temporally disposed between the I-frames, each macroblock of each P-frame including motion vector bits and residue bits, to determine corresponding DC image values for each of the I-frames;
detecting instances of differences in the DC image values for pairs of temporally adjacent ones of the I-frames exceeding a first threshold value, to identify a potential scene change between a pair of the I-frames;
determining first bit allocation distribution differences for the P-frames temporally disposed between the pair of I-frames in excess of a second threshold value; and
detecting the occurrence of the scene change by comparing data corresponding to the motion vector bits of the macroblocks for the P-frames temporally disposed between the pair of I-frames which have a first bit allocation distribution difference in excess of the second threshold value, to determine second bit allocation distribution differences, and comparing data corresponding to the residue bits of the macroblocks for the P-frames temporally disposed between the pair of I-frames which have the first bit allocation distribution difference in excess of the second threshold value, to determine third bit allocation distribution differences.
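The staged comparison of claim 17 can be sketched as below. All names are hypothetical, each macroblock is represented as an assumed `(motion_vector_bits, residue_bits)` pair, and the normalization by the later frame's bit total is an assumption, since the claim does not define how the differences are computed.

```python
def _normalized_diff(prev, curr):
    """Sum of absolute differences divided by the later frame's total."""
    diff = sum(abs(a - b) for a, b in zip(prev, curr))
    total = sum(curr)
    return diff / total if total else 0.0


def staged_bit_differences(prev_mbs, curr_mbs, second_threshold):
    """Return (first, second, third) differences for a pair of P-frames,
    or None when the first difference does not exceed the threshold.

    first:  difference on total bits per macroblock
    second: difference on motion vector bits only
    third:  difference on residue bits only
    """
    first = _normalized_diff(
        [mv + res for mv, res in prev_mbs],
        [mv + res for mv, res in curr_mbs],
    )
    if first <= second_threshold:
        return None
    second = _normalized_diff([mv for mv, _ in prev_mbs], [mv for mv, _ in curr_mbs])
    third = _normalized_diff([res for _, res in prev_mbs], [res for _, res in curr_mbs])
    return first, second, third


prev = [(10, 40), (10, 40)]
curr = [(30, 120), (10, 40)]
print(staged_bit_differences(prev, curr, second_threshold=0.3))
```

Computing the motion vector and residue comparisons only for frames that pass the first test keeps the more detailed work limited to candidate frames.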