Pre-processing method and system for data reduction of video sequences and bit rate reduction of compressed video sequences using spatial filtering

US 7,809,207 B2
Filed: 08/04/2008
Issued: 10/05/2010
Est. Priority Date: 08/13/2003
Status: Active Grant

Abstract

Methods for pre-processing video sequences prior to compression to provide data reduction of the video sequence. Also, after compression of the pre-processed video sequence, the bit rate of the pre-processed and compressed video sequence will be lower than the bit rate of the video sequence after compression but without pre-processing. Pre-processing may include spatial anisotropic diffusion filtering such as Perona-Malik filtering, Fallah-Ford filtering, or omni-directional filtering that extends Perona-Malik filtering to perform filtering in at least one diagonal direction. Pre-processing may also include performing filtering differently on a foreground region than on a background region of a video frame. This method includes identifying pixel locations having pixel values matching characteristics of human skin and determining a bounding shape for each contiguous grouping of matching pixel locations. The foreground region is comprised of pixel locations contained in a bounding shape and the background region is comprised of all other pixel locations.
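
The abstract names Perona-Malik filtering without giving its update rule. The sketch below shows the standard four-neighbour form of that scheme in plain Python, as one way to picture edge-preserving data reduction. The function name `perona_malik`, the parameters `iterations`, `kappa` and `lam`, and their default values are illustrative assumptions, not values taken from the patent. The Fallah-Ford and omni-directional (diagonal) variants are not covered.

```python
import math

def perona_malik(img, iterations=10, kappa=15.0, lam=0.2):
    """Anisotropic diffusion on a 2-D list of numbers (4-neighbour scheme).

    kappa sets the contrast above which a difference is treated as an edge;
    lam is the step size (kept at or below 0.25 for stability).
    """
    h, w = len(img), len(img[0])
    cur = [[float(v) for v in row] for row in img]

    def g(d):
        # Edge-stopping function: close to 1 for small differences,
        # close to 0 across strong edges.
        return math.exp(-(d / kappa) ** 2)

    for _ in range(iterations):
        nxt = [row[:] for row in cur]
        for y in range(h):
            for x in range(w):
                total = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        d = cur[ny][nx] - cur[y][x]
                        total += g(d) * d
                nxt[y][x] = cur[y][x] + lam * total
        cur = nxt
    return cur

# Small-amplitude texture on both sides of a strong step edge.
row = [10, 12, 10, 12, 200, 202, 200, 202]
smoothed = perona_malik([row, row, row])
```

With these assumed settings the low-contrast texture is flattened while the step between the two halves is left almost untouched, because the edge-stopping function is close to zero for large differences.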

52 Citations

32 Claims

1. A method for pre filtering processing an original video sequence, the original video sequence comprising a plurality of video frames, the method comprising:
- for each video frame of the plurality of video frames of the original video sequence;
  
  identifying a bounding geometric shape that encloses at least a portion of an important region-of-interest in the video frame, the bounding geometric shape serving as a foreground region;
  
  identifying a portion of the video frame outside the bounding geometric shape as an unimportant background region;
  
  applying a first filter operation in the foreground region and not in the background region, the first filter operation providing data reduction in the foreground region; and
  
  applying a second filter operation in the background region and not in the foreground region, the second filter operation providing greater data reduction in the unimportant background region than the first filter operation would provide if applied to a same region; and
  
  encoding the plurality of video frames after the first and second filter operations have been applied to each of the plurality of video frames.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein the bounding geometric shape comprises one of at least four-sided, three-sided, and circular form.
  - 3. The method of claim 1, wherein each video frame comprises a plurality of pixel locations, wherein each pixel location in the region-of-interest has a chrominance value within a predetermined low chrominance threshold value and a predetermined high chrominance threshold value.
  - 4. The method of claim 1, wherein each video frame comprises a plurality of pixel locations, wherein the bounding geometric shape encloses over ½ of pixel locations in the region-of-interest.
  - 5. The method of claim 1, wherein the first filter operation provides data reduction while preserving sharp edges in the foreground region.
  - 6. The method of claim 1, wherein each video frame comprises a plurality of pixel locations, wherein each pixel location comprises a pixel value, the method further comprising identifying pixel locations that have pixel values in a range of a human skin tone to identify the bounding geometric shape.
  - 7. The method of claim 1, wherein the bounding geometric shape that encloses at least a portion of the region-of-interest is an approximation of the region-of-interest.

8. A method for processing an original video sequence, the original video sequence comprising a plurality of video frames, each video frame comprising a plurality of pixel locations, the method comprising:
- for each video frame;
  
  specifying a bounding shape that encloses at least a portion of a region-of-interest in the video frame from the plurality of video frames of the original video sequence;
  
  filtering pixel locations in the bounding shape differently than other pixel locations in the video frame;
  
  outputting a pre-filtered video sequence comprising a plurality of pre-filtered video frames; and
  
  compressing the pre-filtered video sequence using a compression method to produce a pre-filtered and compressed video sequence, wherein a bit rate associated with the pre-filtered and compressed video sequence is lower than a bit rate that would result from compressing the original video sequence using the compression method without performing the specifying of the bounding shape and the filtering of pixel locations.
- Dependent claims (9, 10, 11, 12, 13):
  - 9. The method of claim 8, wherein said filtering pixel locations in the bounding shape differently is for providing a greater data reduction of unimportant regions outside the bounding shape while preserving sharp edges for regions in the bounding shape.
  - 10. The method of claim 8, wherein the region-of-interest comprises pixel locations having pixel values that match characteristics of a human skin.
  - 11. The method of claim 8, wherein the region-of-interest comprises contiguous groupings of matching pixel locations.
  - 12. The method of claim 8, wherein the region-of-interest is defined by a spatial proximity, wherein matching pixel locations within a specified distance are grouped together in the region-of-interest.
  - 13. The method of claim 8, wherein the bounding shape is a geometric shape utilized to quickly bound the region-of-interest in the video frame.
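
Claim 8 turns on the compressed bit rate being lower with pre-filtering than without it. The sketch below illustrates that comparison under loudly stated assumptions: `zlib` stands in for a video codec, a box blur stands in for the background pre-filter, the frame is synthetic noise, and the names `box_blur`, `prefilter` and `compressed_size` are hypothetical. None of these choices come from the patent.

```python
import random
import zlib

def box_blur(frame, radius):
    """Mean filter over a (2*radius+1)^2 window, clipped at the frame border."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [frame[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) // len(vals)
    return out

def prefilter(frame, box, radius=2):
    """Blur only pixels outside box = (top, left, bottom, right), inclusive."""
    top, left, bottom, right = box
    blurred = box_blur(frame, radius)
    return [
        [frame[y][x] if top <= y <= bottom and left <= x <= right else blurred[y][x]
         for x in range(len(frame[0]))]
        for y in range(len(frame))
    ]

def compressed_size(frame):
    """Byte count after zlib, used here only as a stand-in for a video codec."""
    return len(zlib.compress(bytes(v for row in frame for v in row)))

rng = random.Random(0)                       # fixed seed for repeatability
frame = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
box = (16, 16, 31, 31)                       # assumed region-of-interest
filtered = prefilter(frame, box)
print(compressed_size(frame), compressed_size(filtered))
```

Pixels inside the box are passed through unchanged, so any size difference comes from the background alone.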

14. A non-transitory computer readable storage medium storing a computer program for pre-filtering an original video sequence, the original video sequence comprising a plurality of video frames, the computer program executable by at least one processor, the computer program comprising sets of instructions for:
- specifying a bounding geometric shape that encloses at least a portion of a region-of-interest of a video frame from the plurality of video frames of the original video sequence; and
  
  applying a pre-filter operation to a region outside the bounding geometric shape and not to a region bounded by the bounding geometric shape, the pre-filter operation for reducing data content of unimportant regions in the video frame while preserving sharp edges inside the bounding geometric shape.
- Dependent claims (15, 16, 17, 18):
  - 15. The non-transitory computer readable storage medium of claim 14, wherein each video frame comprises a plurality of pixel locations, wherein each pixel location in the region-of-interest has a chrominance value within a predetermined low chrominance threshold value and a predetermined high chrominance threshold value.
  - 16. The computer readable storage medium of claim 14, wherein the region-of-interest comprises a totality of regions in the video frame enclosed within multiple bounding shapes.
  - 17. The computer readable storage medium of claim 16, wherein a background region is comprised of a totality of regions in the video frame not enclosed within the multiple bounding shapes and the pre-filter operation is applied to the background region.
  - 18. The computer readable storage medium of claim 14 further comprising a set of instructions for applying a different pre-filter operation to regions inside the bounding geometric shape and not to the region outside of the bounding geometric shape.

19. A method for processing a plurality of video frames, each video frame comprising a plurality of pixel locations, each pixel location comprising a pixel value, the method comprising:
- automatically identifying a foreground region for a video frame by identifying a bounding geometric shape that encloses at least a portion of a region-of-interest in the video frame and associating a region enclosed by the bounding geometric shape with the foreground region;
  
  identifying a region outside the geometric bounding shape as a background region of the video frame;
  
  filtering the background region and not the foreground region with a pre-filter operation, the pre-filter operation for reducing data content of unimportant regions in the video frame while preserving sharp edges inside the bounding geometric shape; and
  
  after filtering the background region in the video frame, encoding the video frame.
- Dependent claims (20, 21):
  - 20. The method of claim 19, wherein automatically identifying the foreground region comprises identifying a set of pixel locations that comprises pixel values between a minimum chrominance value and a maximum chrominance value.
  - 21. The method of claim 20, wherein the minimum and maximum chrominance values are defined by a chrominance range for a human face or skin.
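
Claims 19 to 21 do not say how the bounding geometric shape is derived from the matching pixel locations. One possibility, following the abstract's "contiguous grouping of matching pixel locations", is to flood-fill each group and record its enclosing rectangle. The sketch below assumes 4-connectivity and axis-aligned rectangles; the function name `bounding_boxes` and the boolean `match` grid are hypothetical.

```python
from collections import deque

def bounding_boxes(match):
    """Return one (top, left, bottom, right) box per 4-connected group of True cells."""
    h, w = len(match), len(match[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not match[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited matching pixel.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and match[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

grid = [
    "..##....",
    "..##....",
    "......#.",
    ".....##.",
]
match = [[c == "#" for c in row] for row in grid]
print(bounding_boxes(match))  # [(0, 2, 1, 3), (2, 5, 3, 6)]
```

The second box also encloses one non-matching pixel, which is consistent with the bounding shape being an approximation of the region-of-interest.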

22. A method for pre-filtering a plurality of video frames to reduce data content for encoding, each video frame comprising a plurality of pixel locations, each pixel location comprising first and second pixel values, the method comprising:
- automatically identifying a foreground region for a video frame by identifying a region-of-interest in the video frame and associating the region-of-interest with the foreground region, wherein automatically identifying the foreground region comprises identifying, from the plurality of pixel locations, a set of pixel locations that each comprises (i) a first pixel value between a first minimum chrominance value and a first maximum chrominance value, and (ii) a second pixel value between a second minimum chrominance value and a second maximum chrominance value;
  
  identifying a background region of the video frame as pixel locations not in the foreground region; and
  
  filtering the video frame by filtering the foreground region and the background region differently.
- Dependent claims (23, 24, 25, 26):
  - 23. The method of claim 22, wherein automatically identifying the foreground region comprises specifying a bounding shape for at least a portion of the region-of-interest of the video frame.
  - 24. The method of claim 22, wherein automatically identifying the foreground region comprises:
    - identifying a plurality of regions in the video frame; and
      
      identifying the foreground region as a totality of said plurality of regions.
  - 25. The method of claim 22 further comprising combining the filtered background region and the filtered foreground region into a single filtered video frame.
  - 26. The method of claim 22, wherein said filtering the foreground region and the background region differently is for achieving a greater data reduction of the background region containing unimportant images over the foreground region containing important images in the video frame.
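
The two-range test in claim 22 can be pictured as a per-pixel check on two chrominance values. The sketch below assumes the two values are Cb and Cr computed from 8-bit RGB with the full-range BT.601 (JPEG-style) equations. The numeric ranges are illustrative values of the kind used in the skin-detection literature; the claims do not state any numbers, and the names `rgb_to_cbcr`, `in_foreground` and `match_map` are hypothetical.

```python
def rgb_to_cbcr(r, g, b):
    """Full-range BT.601 (JPEG-style) chrominance from 8-bit RGB."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

# Assumed thresholds for illustration only; not taken from the patent.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)

def in_foreground(r, g, b, cb_range=CB_RANGE, cr_range=CR_RANGE):
    """True when both chrominance values fall inside their respective ranges."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]

def match_map(frame_rgb):
    """Boolean grid marking pixel locations that pass both range tests."""
    return [[in_foreground(*px) for px in row] for row in frame_rgb]

frame_rgb = [
    [(224, 172, 140), (0, 0, 255)],
    [(128, 128, 128), (0, 255, 0)],
]
print(match_map(frame_rgb))
```

Pixel locations marked `False` would form the background region in the sense of claim 22.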

27. A method for pre-filtering a plurality of video frames for a video conference, wherein each video frame comprises a plurality of pixel locations, the method comprising:
- identifying a region-of-interest of a video frame of the plurality of video frames by identifying a plurality of pixel locations having attributes similar to a human skin tone;
  
  bounding, with a bounding geometric shape, an approximation of the region-of-interest to identify a foreground region as a region within the bounding geometric shape;
  
  defining a first binary mask for pixel locations inside the bounding geometric shape;
  
  defining a second binary mask for a background region for the video frame, the background region covering pixel locations outside the bounding geometric shape;
  
  filtering the video frame by using the first binary mask to filter the foreground region using a first pre-filter operation and using the second binary mask to filter the background region using a second pre-filter operation; and
  
  combining the filtered foreground region and the filtered background region to form a single filtered video frame.
- Dependent claims (28, 29, 30, 31, 32):
  - 28. The method of claim 27, wherein the foreground region comprises a first set of pixel locations, and the background region comprises a second set of pixel locations.
  - 29. The method of claim 28, wherein the first binary mask comprises a plurality of values, each value corresponding to a pixel location of the video frame, wherein the value is a value of 1 for each pixel location from the first set of pixel locations, wherein the value is a value of 0 for each pixel location from the second set of pixel locations.
  - 30. The method of claim 28, wherein the second binary mask comprises a plurality of values, each value corresponding to a pixel location of the video frame, wherein the value is a value of 1 for each pixel location from the second set of pixel locations, wherein the value is a value of 0 for each pixel location from the first set of pixel locations.
  - 31. The method of claim 27, wherein the foreground region is filtered differently than the background region so that image data is reduced in the background region containing unimportant images more than the foreground region containing important images in the video frame.
  - 32. The method of claim 27, wherein the plurality of pixel locations having attributes similar to the human skin tone comprises pixel locations with pixel values approximating the human skin tone.
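
The complementary masks of claims 29 and 30 and the combining step of claim 27 can be sketched as simple per-pixel arithmetic. The example assumes a rectangular bounding shape and uses stand-in results for the two pre-filter operations (the foreground result leaves the frame unchanged, the background result flattens it to its mean); the function names `make_masks` and `combine` are hypothetical.

```python
def make_masks(height, width, box):
    """Return (foreground_mask, background_mask) for box = (top, left, bottom, right), inclusive."""
    top, left, bottom, right = box
    fg = [[1 if top <= y <= bottom and left <= x <= right else 0
           for x in range(width)] for y in range(height)]
    bg = [[1 - v for v in row] for row in fg]   # 1 exactly where fg is 0
    return fg, bg

def combine(fg_mask, bg_mask, fg_filtered, bg_filtered):
    """Merge two filtered frames into one, selecting each pixel by its mask."""
    h, w = len(fg_mask), len(fg_mask[0])
    return [[fg_mask[y][x] * fg_filtered[y][x] + bg_mask[y][x] * bg_filtered[y][x]
             for x in range(w)] for y in range(h)]

frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
fg_mask, bg_mask = make_masks(4, 4, (1, 1, 2, 2))
fg_filtered = [row[:] for row in frame]        # stand-in for the first pre-filter
bg_filtered = [[85] * 4 for _ in range(4)]     # stand-in for the second pre-filter (frame mean)
out = combine(fg_mask, bg_mask, fg_filtered, bg_filtered)
for row in out:
    print(row)
```

Because the two masks sum to 1 at every pixel location, each output pixel comes from exactly one of the two filtered frames.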

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Dumitras, Adriana; Salsbury, Ryan R.; Normile, James Oliver
Primary Examiner(s): PATEL, KANJIBHAI B
Application Number: US12/185,777
Publication Number: US 20080292201A1
Time in Patent Office: 792 Days
Field of Search: 382/232-233, 382/260-264, 382/274-275, 382/103, 382/282, 348/14.08, 348/14.13, 345/691, 345/694, 375/240.29, 375/254
US Class Current: 382/260
CPC Class Codes:
G06V 40/162   using pixel segmentation or...
H04N 19/117   Filters, e.g. for pre-proce...
H04N 19/136   Incoming video signal chara...
H04N 19/14   Coding unit complexity, e.g...
H04N 19/17   the unit being an image reg...
H04N 19/186   the unit being a colour or ...
H04N 19/192   the adaptation method, adap...
H04N 19/80   Details of filtering operat...
H04N 5/21   Circuitry for suppressing o...
H04N 9/8042   involving data reduction
