Multi-mode region-of-interest video object segmentation

US 8,150,155 B2
Filed: 02/07/2006
Issued: 04/03/2012
Est. Priority Date: 02/07/2006
Status: Active Grant

Abstract

The disclosure is directed to techniques for automatic segmentation of a region-of-interest (ROI) video object from a video sequence. ROI object segmentation enables selected ROI or “foreground” objects of a video sequence that may be of interest to a viewer to be extracted from non-ROI or “background” areas of the video sequence. Examples of a ROI object are a human face or a head and shoulder area of a human body. The disclosed techniques include a hybrid technique that combines ROI feature detection, region segmentation, and background subtraction. In this way, the disclosed techniques may provide accurate foreground object generation and low-complexity extraction of the foreground object from the video sequence. A ROI object segmentation system may implement the techniques described herein. In addition, ROI object segmentation may be useful in a wide range of multimedia applications that utilize video sequences, such as video telephony applications and video surveillance applications.

137 Citations

34 Claims

1. A method performed by a video coding device, the method comprising:
- receiving a video frame of a video sequence;
  
  applying, in the video coding device, one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode, wherein each of the first and second segmentation modes comprise modes of segmenting region of interest (ROI) objects from the video frame and wherein applying the one or more segmentation mode decision factors to select the segmentation mode comprises determining a computational complexity of the video frame by determining a number of ROI features within the video frame and selecting the first segmentation mode when the computational complexity is above a pre-determined level;
  
  segmenting, in the video coding device, an ROI object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  segmenting, in the video coding device, an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 2. The method of claim 1, wherein the different video frame is a previous video frame in the video sequence.
  - 3. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining a desired segmentation quality of the video frame from an end-user, and selecting the first segmentation mode when the desired segmentation quality is above a pre-determined level.
  - 4. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining an amount of similarity between the video frame and the different video frame of the video sequence, and selecting the first segmentation mode when the amount of similarity is below a pre-determined level.
  - 5. The method of claim 4, wherein determining the amount of similarity comprises comparing a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 6. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining an amount of motion activity between the video frame and the different video frame of the video sequence, and selecting the first segmentation mode when the amount of motion activity is above a pre-determined level.
  - 7. The method of claim 6, wherein determining the amount of motion activity comprises comparing a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 8. The method of claim 1, wherein the different video frame directly precedes the video frame of the video sequence, wherein applying the one or more segmentation mode decision factors comprises determining the segmentation mode used to segment the different video frame of the video sequence, and selecting the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 9. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode, and selecting the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.
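
The decision factors recited in claims 1, 3, 4, 6, 8 and 9 can be sketched as a single selection function. This is only an illustrative sketch, not the patented implementation: the names `Mode`, `Factors`, `Thresholds` and `select_mode`, every threshold value, the order in which the factors are checked, and the choice of the first mode when no earlier frame exists are all assumptions that do not appear in the claims.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    FIRST = "first"    # segment without motion information
    SECOND = "second"  # segment using motion information from a different frame


@dataclass(frozen=True)
class Factors:
    roi_feature_count: int         # claim 1: stands in for computational complexity
    desired_quality: float         # claim 3: requested by the end user
    similarity: float              # claim 4: similarity to the different frame
    motion_activity: float         # claim 6: motion relative to the different frame
    previous_mode: Optional[Mode]  # claim 8: None when there is no earlier frame
    consecutive_second: int        # claim 9: run length of second-mode frames


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical "pre-determined levels"; the claims give no numeric values.
    roi_feature_count: int = 4
    desired_quality: float = 0.8
    similarity: float = 0.7
    motion_activity: float = 0.3
    consecutive_second: int = 10


def select_mode(f: Factors, t: Thresholds = Thresholds()) -> Mode:
    # Assumption: without an earlier frame there is no motion information to use.
    if f.previous_mode is None:
        return Mode.FIRST
    if f.roi_feature_count > t.roi_feature_count:    # claim 1
        return Mode.FIRST
    if f.desired_quality > t.desired_quality:        # claim 3
        return Mode.FIRST
    if f.similarity < t.similarity:                  # claim 4
        return Mode.FIRST
    if f.motion_activity > t.motion_activity:        # claim 6
        return Mode.FIRST
    if f.consecutive_second > t.consecutive_second:  # claim 9
        return Mode.FIRST
    # No factor forced the first mode; this also covers the claim 8 case
    # in which the preceding frame was segmented in the first mode.
    return Mode.SECOND
```

For example, `select_mode(Factors(2, 0.5, 0.9, 0.1, Mode.FIRST, 0))` falls through every check and selects the second mode, while raising `roi_feature_count` above the assumed level of 4 selects the first mode.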

10. A non-transitory computer-readable medium having stored thereon instructions that when executed by a programmable processor cause the programmable processor to:
- receive a video frame of a video sequence;
  
  apply one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode, wherein each of the first and second segmentation modes comprise modes of segmenting region of interest (ROI) objects from the video frame and wherein to apply the one or more segmentation mode decision factors to select the segmentation mode, the instructions cause the processor to determine a computational complexity of the video frame by determining a number of ROI features within the video frame and select the first segmentation mode when the computational complexity is above a pre-determined level;
  
  segment an ROI object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  segment an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 11. The non-transitory computer-readable medium of claim 10, wherein the instructions cause the programmable processor to determine a desired segmentation quality of the video frame from an end-user, and select the first segmentation mode when the desired segmentation quality is above a pre-determined level.
  - 12. The non-transitory computer-readable medium of claim 10, wherein the instructions cause the programmable processor to determine an amount of similarity between the video frame and the different video frame of the video sequence, and select the first segmentation mode when the amount of similarity is below a pre-determined level.
  - 13. The non-transitory computer-readable medium of claim 12, wherein the instructions cause the programmable processor to compare a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 14. The non-transitory computer-readable medium of claim 10, wherein the instructions cause the programmable processor to determine an amount of motion activity between the video frame and the different video frame of the video sequence, and select the first segmentation mode when the amount of motion activity is above a pre-determined level.
  - 15. The non-transitory computer-readable medium of claim 14, wherein the instructions cause the programmable processor to compare a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 16. The non-transitory computer-readable medium of claim 10, wherein the different video frame directly precedes the video frame of the video sequence, wherein the instructions cause the programmable processor to determine the segmentation mode used to segment the different video frame of the video sequence, and select the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 17. The non-transitory computer-readable medium of claim 10, wherein the instructions cause the programmable processor to determine a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode prior to the video frame, and select the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.
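
The color-histogram comparison of claims 12 and 13 might look like the following. The claims do not say how the histograms are built or compared, so this minimal sketch assumes a coarse quantized RGB histogram and histogram intersection as the similarity measure; `color_histogram`, `histogram_similarity` and `bins_per_channel` are hypothetical names.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized histogram over quantized (r, g, b) bins.

    `pixels` is an iterable of (r, g, b) tuples with 8-bit channel values.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bin_key: n / total for bin_key, n in counts.items()}


def histogram_similarity(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    bins = hist_a.keys() | hist_b.keys()
    return sum(min(hist_a.get(k, 0.0), hist_b.get(k, 0.0)) for k in bins)


# Two tiny "frames" given as flat pixel lists.
frame = [(250, 10, 10), (250, 10, 10), (10, 10, 250), (10, 10, 250)]
other = [(10, 250, 10)] * 4

same = histogram_similarity(color_histogram(frame), color_histogram(frame))
different = histogram_similarity(color_histogram(frame), color_histogram(other))
```

Under claim 12, a similarity value below the pre-determined level would select the first segmentation mode, as it would for `different` here with any positive level.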

18. A video encoding device including a processor programmed to:
- receive a video frame of a video sequence;
  
  apply one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode, wherein each of the first and second segmentation modes comprise modes of segmenting region of interest (ROI) objects from the video frame and wherein to apply the one or more segmentation mode decision factors to select the segmentation mode, the processor is configured to determine a computational complexity of the video frame by determining a number of ROI features within the video frame and select the first segmentation mode when the computational complexity is above a pre-determined level;
  
  segment an ROI object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  segment an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 19. The device of claim 18, wherein the processor determines a desired segmentation quality of the video frame from an end-user, and selects the first segmentation mode when the desired segmentation quality is above a pre-determined level.
  - 20. The device of claim 18, wherein the processor determines an amount of similarity between the video frame and the different video frame of the video sequence, and selects the first segmentation mode when the amount of similarity is below a pre-determined level.
  - 21. The device of claim 20, wherein the processor compares a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 22. The device of claim 18, wherein the processor determines an amount of motion activity between the video frame and the different video frame of the video sequence, and selects the first segmentation mode when the amount of motion activity is above a pre-determined level.
  - 23. The device of claim 22, wherein the processor compares a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 24. The device of claim 18, wherein the different video frame directly precedes the video frame of the video sequence, and wherein the processor determines the segmentation mode used to segment the different video frame of the video sequence, and selects the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 25. The device of claim 18, wherein the processor determines a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode prior to the video frame, and selects the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.
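
Claims 22 and 23 measure motion activity by comparing the location of the ROI object in two frames. One way to sketch this, assuming the ROI is represented by a bounding box `(x, y, width, height)` and that movement is the distance between box centers normalized by the frame diagonal, is shown below. The box representation, the normalization, and the names `roi_center` and `motion_activity` are assumptions rather than details taken from the claims.

```python
import math


def roi_center(box):
    """Center point of a bounding box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def motion_activity(box_current, box_previous, frame_size):
    """Center displacement between two ROI boxes as a fraction of the frame diagonal."""
    cx1, cy1 = roi_center(box_current)
    cx0, cy0 = roi_center(box_previous)
    frame_width, frame_height = frame_size
    return math.hypot(cx1 - cx0, cy1 - cy0) / math.hypot(frame_width, frame_height)


# The ROI shifts by 3 pixels horizontally and 4 vertically in a 30x40 frame.
activity = motion_activity((13, 14, 10, 10), (10, 10, 10, 10), (30, 40))
```

Here the centers are 5 pixels apart and the frame diagonal is 50 pixels, so `activity` is 0.1. Under claim 22, a value above the pre-determined level would select the first segmentation mode.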

26. A video coding device comprising:
- means for receiving a video frame of a video sequence;
  
  means for applying one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode, wherein each of the first and second segmentation modes comprise modes of segmenting region of interest (ROI) objects from the video frame and wherein said applying means comprises means for determining a computational complexity of the video frame by determining a number of ROI features within the video frame and means for selecting the first segmentation mode when the computational complexity is above a pre-determined level;
  
  means for segmenting an ROI object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  means for segmenting an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 27. The device of claim 26, wherein the different video frame is a previous video frame in the video sequence.
  - 28. The device of claim 26, wherein the means for applying the one or more segmentation mode decision factors comprises means for determining a desired segmentation quality of the video frame from an end-user, and means for selecting the first segmentation mode when the desired segmentation quality is above a pre-determined level.
  - 29. The device of claim 26, wherein the means for applying the one or more segmentation mode decision factors comprises means for determining an amount of similarity between the video frame and the different video frame of the video sequence, and means for selecting the first segmentation mode when the amount of similarity is below a pre-determined level.
  - 30. The device of claim 29, wherein the means for determining the amount of similarity comprises means for comparing a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 31. The device of claim 26, wherein the means for applying the one or more segmentation mode decision factors comprises means for determining an amount of motion activity between the video frame and the different video frame of the video sequence, and means for selecting the first segmentation mode when the amount of motion activity is above a pre-determined level.
  - 32. The device of claim 31, wherein the means for determining the amount of motion activity comprises means for comparing a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 33. The device of claim 26, wherein the different video frame directly precedes the video frame of the video sequence, wherein the means for applying the one or more segmentation mode decision factors comprises means for determining the segmentation mode used to segment the different video frame of the video sequence, and means for selecting the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 34. The device of claim 26, wherein the means for applying the one or more segmentation mode decision factors comprises means for determining a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode, and means for selecting the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.
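
The two temporal factors (claims 33 and 34) can be illustrated by tracking state across a sequence. The sketch below considers only those two factors and assumes the first frame is always segmented in the first mode; `schedule_modes` and `max_consecutive_second` are hypothetical names, and the level of 3 is chosen purely for illustration.

```python
def schedule_modes(num_frames, max_consecutive_second=3):
    """Return the mode ("first" or "second") chosen for each frame in order."""
    modes = []
    consecutive_second = 0  # second-mode frames since the last first-mode frame
    for index in range(num_frames):
        # Claim 34: switch back to the first mode once the run of second-mode
        # frames is above the pre-determined level (strictly greater).
        if index == 0 or consecutive_second > max_consecutive_second:
            modes.append("first")
            consecutive_second = 0
        else:
            # Claim 33: a frame that follows a first-mode frame uses the second mode.
            modes.append("second")
            consecutive_second += 1
    return modes


modes = schedule_modes(7)
```

With the assumed level of 3, `modes` is one first-mode frame, four second-mode frames, another first-mode frame, and then a second-mode frame, so the sequence is periodically refreshed without motion information.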

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: El-Maleh, Khaled Helmi; Wang, Haohong
Primary Examiner(s): Liew, Alex
Application Number: US11/349,659
Publication Number: US 20070183661A1
Time in Patent Office: 2,247 Days
Field of Search: 382/173-180
US Class Current: 382/173
CPC Class Codes:
- G06T 2207/10016: Video; Image sequence
- G06T 2207/20036: Morphological image processing
- G06T 2207/20132: Image cropping
- G06T 2207/30201: Face
- G06T 7/11: Region-based segmentation
- G06T 7/174: involving the use of two or...
- G06T 7/194: involving foreground-backgr...
- G06T 7/215: Motion-based segmentation
- G06V 10/25: Determination of region of ...
- G06V 40/162: using pixel segmentation or...
- G06V 40/165: using facial parts and geom...
- G06V 40/167: using comparisons between t...
