Multi-mode region-of-interest video object segmentation

US 8,605,945 B2
Filed: 04/02/2012
Issued: 12/10/2013
Est. Priority Date: 02/07/2006
Status: Expired due to Fees

Abstract

The disclosure is directed to techniques for automatic segmentation of a region-of-interest (ROI) video object from a video sequence. ROI object segmentation enables selected ROI or “foreground” objects of a video sequence that may be of interest to a viewer to be extracted from non-ROI or “background” areas of the video sequence. Examples of a ROI object are a human face or a head and shoulder area of a human body. The disclosed techniques include a hybrid technique that combines ROI feature detection, region segmentation, and background subtraction. In this way, the disclosed techniques may provide accurate foreground object generation and low-complexity extraction of the foreground object from the video sequence. A ROI object segmentation system may implement the techniques described herein. In addition, ROI object segmentation may be useful in a wide range of multimedia applications that utilize video sequences, such as video telephony applications and video surveillance applications.
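
As a rough illustration of the background-subtraction ingredient named in the abstract, the sketch below marks the pixels that changed between a frame and a reference frame. It is plain thresholded frame differencing, not the patented hybrid technique. The function name `foreground_mask`, the grayscale list-of-rows frame layout, and the threshold value are all assumptions made for the example.

```python
def foreground_mask(frame, reference, threshold=25):
    """Return a boolean mask of pixels that differ from the reference frame.

    Both frames are lists of rows of grayscale values (0..255) with the
    same dimensions. The threshold is an illustrative value only.
    """
    return [
        [abs(p - q) > threshold for p, q in zip(row, ref_row)]
        for row, ref_row in zip(frame, reference)
    ]


# Example: one pixel brightens sharply between the two frames.
previous = [[10, 10, 10],
            [10, 10, 10]]
current = [[10, 200, 10],
           [10, 12, 10]]
mask = foreground_mask(current, previous)
```

In a fuller pipeline the resulting mask would still need to be combined with ROI feature detection and region segmentation, which this sketch does not attempt.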

23 Claims

1. A method comprising:
- receiving a video frame of a video sequence;
  
  determining an amount of motion activity between the video frame and the different video frame of the video sequence;
  
  applying one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode comprising selecting the first segmentation mode when the amount of motion activity is above a pre-determined level;
  
  segmenting a region of interest (ROI) object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  segmenting an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 2. The method of claim 1, wherein the different video frame is a previous video frame in the video sequence.
  - 3. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining a desired segmentation quality of the video frame from an end-user, and selecting the first segmentation mode when the desired segmentation quality is above a pre-determined level.
  - 4. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining an amount of similarity between the video frame and the different video frame of the video sequence, and selecting the first segmentation mode when the amount of similarity is below a pre-determined level.
  - 5. The method of claim 4, wherein determining the amount of similarity comprises comparing a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 6. The method of claim 1, wherein determining the amount of motion activity comprises comparing a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 7. The method of claim 1, wherein the different video frame directly precedes the video frame of the video sequence, wherein applying the one or more segmentation mode decision factors comprises determining the segmentation mode used to segment the different video frame of the video sequence, and selecting the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 8. The method of claim 1, wherein applying the one or more segmentation mode decision factors comprises determining a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode, and selecting the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.
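
The decision factors recited in claims 1, 4, 7, and 8 can be read together as one per-frame mode selector. The following is a minimal sketch under stated assumptions, not the patented implementation. The names `SegMode`, `ModeState`, and `select_mode`, the threshold values, and the order in which the factors are checked are hypothetical; the claims only state that each factor selects the named mode when it crosses a pre-determined level. The sketch also assumes the first frame is segmented in the first mode because no earlier frame exists.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SegMode(Enum):
    INTRA = 1  # "first segmentation mode": no motion information used
    INTER = 2  # "second segmentation mode": uses motion vs. a different frame


@dataclass
class ModeState:
    prev_mode: Optional[SegMode] = None  # mode used for the preceding frame
    inter_run: int = 0                   # consecutive INTER frames so far


def select_mode(motion, similarity, state,
                motion_max=0.5, similarity_min=0.6, inter_run_max=30):
    """Pick a mode for the current frame; all thresholds are illustrative."""
    if state.prev_mode is None:
        mode = SegMode.INTRA   # assumption: no earlier frame to compare against
    elif motion > motion_max:
        mode = SegMode.INTRA   # claim 1: motion activity above a level
    elif similarity < similarity_min:
        mode = SegMode.INTRA   # claim 4: similarity below a level
    elif state.inter_run > inter_run_max:
        mode = SegMode.INTRA   # claim 8: too many consecutive INTER frames
    else:
        mode = SegMode.INTER   # covers claim 7: INTER after an INTRA frame
    state.inter_run = state.inter_run + 1 if mode is SegMode.INTER else 0
    state.prev_mode = mode
    return mode


state = ModeState()
first = select_mode(motion=0.0, similarity=1.0, state=state)
second = select_mode(motion=0.1, similarity=0.9, state=state)
```

Checking the factors in a fixed order is a design choice of the sketch. The claims present them as separate dependent limitations and do not rank them.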

9. A non-transitory computer-readable storage medium having stored thereon instructions that cause a programmable processor to:
- receive a video frame of a video sequence;
  
  determine an amount of motion activity between the video frame and the different video frame of the video sequence;
  
  apply one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode comprising selecting the first segmentation mode when the amount of motion activity is above a pre-determined level;
  
  segment a region of interest (ROI) object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  segment an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 10. The non-transitory computer-readable storage medium of claim 9, wherein the instructions cause the programmable processor to determine a computational complexity of the video frame, and select the first segmentation mode when the computational complexity is above a pre-determined level.
  - 11. The non-transitory computer-readable storage medium of claim 9, wherein the instructions cause the programmable processor to determine an amount of similarity between the video frame and the different video frame of the video sequence, and select the first segmentation mode when the amount of similarity is below a pre-determined level.
  - 12. The non-transitory computer-readable storage medium of claim 11, wherein the instructions cause the programmable processor to compare a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 13. The non-transitory computer-readable storage medium of claim 9, wherein the instructions cause the programmable processor to compare a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 14. The non-transitory computer-readable storage medium of claim 9, wherein the different video frame directly precedes the video frame of the video sequence, wherein the instructions cause the programmable processor to determine the segmentation mode used to segment the different video frame of the video sequence, and select the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 15. The non-transitory computer-readable storage medium of claim 9, wherein the instructions cause the programmable processor to determine a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode prior to the video frame, and select the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.
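
Claims 12 and 13 recite two measurements: a comparison of color histograms and a comparison of ROI object locations. The sketch below shows one way each could be computed. It is an illustration under assumptions, not the claimed method: histogram intersection is a swapped-in similarity measure (the claims only say the histograms are compared), and the function names, the RGB pixel-tuple input, the bin count, and the `(x, y, width, height)` box format are all hypothetical.

```python
import math
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over quantized (r, g, b) pixels with 0..255 values."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0 for disjoint ones."""
    return sum(min(p, h2.get(bin_, 0.0)) for bin_, p in h1.items())


def roi_movement(box_a, box_b):
    """Distance in pixels between the centers of two (x, y, width, height) boxes."""
    ax, ay = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx, by = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    return math.hypot(bx - ax, by - ay)


red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
half_and_half = red[:2] + blue[:2]
similarity = histogram_similarity(color_histogram(half_and_half),
                                  color_histogram(red))
movement = roi_movement((0, 0, 10, 10), (3, 4, 10, 10))
```

Either value could then be compared against a pre-determined level when selecting the segmentation mode.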

16. A video encoding device including a processor configured to:
- receive a video frame of a video sequence;
  
  determine an amount of motion activity between the video frame and the different video frame of the video sequence;
  
  apply one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode, wherein the processor is configured to select the first segmentation mode when the amount of similarity is below a pre-determined level;
  
  segment a region of interest (ROI) object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  segment an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.
  - 17. The device of claim 16, wherein the processor determines a computational complexity of the video frame, and selects the first segmentation mode when the computational complexity is above a pre-determined level.
  - 18. The device of claim 16, wherein the processor compares a first color histogram of the video frame with a second color histogram of the different video frame to determine an amount of similarity between the first and second color histograms.
  - 19. The device of claim 16, wherein the processor selects the first segmentation mode when the amount of motion activity is above a pre-determined level.
  - 20. The device of claim 19, wherein the processor compares a first location of the ROI object within the video frame with a second location of the ROI object within the different video frame to determine an amount of movement between the first and second locations.
  - 21. The device of claim 16, wherein the different video frame directly precedes the video frame of the video sequence, and wherein the processor determines the segmentation mode used to segment the different video frame of the video sequence, and selects the second segmentation mode when the different video frame was segmented in the first segmentation mode.
  - 22. The device of claim 16, wherein the processor determines a number of consecutive video frames of the video sequence that were segmented in the second segmentation mode prior to the video frame, and selects the first segmentation mode when the number of consecutive second segmentation mode video frames is above a pre-determined level.

23. A video coding device comprising:
- means for receiving a video frame of a video sequence;
  
  means for determining an amount of motion activity between the video frame and the different video frame of the video sequence;
  
  means for applying one or more segmentation mode decision factors to the video frame to select a segmentation mode from at least a first segmentation mode and a second segmentation mode comprising selecting the first segmentation mode when the amount of motion activity is above a pre-determined level;
  
  means for segmenting a region of interest (ROI) object from the video frame without reference to motion information for the video frame when the first segmentation mode is selected; and
  
  means for segmenting an ROI object from the video frame based on motion information for the video frame and a different video frame of the video sequence when the second segmentation mode is selected.

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: El-Maleh, Khaled Helmi; Wang, Haohong
Primary Examiner(s): Liew, Alex
Application Number: US13/437,736
Publication Number: US 20120189168A1
Time in Patent Office: 617 Days
Field of Search: 382/173-180
US Class Current: 382/103

CPC Class Codes

G06T 2207/10016   Video; Image sequence

G06T 2207/20036   Morphological image processing

G06T 2207/20132   Image cropping

G06T 2207/30201   Face

G06T 7/11   Region-based segmentation

G06T 7/174   involving the use of two or...

G06T 7/194   involving foreground-backgr...

G06T 7/215   Motion-based segmentation

G06V 10/25   Determination of region of ...

G06V 40/162   using pixel segmentation or...

G06V 40/165   using facial parts and geom...

G06V 40/167   using comparisons between t...
