Visual attention model
Abstract
An improved visual attention model uses a robust adaptive segmentation algorithm to divide a current frame of a video sequence into a plurality of regions based upon both color and luminance, with each region being processed in parallel by a plurality of spatial feature algorithms including color and skin to produce respective spatial importance maps. The current frame and a previous frame are also processed to produce motion vectors for each block of the current frame, the motion vectors being compensated for camera motion, and the compensated motion vectors being converted to produce a temporal importance map. The spatial and temporal importance maps are combined using weighting based upon eye movement studies.
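The combination described in the abstract can be sketched in outline. In this pure-Python fragment the function name, the lists-of-lists map representation, and the blending constant `k` are illustrative assumptions, not values from the patent:

```python
def combine_maps(spatial_maps, weights, temporal_map, k=0.5):
    # Pixel-wise weighted sum of the per-feature spatial importance maps,
    # followed by a linear blend with the temporal importance map.
    # k is a hypothetical constant standing in for the value the patent
    # derives from eye movement studies.
    rows, cols = len(temporal_map), len(temporal_map[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            spatial = sum(w * m[r][c] for w, m in zip(weights, spatial_maps))
            out[r][c] = k * spatial + (1.0 - k) * temporal_map[r][c]
    return out
```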
20 Claims
1. An improved visual attention model of the type that segments a frame of a video sequence into regions for processing by a plurality of spatial features to produce a corresponding plurality of spatial importance maps, that compares the frame with a previous frame for processing to produce a temporal importance map, and that combines the spatial and temporal importance maps to produce a total importance map for the frame, wherein the improvement comprises the steps of:
adaptively segmenting the frame into the regions using color along with luminance;
processing the regions with a plurality of spatial features to produce the plurality of spatial importance maps;
processing the frame with the previous frame to produce the temporal importance map that is compensated for camera motion; and
combining the spatial and temporal importance maps based upon a weighting function derived from eye movement studies to produce the total importance map for the frame.

2. The visual attention model as recited in claim 1 wherein the adaptively segmenting step comprises the steps of:

splitting the frame hierarchically into the regions based upon luminance variance, color variance and size of interim regions; and
merging interim regions to form the regions when the mean luminance and color variances within the interim regions are less than respective adaptive thresholds or the change in luminance and change in color within the interim regions are less than respective thresholds.
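The split phase of the segmentation above can be illustrated with a minimal quadtree-style sketch. It assumes a single fixed grayscale variance threshold in place of the patent's adaptive luminance-and-color criteria, and omits the merge phase:

```python
def split_regions(img, r0, c0, h, w, var_thresh, min_size=2):
    # Recursively split a block while its graylevel variance exceeds
    # var_thresh (the patent uses adaptive thresholds on both luminance
    # and color variance; one fixed threshold is assumed here).
    pixels = [img[r][c] for r in range(r0, r0 + h) for c in range(c0, c0 + w)]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    if var <= var_thresh or h <= min_size or w <= min_size:
        return [(r0, c0, h, w)]          # homogeneous enough: one region
    h2, w2 = h // 2, w // 2              # otherwise split into quadrants
    return (split_regions(img, r0, c0, h2, w2, var_thresh, min_size)
            + split_regions(img, r0, c0 + w2, h2, w - w2, var_thresh, min_size)
            + split_regions(img, r0 + h2, c0, h - h2, w2, var_thresh, min_size)
            + split_regions(img, r0 + h2, c0 + w2, h - h2, w - w2,
                            var_thresh, min_size))
```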
3. The visual attention model as recited in claim 2 wherein the adaptive segmenting step further comprises the step of clipping the borders of the frame prior to the splitting step.
4. The visual attention model as recited in claim 1 wherein the spatial features comprise at least two selected from the set consisting of size, background, location, contrast, shape, color and skin.
5. The visual attention model as recited in claim 4 wherein the processing step for the contrast spatial feature is based on absolute values for the mean graylevels of a region being processed and its neighboring regions that share a 4-connected border, is limited to a constant multiplied by the number of 4-connected neighboring pixels, and takes into account Weber and deVries-Rose effects.
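The contrast step of claim 5 could be sketched as follows; the cap from the claim is kept, while the knee luminance and the exact Weber / deVries-Rose scaling are illustrative assumptions, not the patented formulation:

```python
import math

def contrast_importance(region_mean, neighbor_means, n_border_pixels, cmax=1.0):
    # Sum of absolute graylevel differences against 4-connected neighbor
    # regions, limited to cmax times the number of 4-connected border
    # pixels (per the claim). The result is scaled by a luminance-
    # dependent term: Weber-like (proportional to mean luminance) above
    # a hypothetical knee, deVries-Rose-like (proportional to its square
    # root) below it.
    raw = sum(abs(region_mean - m) for m in neighbor_means)
    raw = min(raw, cmax * n_border_pixels)
    knee = 50.0  # assumed luminance where Weber behavior takes over
    denom = region_mean if region_mean >= knee else math.sqrt(knee * region_mean)
    return raw / max(denom, 1.0)
```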
6. The visual attention model as recited in claim 4 wherein the processing step for the color spatial feature calculates the color contrast of a region being processed with respect to its background.
7. The visual attention model as recited in claim 4 wherein the processing step for the skin spatial feature uses a narrow range of color values and respective thresholds for min and max values for each element of the color values.
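The skin test of claim 7 amounts to per-channel min/max thresholds on a narrow color range. The CbCr ranges below are common rule-of-thumb values for skin chrominance, not the values specified by the patent:

```python
def is_skin(cb, cr, cb_range=(77, 127), cr_range=(133, 173)):
    # Min/max thresholds for each element of the color value; the
    # default CbCr ranges are assumed, widely used approximations.
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])
```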
8. The visual attention model as recited in claim 4 wherein the processing step for the size spatial feature comprises the step of implementing a four threshold algorithm so that the importance of regions that are too small or too large is minimized.
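A four-threshold size feature can be realized as a piecewise-linear ramp; the threshold values below are hypothetical areas in pixels, chosen only to illustrate the shape:

```python
def size_importance(area, t1=50, t2=500, t3=5000, t4=20000):
    # Importance rises between t1 and t2, holds at 1 between t2 and t3,
    # and falls between t3 and t4, so very small and very large regions
    # receive minimal importance. All four thresholds are assumed values.
    if area <= t1 or area >= t4:
        return 0.0
    if area < t2:
        return (area - t1) / (t2 - t1)
    if area <= t3:
        return 1.0
    return (t4 - area) / (t4 - t3)
```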
9. The visual attention model as recited in claim 4 wherein the processing step for the background spatial feature comprises the step of using a minimum of the number of pixels in a region that shares a four-connected border with another region or of the number of pixels in a region that also borders a truncated edge of the frame.
10. The visual attention model as recited in claim 4 wherein the processing step for the location spatial feature comprises the step of considering various zones about a central area of the frame, with weights per zone decreasing with distance from the central area.
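The location feature can be sketched with concentric rectangular zones about the frame center; the number of zones and the per-zone weights here are illustrative assumptions:

```python
def location_weight(r, c, rows, cols):
    # Assign a hypothetical weight per zone, decreasing from the
    # central area outwards toward the frame border.
    zone_weights = [1.0, 0.75, 0.5, 0.25]  # center outwards (assumed)
    # Normalized distance from the center: 0 at center, near 1 at border.
    dr = abs(r - (rows - 1) / 2) / (rows / 2)
    dc = abs(c - (cols - 1) / 2) / (cols / 2)
    zone = min(int(max(dr, dc) * len(zone_weights)), len(zone_weights) - 1)
    return zone_weights[zone]
```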
11. The visual attention model as recited in claim 4 wherein the processing step for the shape spatial feature comprises the step of reducing shape importance in regions that have many neighboring regions.
12. The visual attention model as recited in claim 1 wherein the combining step comprises the steps of:
weighting each spatial importance map according to weights determined empirically from eye movement studies to produce a resultant spatial importance map;
smoothing the resultant spatial importance map from frame to frame using a temporal smoothing algorithm to reduce noise and improve temporal consistency to produce a spatial importance map; and
combining the spatial importance map with the temporal importance map to produce the total importance map.
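The frame-to-frame smoothing step of claim 12 can be sketched as a recursive (IIR) filter; the smoothing factor `alpha` is an assumed value, not one the patent specifies:

```python
def smooth_temporal(prev_map, cur_map, alpha=0.7):
    # Blend the current resultant spatial importance map with the
    # previous frame's smoothed map to reduce noise and improve
    # temporal consistency. alpha is a hypothetical smoothing factor.
    if prev_map is None:                     # first frame: nothing to blend
        return [row[:] for row in cur_map]
    return [[alpha * c + (1.0 - alpha) * p
             for c, p in zip(crow, prow)]
            for crow, prow in zip(cur_map, prev_map)]
```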
13. The visual attention model as recited in claim 12 wherein the step of combining the spatial importance map with the temporal importance map comprises the step of linear weighting the spatial importance and temporal importance maps, the linear weighting step using a constant determined from the eye movement studies.
14. The visual attention model as recited in claim 1 wherein the temporal importance map processing step comprises the steps of:
calculating motion vectors for each block of the current frame using a hierarchical block matching algorithm;
estimating from the motion vectors parameters of camera motion;
compensating the motion vectors based upon the parameters of camera motion; and
converting the compensated motion vectors into the temporal importance map.
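The compensation step above can be illustrated with a deliberately simple stand-in for the patent's camera-motion parameter estimation: treat the camera motion as a pure pan, estimate it as the median block motion vector, and subtract it, leaving object-relative motion:

```python
def compensate_camera_motion(motion_vectors):
    # motion_vectors: list of (vx, vy) block motion vectors from a
    # block matching step. Estimating pan as the per-component median
    # is an assumed simplification; the patent estimates fuller camera
    # motion parameters (e.g. pan and zoom).
    xs = sorted(v[0] for v in motion_vectors)
    ys = sorted(v[1] for v in motion_vectors)
    mid = len(motion_vectors) // 2
    pan = (xs[mid], ys[mid])
    return [(vx - pan[0], vy - pan[1]) for vx, vy in motion_vectors]
```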
15. The visual attention model as recited in claim 14 wherein the temporal importance map processing step further comprises the step of determining a flatness for each block so that motion vectors in texturally flat areas are set to zero in the compensated motion vectors prior to the converting step.
16. The visual attention model as recited in claim 14 further comprising the step of calculating an adaptive threshold for assigning importance to a particular motion of a region over a temporal window.
17. The visual attention model as recited in claim 16 wherein the adaptive threshold calculating step includes the steps of:
assigning a lower threshold value as the adaptive threshold when there are few and slow moving regions in the frame; and
assigning a higher threshold value as the adaptive threshold when there are many and fast moving regions in the frame.
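The two assignment steps above amount to switching between a low and a high threshold based on how many regions are moving and how fast. In this sketch the lo/hi values, the "many regions" cutoff, and the speed cutoff are all illustrative assumptions:

```python
def adaptive_motion_threshold(region_speeds, lo=0.5, hi=4.0):
    # region_speeds: per-region motion magnitudes for the current frame.
    # Few, slow-moving regions -> lower threshold (small motions matter);
    # many, fast-moving regions -> higher threshold. Cutoffs are assumed.
    moving = [s for s in region_speeds if s > 0.0]
    busy = len(moving) > len(region_speeds) / 2      # "many" regions moving
    fast = bool(moving) and sum(moving) / len(moving) > 2.0
    return hi if (busy and fast) else lo
```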
18. The visual attention model as recited in claim 14 further comprising the step of assigning further importance in the total importance map to a central area of the frame when the camera motion parameters indicate camera motion selected from the group consisting of zoom and pan.
19. The visual attention model as recited in claim 14 further comprising the step of assigning further importance in the total importance map to a central area of the frame when there is very high motion in the video sequence.
20. The visual attention model as recited in claim 14 further comprising the step of assigning further importance in the total importance map to skin areas that are undergoing motion.
Specification