System and method for segmenting image regions from a scene likely to represent particular objects in the scene

US 6,141,433 A
Filed: 12/24/1997
Issued: 10/31/2000
Est. Priority Date: 06/19/1997
Status: Expired due to Term

Abstract

A system and method for extracting image information from a video frame for regions of the video frame that likely are objects of interest in a scene. An initial region set is generated by comparing luminance image information and color image information of a video frame with luminance image information and color image information of a background image for the scene. A high confidence region set is generated comprising regions from the initial region set, selected based upon edge information of the regions and edge information in the background image. A final region set is generated by combining one or more regions in the high confidence region set if such combinations satisfy predetermined criteria, including size, region proximity and morphological region dilation.

133 Citations

38 Claims

1. A method for extracting image information from a video frame for regions of the video frame that likely are objects of interest in a scene, comprising steps of:
- (a) generating a first set of regions based upon differences between image information for the video frame and image information for a background image of the scene;
  
  (b) generating a second set of regions from the first set of regions based upon edge information for regions in the first set and edge information for the background image, wherein the step of generating a second set of regions comprises steps of:
  
  (b)(1) extracting edge information from each region in the first set of regions;
  
  (b)(2) extracting edge information for the background image;
  
  (b)(3) comparing edge information for each region of the first set of regions with edge information for the background image;
  
  (b)(4) generating a confidence value for each region in the first set of regions depending on whether pixels of a region and corresponding pixels in the background image represent edge information, wherein the step of generating a confidence value for each region in the first set of regions comprises steps of:
  
  examining each pixel of a region;
  
  if the pixel in the region represents an edge and a corresponding pixel in the background image represents an edge, then reducing the confidence value for the region;
  
  if the pixel in the region does not represent an edge and a corresponding pixel in the background image represents an edge, then increasing the confidence value for the region; and
  
  if the pixel in the region represents an edge and a corresponding pixel in the background image does not represent an edge, then increasing the confidence value for the region;
  
  (b)(5) retaining regions from the first set of regions which have a confidence value greater than a predetermined confidence threshold; and
  
  (c) generating a third set of regions from the second set of regions by combining regions in the second set with each other if resulting combined regions satisfy predetermined criteria.
  - 2. The method of claim 1, wherein the step of generating the first set of regions comprises steps of:
    - (a)(1) generating a luminance difference image and a color difference image based upon image information for the video frame and image information for the background image;
      
      (a)(2) forming a composite image from the luminance difference image and the color difference image;
      
      (a)(3) comparing the composite image with a predetermined image difference threshold to generate a binary interest image;
      
      (a)(4) generating a gray interest image by masking the luminance image information for the video frame with the binary interest image; and
      
      (a)(5) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 3. The method of claim 2, and further comprising the step of:
    - (a)(6) adjusting values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the composite image is formed based on the adjusted luminance difference image and the color difference image.
  - 4. The method of claim 3, wherein the step of adjusting values of pixels in the luminance difference image comprises multiplying a value of each pixel in the luminance difference image by a factor which is proportional to a luminance intensity of the corresponding pixel in the background image.
  - 5. The method of claim 2, wherein the luminance image information is represented by a Y component of YUV image information for the video frame and the color image information is represented by a U component and a V component of YUV image information for the video frame, and wherein the step of generating a luminance difference image and a color difference image comprises:
    - (a)(1)(i) generating a Y difference image representing a difference between a Y component of the image information for the video frame and a Y component of the image information for the background image;
      
      (a)(1)(ii) generating a U difference image representing a difference between a U component of the image information for the video frame and a U component of the image information for the background image; and
      
      (a)(1)(iii) generating a V difference image representing a difference between a V component of the image information for the video frame and a V component of the image information for the background image.
  - 6. The method of claim 5, wherein the step of forming the composite image comprises steps of:
    - (a)(3)(i) forming a combined UV difference image from the U difference image and the V difference image; and
      
      (a)(3)(ii) combining the combined UV difference image with the Y difference image.
  - 7. The method of claim 6, and further comprising the step of:
    - (a)(3)(iii) weighting the Y difference image by a predetermined emphasis factor to emphasize either color differences or intensity differences in the composite image.
  - 8. The method of claim 5, wherein the step of generating a gray interest image comprises masking the Y component of the image information for the video frame with the binary interest image.
  - 9. The method of claim 1, wherein the step of generating the first set of regions comprises steps of:
    - (a)(1) generating a luminance difference image based upon image information for the video frame and image information for the background image;
      
      (a)(2) comparing the luminance difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (a)(3) generating a gray interest image by masking the luminance image information of the video frame with the binary interest image; and
      
      (a)(4) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 10. The method of claim 9, and further comprising the step of:
    - (a)(5) adjusting values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the adjusted luminance difference image is compared with the predetermined image difference threshold to generate the binary interest image.
  - 11. The method of claim 1, wherein the step of generating the first set of regions comprises steps of:
    - (a)(1) generating a color difference image based upon image information for the video frame and image information for the background image;
      
      (a)(2) comparing the color difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (a)(3) generating a gray interest image by masking color image information of the video frame with the binary interest image; and
      
      (a)(4) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 12. The method of claim 1, wherein the step of generating the third set of regions comprises steps of:
    - (c)(1) determining which regions in the second set can be combined with other regions to form a region pair if the region pair satisfies predetermined criteria including at least one of, a predetermined size limit, sufficient proximity of the regions in the region pair, and overlap of morphological dilated versions of the regions in the region pair; and
      
      (c)(2) comparing region pairs with each other and merging together region pairs to form groups of regions from region pairs which have a common region.
  - 30. The computer-readable medium of claim 9, wherein the instructions for adjusting the pixels in the luminance difference image comprise instructions for multiplying a value of each pixel in the luminance difference image by a factor which is proportional to a luminance intensity of the corresponding pixel in the background image.
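
The edge-based confidence test in steps (b)(4) and (b)(5) of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the boolean edge-map representation, the starting value of zero, and the unit `step` are all assumptions made for illustration, and the case where neither pixel is an edge (which the claim does not address) is simply left unchanged.

```python
def region_confidence(region_pixels, frame_edges, background_edges, step=1):
    """Score one region by comparing edge maps pixel by pixel.

    region_pixels: iterable of (row, col) coordinates belonging to the region.
    frame_edges, background_edges: 2D lists of booleans (True = edge pixel).
    step: hypothetical adjustment size; the claims do not specify one.
    """
    confidence = 0  # assumed starting value
    for r, c in region_pixels:
        in_frame = frame_edges[r][c]
        in_background = background_edges[r][c]
        if in_frame and in_background:
            # Edge in both images: the region probably shows background.
            confidence -= step
        elif in_frame != in_background:
            # Edge present in only one image: a new edge appeared, or a
            # background edge is occluded, so the confidence rises.
            confidence += step
        # Neither pixel is an edge: left unchanged (assumption).
    return confidence


def retain_regions(regions, frame_edges, background_edges, threshold):
    """Keep regions whose confidence exceeds the threshold, as in step (b)(5)."""
    return [
        region
        for region in regions
        if region_confidence(region, frame_edges, background_edges) > threshold
    ]
```

In practice the two edge maps would come from an edge detector run on the video frame and on the background image; the choice of detector is outside this sketch.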

13. A system for extracting image information from a video frame for regions of the video frame that likely are objects of interest in a scene, comprising:
- (a) a video camera positioned to monitor the scene and generating video signals representing activity within the scene;
  
  (b) a frame grabber coupled to the video camera to generate a stream of video frames from the video signal, each video frame comprising image information of the scene at an instant of time;
  
  (c) a processor coupled to the frame grabber, the processor being programmed to:
  
  (c)(1) generate a first set of regions based upon differences between image information for the video frame and image information for a background image of the scene;
  
  (c)(2) generate a second set of regions from the first set of regions based upon edge information for regions in the first set and edge information for the background image by:
  
  (c)(2)(i) extracting edge information from each region in the first set of regions;
  
  (c)(2)(ii) extracting edge information for the background image;
  
  (c)(2)(iii) comparing edge information for each region of the first set of regions with edge information for the background image;
  
  (c)(2)(iv) generating a confidence value for each region in the first set of regions depending on whether pixels of a region and corresponding pixels in the background image represent edge information, wherein the processor is programmed to generate a confidence value for each region in the first set of regions by:
  
  examining each pixel of a region;
  
  if the pixel in the region represents an edge and a corresponding pixel in the background image represents an edge, then reducing the confidence value for the region;
  
  if the pixel in the region does not represent an edge and a corresponding pixel in the background image represents an edge, then increasing the confidence value for the region; and
  
  if the pixel in the region represents an edge and a corresponding pixel in the background image does not represent an edge, then increasing the confidence value for the region; and
  
  (c)(2)(v) retaining regions from the first set of regions which have a confidence value greater than a predetermined confidence threshold; and
  
  (c)(3) generate a third set of regions from the second set of regions by combining regions in the second set with each other if resulting combined regions satisfy predetermined criteria.
  - 14. The system of claim 13, wherein the processor is programmed to generate the first set of regions by:
    - (c)(1)(i) generating a luminance difference image and a color difference image based upon image information for the video frame and image information for the background image;
      
      (c)(1)(ii) forming a composite image from the luminance difference image and the color difference image;
      
      (c)(1)(iii) comparing the composite image with a predetermined image difference threshold to generate a binary interest image;
      
      (c)(1)(iv) generating a gray interest image by masking the luminance image information for the video frame with the binary interest image; and
      
      (c)(1)(v) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 15. The system of claim 14, wherein the processor is further programmed to perform the step of:
    - (c)(1)(ii) adjusting values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the composite image is formed based on the adjusted luminance difference image and the color difference image.
  - 16. The system of claim 15, wherein the processor is further programmed to adjust values of pixels in the luminance difference image by multiplying a value of each pixel in the luminance difference image by a factor which is proportional to a luminance intensity of the corresponding pixel in the background image.
  - 17. The system of claim 14, wherein the video camera generates a color video signal comprising luminance image information and color image information, the luminance image information represented by a Y component of YUV image information and the color image information comprises a U component and a V component of the YUV image information, wherein the processor is programmed to generate the luminance difference image and the color difference image by:
    - (c)(1)(i)(A) generating a Y difference image representing a difference between a Y component of the image information for the video frame and a Y component of the image information for the background image;
      
      (c)(1)(i)(B) generating a U difference image representing a difference between a U component of the image information for the video frame and a U component of the image information for the background image; and
      
      (c)(1)(i)(C) generating a V difference image representing a difference between a V component of the image information for the video frame and a V component of the image information for the background image.
  - 18. The system of claim 17, wherein the processor is programmed to form the composite image by:
    - (c)(1)(iii)(A) forming a combined UV difference image from the U difference image and the V difference image; and
      
      (c)(1)(iii)(B) combining the combined UV difference image with the Y difference image.
  - 19. The system of claim 18, wherein the processor is further programmed to form the composite image by:
    - (c)(1)(iii)(C) weighting the Y difference image by a predetermined emphasis factor to emphasize either color differences or intensity differences in the composite image.
  - 20. The system of claim 17, wherein the processor is programmed to generate a gray interest image by masking the Y component of the image information for the video frame with the binary interest image.
  - 21. The system of claim 13, wherein the processor is programmed to generate the first set of regions by:
    - (c)(1)(i) generating a luminance difference image based upon image information for the video frame and image information for the background image;
      
      (c)(1)(ii) comparing the luminance difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (c)(1)(iii) generating a gray interest image by masking the luminance image information of the video frame with the binary interest image; and
      
      (c)(1)(iv) extracting from the gray interest image those regions that are connected and have similar gray levels to form the initial region set.
  - 22. The system of claim 21, wherein the processor is further programmed to adjust the luminance difference image by:
    - (c)(1)(v) adjusting values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the adjusted luminance difference image is compared with the predetermined image difference threshold to generate the binary interest image.
  - 23. The system of claim 13, wherein the processor is programmed to generate the first set of regions by:
    - (c)(1)(i) generating a luminance difference image based upon image information for the video frame and image information for the background image;
      
      (c)(1)(ii) comparing the luminance difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (c)(1)(iii) generating a gray interest image by masking the luminance image information of the video frame with the binary interest image; and
      
      (c)(1)(iv) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 24. The system of claim 23, wherein the processor is further programmed to:
    - (c)(1)(v) adjust values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the adjusted luminance difference image is compared with the predetermined image difference threshold to generate the binary interest image.
  - 25. The system of claim 13, wherein the processor is programmed to generate the first set of regions by:
    - (c)(1)(i) generating a color difference image based upon image information for the video frame and image information for the background image;
      
      (c)(1)(ii) comparing the color difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (c)(1)(iii) generating a gray interest image by masking the color image information of the video frame with the binary interest image; and
      
      (c)(1)(iv) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 26. The system of claim 13, wherein the processor is programmed to generate the third set of regions by:
    - (c)(3)(i) determining which regions in the second set can be combined with other regions to form a region pair if the region pair satisfies predetermined criteria including at least one of, a predetermined size limit, sufficient proximity of the regions, and overlap of morphological dilated versions of the regions; and
      
      (c)(3)(ii) comparing region pairs with each other and merging together region pairs to form groups of regions from region pairs which have a common region.
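
The two-stage combination recited in claims 12 and 26 (first find qualifying region pairs, then merge pairs that share a region) might be sketched as follows. This is only an illustration under stated assumptions: regions are reduced to bounding boxes `(x0, y0, x1, y1)`, the pairing test requires both a proximity limit and a combined size limit (the claims only require at least one of the listed criteria), morphological dilation is not modeled, and all function and parameter names are hypothetical.

```python
from itertools import combinations


def box_gap(a, b):
    """Gap in pixels between two boxes; 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def combined_size(a, b):
    """Width and height of the box enclosing both a and b."""
    width = max(a[2], b[2]) - min(a[0], b[0])
    height = max(a[3], b[3]) - min(a[1], b[1])
    return width, height


def find_pairs(boxes, max_gap, max_width, max_height):
    """Stage 1: list index pairs that meet the (assumed) pairing criteria."""
    pairs = []
    for i, j in combinations(range(len(boxes)), 2):
        width, height = combined_size(boxes[i], boxes[j])
        close_enough = box_gap(boxes[i], boxes[j]) <= max_gap
        if close_enough and width <= max_width and height <= max_height:
            pairs.append((i, j))
    return pairs


def merge_pairs(count, pairs):
    """Stage 2: group regions linked through pairs with a common region."""
    parent = list(range(count))

    def find(x):
        # Follow parent links to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[max(root_i, root_j)] = min(root_i, root_j)

    groups = {}
    for index in range(count):
        groups.setdefault(find(index), []).append(index)
    return sorted(groups.values())
```

A union-find structure is used here because "pairs which have a common region" is a transitive relation: if regions 0 and 1 pair, and 1 and 2 pair, all three end up in one group even though 0 and 2 were never paired directly.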

27. A computer-readable medium storing executable instructions which cause a computer to extract image information from a video frame for regions of the video frame that likely are objects of interest in the scene, by:
- (a) generating a first set of regions based upon differences between image information for the video frame and image information for a background image of the scene;
  
  (b) generating a second set of regions from the first set of regions based upon edge information for regions in the first set and edge information for the background image and generating a second set of regions, wherein the step of generating the second set of regions comprises steps of:
  
  (b)(1) extracting edge information from each region in the first set of regions;
  
  (b)(2) extracting edge information for the background image;
  
  (b)(3) comparing edge information for each region of the first set of regions with edge information for the background image;
  
  (b)(4) generating a confidence value for each region in the first set of regions depending on whether pixels of a region and corresponding pixels in the background image represent edge information, wherein the instructions for generating a confidence value for each region in the first set comprise instructions for:
  
  examining each pixel of a region;
  
  if the pixel in the region represents an edge and a corresponding pixel in the background image represents an edge, then reducing the confidence value for the region;
  
  if the pixel in the region does not represent an edge and a corresponding pixel in the background image represents an edge, then increasing the confidence value for the region; and
  
  if the pixel in the region represents an edge and a corresponding pixel in the background image does not represent an edge, then increasing the confidence value for the region;
  
  (b)(5) retaining regions from the first set of regions which have a confidence value greater than a predetermined confidence threshold; and
  
  (c) generating a third set of regions from the second set of regions by combining regions in the second set with each other if resulting combined regions satisfy predetermined criteria.
  - 28. The computer-readable medium of claim 27, wherein the executable instructions for generating the first set of regions comprise instructions for:
    - (a)(1) generating a luminance difference image and a color difference image based upon image information for the video frame and image information for the background image;
      
      (a)(2) forming a composite image from the luminance difference image and the color difference image;
      
      (a)(3) comparing the composite image with a predetermined image difference threshold to generate a binary interest image;
      
      (a)(4) generating a gray interest image by masking the luminance image information for the video frame with the binary interest image; and
      
      (a)(5) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 29. The computer-readable medium of claim 28, and further comprising instructions for:
    - (a)(6) adjusting values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the composite image is formed based on the adjusted luminance difference image and the color difference image.
  - 31. The computer-readable medium of claim 28, wherein the luminance image information comprises a Y component of YUV image information for the video frame and the color image information is represented by a U component and a V component of YUV image information for the video frame, and wherein the instructions for generating a luminance difference image and a color difference image comprise instructions for:
    - (a)(1)(i) generating a Y difference image representing a difference between a Y component of the image information for the video frame and a Y component of the image information for the background image;
      
      (a)(1)(ii) generating a U difference image representing a difference between a U component of the image information for the video frame and a U component of the image information for the background image; and
      
      (a)(1)(iii) generating a V difference image representing a difference between a V component of the image information for the video frame and a V component of the image information for the background image.
  - 32. The computer-readable medium of claim 31, wherein the instructions for forming a composite image comprise instructions for:
    - (a)(3)(i) forming a combined UV difference image from the U difference image and the V difference image; and
      
      (a)(3)(ii) combining the combined UV difference image with the Y difference image.
  - 33. The computer-readable medium of claim 32, wherein the instructions for forming the composite image further comprise instructions for:
    - (a)(3)(iii) weighting the Y difference image by a predetermined emphasis factor to emphasize either color differences or intensity differences in the composite image.
  - 34. The computer-readable medium of claim 32, wherein the instructions for generating a gray interest image comprise masking the Y component of the image information for the video frame with the binary interest image.
  - 35. The computer-readable medium of claim 27, wherein the instructions for generating the first set of regions comprise instructions for:
    - (a)(1) generating a luminance difference image based upon image information for the video frame and image information for the background image;
      
      (a)(2) comparing the luminance difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (a)(3) generating a gray interest image by masking the luminance image information of the video frame with the binary interest image; and
      
      (a)(4) extracting from the gray interest image those regions that are connected and have similar gray levels to form the initial region set.
  - 36. The computer-readable medium of claim 35, and further comprising instructions for:
    - adjusting values of pixels in the luminance difference image based upon values of corresponding pixels in the background image to generate an adjusted luminance difference image;
      
      wherein the adjusted luminance difference image is compared with the predetermined image difference threshold to generate the binary interest image.
  - 37. The computer-readable medium of claim 27, wherein the instructions for generating the first set of regions comprise instructions for:
    - (a)(1) generating a color difference image based upon image information for the video frame and image information for the background image;
      
      (a)(2) comparing the color difference image with a predetermined image difference threshold to generate a binary interest image;
      
      (a)(3) generating a gray interest image by masking color image information of the video frame with the binary interest image; and
      
      (a)(4) extracting from the gray interest image those regions that are connected and have similar gray levels.
  - 38. The computer-readable medium of claim 27, wherein the instructions for generating the third set of regions comprise instructions for:
    - (c)(1) determining which regions in the second set can be combined with other regions to form a region pair if the region pair satisfies predetermined criteria including at least one of, a predetermined size limit, sufficient proximity of the regions, and overlap of morphological dilated versions of the regions; and
      
      (c)(2) comparing region pairs with each other and merging together region pairs to form groups of regions from region pairs which have a common region.
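
The first-stage differencing recited in claims 28 and 31 to 34 (Y, U and V difference images, a combined UV difference, a weighted Y difference, thresholding into a binary interest image, then masking the Y component) could look roughly like the sketch below. The claims do not say how the differences are computed or combined, so the absolute differences and the weighted sum used here are assumptions, as are the function names and the default threshold value.

```python
def abs_diff(a, b):
    """Per-pixel absolute difference of two equally sized 2D lists."""
    return [[abs(p - q) for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def binary_interest(frame, background, y_weight=1.0, threshold=30):
    """Build a binary interest image from YUV frame and background images.

    frame, background: dicts with "Y", "U" and "V" keys, each a 2D list.
    y_weight: emphasis factor applied to the Y difference (hypothetical name).
    threshold: image difference threshold (hypothetical default).
    """
    diff_y = abs_diff(frame["Y"], background["Y"])
    diff_u = abs_diff(frame["U"], background["U"])
    diff_v = abs_diff(frame["V"], background["V"])
    mask = []
    for row_y, row_u, row_v in zip(diff_y, diff_u, diff_v):
        # Combined UV difference plus weighted Y difference (assumed sum).
        mask.append([y_weight * y + (u + v) > threshold
                     for y, u, v in zip(row_y, row_u, row_v)])
    return mask


def gray_interest(frame_y, mask):
    """Mask the frame's Y component with the binary interest image."""
    return [[y if keep else 0 for y, keep in zip(row_y, row_m)]
            for row_y, row_m in zip(frame_y, mask)]
```

Lowering `y_weight` makes the composite depend more on color change than on intensity change, which is one plausible reading of the "emphasis factor" in claims 7, 19 and 33.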

Current Assignee: FLIR Commercial Systems Incorporated (Teledyne Technologies Incorporated)
Original Assignee: NCR Corporation
Inventors: Moed, Michael C.; Crabtree, Ralph N.
Primary Examiner(s): Au, Amelia
Assistant Examiner(s): Dastouri, Mehrdad
Application Number: US08/998,211
Time in Patent Office: 1,042 Days
Field of Search: 382/103, 382/164, 382/165, 382/236, 382/257, 348/143, 348/152, 348/155, 348/169, 348/700, 348/701, 348/590, 348/591, 348/592
US Class Current: 382/103
CPC Class Codes:
G01S 3/7865   using correlation of the li...
G06T 2207/10016   Video; Image sequence
G06T 7/12   Edge-based segmentation
G06T 7/155   involving morphological ope...
G06T 7/174   involving the use of two or...
G06T 7/246   using feature-based methods...
G06V 40/161   Detection; Localisation; No...
G07G 1/0054   with control of supplementa...
G07G 3/006   False operation
G08B 13/19602   Image analysis to detect mo...
G08B 13/19604   involving reference image o...
G08B 13/19608   Tracking movement of a targ...
H04N 5/142   Edging; Contouring