System and method for video content analysis using depth sensing

US 9,247,211 B2
Filed: 01/17/2013
Issued: 01/26/2016
Est. Priority Date: 01/17/2012
Status: Active Grant

Abstract

A method and system for performing video content analysis based on two-dimensional image data and depth data are disclosed. Video content analysis may be performed on the two-dimensional image data, and then the depth data may be used along with the results of the video content analysis of the two-dimensional data for tracking and event detection.
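The 2D-only stage described above (separating foreground from background and detecting blobs without consulting depth) can be illustrated with a minimal sketch. This is not the patented implementation: the fixed-background differencing, the `threshold` value, the 4-connectivity rule, and the function names `foreground_mask` and `label_blobs` are all assumptions chosen for illustration, and frames are assumed to be grayscale images given as nested lists.

```python
from collections import deque


def foreground_mask(frame, background, threshold=30):
    """Mark a pixel as foreground when it differs enough from the background.

    Assumption: a static background image and a fixed intensity threshold.
    No depth data is used in this step.
    """
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def label_blobs(mask):
    """Group 4-connected foreground pixels into blobs.

    Returns a list of sets of (row, col) coordinates, one set per blob,
    in raster-scan order of each blob's first pixel.
    """
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = set()
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            blob = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                blob.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs
```

Each returned blob is only a candidate region of foreground pixels. Deciding whether a blob, or part of one, is a target is left to the later depth-based step.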

22 Claims

1. A video content analysis method comprising:
- receiving a video sequence that includes a plurality of frames, each frame including a video image;
  
  for each frame, receiving two-dimensional (2D) image data and also receiving depth data;
  
  processing the 2D image data of the video sequence to differentiate foreground data from background data and to detect one or more blobs comprised of the foreground data, the one or more blobs corresponding to one or more objects, wherein differentiating the foreground data from the background data is performed without analyzing the depth data;
  
  for each detected blob, using the depth data to determine whether at least part of the blob corresponds to at least part of a target by at least
  
  (1) mapping the blob to a set of Z-planes;
  
  (2) determining that on at least some Z-planes the blob is clustered into different blob regions corresponding to two objects separated in space; and
  
  (3) grouping the separated blob regions of the Z-planes into two physical objects by checking their spatial overlaps, wherein one of the physical objects corresponds to the target; and
  
  after it is determined that at least part of a blob corresponds to at least part of a target, tracking the target and detecting at least one event associated with the target.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
  - 2. The method of claim 1, wherein using the depth data to determine whether at least part of each blob corresponds to at least part of a target includes:
    - using the depth data to determine that only part of a first blob corresponds to a first target.
  - 3. The method of claim 2, wherein using the depth data to determine whether at least part of each blob corresponds to at least part of a target includes:
    - using the depth data to determine that part of the first blob does not correspond to the first target.
  - 4. The method of claim 2, wherein using the depth data to determine whether at least part of each blob corresponds to at least part of a target includes:
    - using the depth data to determine that part of the first blob corresponds to a second target different from the first target.
  - 5. The method of claim 4, wherein a first part of the first blob corresponds to the first target and a second part of the first blob corresponds to a second target, one of the first and second target occluding at least part of the other.
  - 6. The method of claim 2, wherein using the depth data to determine whether at least part of each blob corresponds to at least part of a target includes:
    - using the depth data to determine that a second blob combined with part or all of the first blob correspond to a second target.
  - 7. The method of claim 1, wherein the 2D image data includes RGB data for each pixel in the video image.
  - 8. The method of claim 7, wherein only pixels of foreground data are projected onto the set of Z-planes.
  - 9. The method of claim 1, wherein determining whether at least part of each blob corresponds to at least part of a target is performed without analyzing depth data associated with the background data.
  - 10. The method of claim 1, wherein using the depth data to determine whether at least part of each blob corresponds to at least part of a target comprises:
    - using the depth data to determine one or more of a height and a volume of the blob; and
      
      using one or more of the height and the volume of the blob to determine whether at least part of the blob corresponds to a target.
  - 11. The method of claim 10, wherein determining whether the blob corresponds to a target includes determining whether the blob is a person.
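The three depth-based steps recited in claim 1 (mapping a blob to Z-planes, clustering each plane into separate regions, and grouping regions across planes by spatial overlap) can be sketched as follows. This is an illustrative reading only, not the patented implementation. It assumes each foreground pixel of the blob has already been converted to a 3D point whose `x` and `y` are integer grid cells and whose `z` is the slicing coordinate. The slice thickness `plane_step`, the 8-connectivity rule, the "share at least one cell" overlap test, and all function names are hypothetical.

```python
from collections import defaultdict, deque


def slice_into_z_planes(points, plane_step):
    """Map (x, y, z) points to Z-plane indices.

    Returns {plane_index: set of (x, y) cells occupied on that plane}.
    """
    planes = defaultdict(set)
    for x, y, z in points:
        planes[int(z // plane_step)].add((x, y))
    return planes


def cluster_cells(cells):
    """Split the cells of one plane into 8-connected regions."""
    remaining = set(cells)
    regions = []
    while remaining:
        seed = remaining.pop()
        region = {seed}
        queue = deque([seed])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (cx + dx, cy + dy)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        region.add(neighbor)
                        queue.append(neighbor)
        regions.append(region)
    return regions


def group_regions_by_overlap(planes):
    """Merge regions from different planes whose (x, y) footprints overlap.

    Returns a list of objects; each object is a list of (plane_index, region).
    """
    regions = [
        (index, region)
        for index in sorted(planes)
        for region in cluster_cells(planes[index])
    ]
    parent = list(range(len(regions)))  # union-find forest over regions

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            (plane_i, cells_i), (plane_j, cells_j) = regions[i], regions[j]
            if plane_i != plane_j and cells_i & cells_j:
                parent[find(i)] = find(j)

    objects = defaultdict(list)
    for i, region in enumerate(regions):
        objects[find(i)].append(region)
    return list(objects.values())
```

If one 2D blob contains two objects that are separated in space, the regions on each plane stay disjoint and the grouping step returns two objects. A caller could then treat one of them as the target.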

12. A video content analysis method comprising:
- receiving a video sequence that includes a plurality of frames, each frame including a video image;
  
  for each frame, receiving two-dimensional (2D) image data and also receiving depth data;
  
  processing the 2D image data of the video sequence to differentiate foreground data from background data and to detect one or more blobs comprised of the foreground data, the one or more blobs corresponding to one or more objects, wherein differentiating the foreground data from the background data is performed without analyzing the depth data;
  
  for each detected blob, using the depth data to determine whether to track at least a first part of the blob as a target; and
  
  after it is determined to track the target, detecting at least one event associated with the target, wherein determining whether to track at least the first part of the blob as a target includes:
  
  mapping the blob to a set of Z-planes;
  
  determining that on at least some Z-planes the blob is clustered into different blob regions corresponding to the first part of the blob and a second part of the blob separated in space; and
  
  grouping blob slices corresponding to the Z-planes from the first part of the blob to correspond to a physical object by checking their spatial overlaps, wherein the physical object corresponds to the target.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 20, 21, 22):
  - 13. The method of claim 12, wherein using the depth data to determine whether to track at least part of the blob as a target includes:
    - determining that part of a first blob corresponds to a first target.
  - 14. The method of claim 13, wherein using the depth data to determine whether to track at least part of the blob as a target includes:
    - determining that part of the first blob does not correspond to the first target.
  - 15. The method of claim 13, wherein using the depth data to determine whether to track at least part of the blob as a target includes:
    - determining that a part of the first blob corresponds to a second target different from the first target.
  - 16. The method of claim 15, wherein determining that part of the first blob corresponds to the second target includes determining that the first blob corresponds to a first person and the second blob corresponds to a second person, one of the first and second person occluding at least part of the other.
  - 17. The method of claim 13, wherein using the depth data to determine whether to track at least part of the blob as a target includes:
    - determining that a second blob combined with part or all of the first blob corresponds to a second target.
  - 18. The method of claim 12, wherein the 2D image data includes RGB data for each pixel in the video image.
  - 19. The method of claim 18, wherein only pixels of foreground data are projected onto the set of Z-planes.
  - 20. The method of claim 19, wherein determining whether to track at least part of the blob as a target is performed without analyzing depth data associated with the background data.
  - 21. The method of claim 12, wherein using the depth data to determine whether to track at least part of the blob as a target includes:
    - using the depth data to determine one or more of a height and a volume of the blob; and
      
      using one or more of the height and volume of the blob to determine whether the blob corresponds to a target.
  - 22. The method of claim 21, wherein determining whether the blob corresponds to a target includes determining whether the blob is a person.
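Claims 10, 11, 21 and 22 refer to using depth data to determine a blob's height and volume and then deciding whether the blob is a person. The claims do not say how these quantities are computed, so the following is only a rough sketch under strong assumptions: a camera mounted overhead and looking straight down at a flat floor, depth values in meters measured along the optical axis, a pinhole model with focal lengths `fx` and `fy` in pixels, and a known `camera_height`. The function names and the person-size ranges are hypothetical placeholders, not values from the patent.

```python
def blob_height_and_volume(blob, depth, camera_height, fx, fy):
    """Estimate a blob's height and volume from per-pixel depth.

    blob: iterable of (row, col) pixel coordinates.
    depth: 2D list of depth values in meters (assumed overhead camera).
    Returns (height_m, volume_m3).
    """
    height = 0.0
    volume = 0.0
    for row, col in blob:
        z = depth[row][col]
        above_floor = camera_height - z
        if z <= 0 or above_floor <= 0:
            continue  # skip invalid depth readings and floor-level pixels
        height = max(height, above_floor)
        # Under the pinhole model, one pixel covers about (z / fx) * (z / fy)
        # square meters at depth z, so each pixel adds a column of that
        # footprint and of height above_floor.
        volume += above_floor * (z / fx) * (z / fy)
    return height, volume


def looks_like_person(height, volume,
                      height_range=(1.0, 2.2), volume_range=(0.03, 0.25)):
    """Crude size test; both ranges are illustrative placeholders."""
    return (height_range[0] <= height <= height_range[1]
            and volume_range[0] <= volume <= volume_range[1])
```

A real deployment would need calibrated camera parameters and empirically chosen size ranges. An angled camera would also require transforming depth into world coordinates before measuring height.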

Current Assignee: Motorola Solutions, Inc.
Original Assignee: Avigilon Fortress Corporation (Motorola Solutions, Inc.)
Inventors: Zhang, Zhong; Myers, Gary W.; Venetianer, Peter L.
Primary Examiner(s): Le, Vu
Assistant Examiner(s): Cese, Kenny A
Application Number: US13/744,254
Publication Number: US 20130182904A1
Time in Patent Office: 1,104 Days

Field of Search
US Class Current: 1/1
CPC Class Codes:

A61B 2505/07: Home care
A61B 5/0013: Medical image data A61B1/00...
A61B 5/0046: Arrangements of imaging app...
A61B 5/0077: Devices for viewing the sur...
A61B 5/1072: measuring distances on the ...
A61B 5/1073: Measuring volume, e.g. of l...
A61B 5/1079: using optical or photograph...
A61B 5/1113: Local tracking of patients,...
A61B 5/1116: Determining posture transit...
A61B 5/1117: Fall detection
A61B 5/1128: using image analysis A61B5/...
A61B 5/1176: Recognition of faces
A61B 5/7282: Event detection, e.g. detec...
A61B 5/746: Alarms related to a physiol...
G06T 2207/10016: Video; Image sequence
G06T 2207/30196: Human being; Person
G06T 2207/30232: Surveillance
G06T 7/0016: involving temporal comparison
G06T 7/246: using feature-based methods...
G06T 7/50: Depth or shape recovery
G06T 7/55: from multiple images
G06T 7/579: from motion
G06T 7/62: of area, perimeter, diamete...
G06T 7/73: using feature-based methods
G06V 20/41: Higher-level, semantic clus...
G06V 20/44: Event detection
G06V 20/52: Surveillance or monitoring ...
G06V 40/103: Static body considered as a...
G06V 40/1365: Matching; Classification
G08B 13/19615: wherein said pattern is def...
G08B 21/043: detecting an emergency even...
G08B 21/0476: Cameras to detect unsafe co...
H04N 7/18: Closed-circuit television [...
H04N 7/181: for receiving images from a...
