Video surveillance system configured to analyze complex behaviors using alternating layers of clustering and sequencing

US 8,170,283 B2
Filed: 09/17/2009
Issued: 05/01/2012
Est. Priority Date: 09/17/2009
Status: Active Grant

Abstract

Techniques are disclosed for a video surveillance system to learn to recognize complex behaviors by analyzing pixel data using alternating layers of clustering and sequencing. A video surveillance system may be configured to observe a scene (as depicted in a sequence of video frames) and, over time, develop hierarchies of concepts including classes of objects, actions and behaviors. That is, the video surveillance system may develop models at progressively more complex levels of abstraction used to identify what events and behaviors are common and which are unusual. When the models have matured, the video surveillance system issues alerts on unusual events.
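
As a rough illustration of the last two sentences of the abstract, the sketch below counts how often each learned pattern has been observed and flags a pattern as unusual only after enough observations have accumulated. The class name `RarityAlerter`, the parameters `min_observations` and `rarity_threshold`, and the relative-frequency test are all hypothetical; the abstract does not state how maturity or unusualness is measured.

```python
from collections import Counter
from typing import Hashable


class RarityAlerter:
    """Toy alerting rule (an assumption, not the patented method):
    a pattern is unusual if its relative frequency is below a threshold."""

    def __init__(self, min_observations: int = 100,
                 rarity_threshold: float = 0.01) -> None:
        self.min_observations = min_observations  # "maturity" cutoff (hypothetical)
        self.rarity_threshold = rarity_threshold  # max frequency of an unusual pattern
        self.counts: Counter = Counter()
        self.total = 0

    def observe(self, pattern: Hashable) -> bool:
        """Record one observation; return True if it should raise an alert."""
        self.counts[pattern] += 1
        self.total += 1
        if self.total < self.min_observations:
            return False  # model not yet mature: learn silently
        return self.counts[pattern] / self.total < self.rarity_threshold
```

Under these assumptions, a pattern seen for the first time after a long run of common behavior triggers an alert, while the same pattern seen during the initial learning period does not.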

21 Claims

1. A computer-implemented method for analyzing a sequence of video frames depicting a scene captured by a video camera, the method comprising:
- receiving a set of data inputs derived by a computer vision engine configured to analyze pixels depicting a plurality of foreground objects in the sequence of video frames; and
  
  modeling behavior of the foreground objects in the scene by passing the received sensory data inputs to a first cluster layer of a plurality of layers, wherein the plurality of layers alternate between cluster layers and sequence layers, wherein the cluster layers generate clusters of sequences and the sequence layers generate sequences of clusters, and wherein progressively higher levels of the plurality of layers correspond to progressively more complex patterns of behavior engaged in by the foreground objects depicted in the sequence of video frames.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein the first cluster layer generates clusters from the set of data inputs using a self organizing map (SOM) and an adaptive resonance theory (ART) network.
  - 3. The method of claim 2, wherein cluster layers subsequent to the first cluster layer generate clusters from sequences generated by a previous sequence layer and wherein the sequence layers generate sequences of clusters generated by a previous cluster layer.
  - 4. The method of claim 2, wherein the first cluster layer includes a dorsal side cluster layer and a ventral side cluster layer, and wherein the set of data inputs passed to the ventral side cluster layer includes a plurality of numerical kinematic data vectors characterizing a set of kinematics of the foreground objects in the scene and wherein the dorsal side cluster layer includes a plurality of numerical feature data vectors characterizing a set of micro features of the foreground objects.
  - 5. The method of claim 4, wherein the dorsal side cluster layer outputs a symbolic symbol stream passed to a first dorsal side sequence layer and wherein the ventral side cluster layer outputs a second symbolic symbol stream passed to a first ventral side sequence layer.
  - 6. The method of claim 5, wherein the first dorsal side sequence layer and the first ventral side sequence layer each include a voting experts component configured to induce one or more segments in the symbolic symbol stream passed to the respective first dorsal side sequence layer and the first ventral side sequence layer.
  - 7. The method of claim 4, wherein one of the layers of the plurality of layers combines the output of the dorsal side and the ventral side as the cross product of a symbolic symbol stream output by a sequence layer on the dorsal side and a symbolic symbol stream output by a sequence layer on the ventral side.
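
The alternation recited in claims 1 and 3 can be pictured with a minimal three-layer sketch: a cluster layer turns numeric vectors into cluster ids, a sequence layer groups those ids into sequences, and a further cluster layer assigns an id to each sequence. Everything here is an assumption made for illustration: the class and function names are hypothetical, the distance-threshold clusterer is only a stand-in for the SOM and ART network named in claim 2, and the fixed-width windowing is only a stand-in for whatever sequencing the specification uses.

```python
from typing import Dict, List, Sequence, Tuple


class ThresholdClusterLayer:
    """Toy first cluster layer: nearest fixed prototype within `radius`,
    otherwise a new prototype. Not a SOM or ART implementation."""

    def __init__(self, radius: float) -> None:
        self.radius = radius
        self.prototypes: List[Tuple[float, ...]] = []

    def assign(self, vector: Sequence[float]) -> int:
        best_id, best_dist = -1, float("inf")
        for idx, proto in enumerate(self.prototypes):
            dist = sum((a - b) ** 2 for a, b in zip(vector, proto)) ** 0.5
            if dist < best_dist:
                best_id, best_dist = idx, dist
        if best_id == -1 or best_dist > self.radius:
            self.prototypes.append(tuple(vector))  # start a new cluster
            return len(self.prototypes) - 1
        return best_id


class FixedWindowSequenceLayer:
    """Toy sequence layer: cuts a symbol stream into non-overlapping windows."""

    def __init__(self, width: int) -> None:
        self.width = width

    def segment(self, symbols: Sequence[int]) -> List[Tuple[int, ...]]:
        last_start = len(symbols) - self.width
        return [tuple(symbols[i:i + self.width])
                for i in range(0, last_start + 1, self.width)]


class SequenceClusterLayer:
    """Toy higher cluster layer: identical sequences share one cluster id."""

    def __init__(self) -> None:
        self.ids: Dict[Tuple[int, ...], int] = {}

    def assign(self, sequence: Tuple[int, ...]) -> int:
        return self.ids.setdefault(sequence, len(self.ids))


def model_behavior(vectors, radius: float = 1.0, width: int = 2):
    """Run vectors through cluster -> sequence -> cluster layers."""
    first = ThresholdClusterLayer(radius)
    symbols = [first.assign(v) for v in vectors]                   # clusters
    sequences = FixedWindowSequenceLayer(width).segment(symbols)   # sequences of clusters
    higher_layer = SequenceClusterLayer()
    higher = [higher_layer.assign(s) for s in sequences]           # clusters of sequences
    return symbols, sequences, higher
```

Stacking further sequence and cluster layers on top of `higher` would follow the same pattern, with each level describing longer and more abstract patterns than the one below it.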

8. A computer-readable storage medium containing a program which, when executed by a processor, performs an operation for analyzing a sequence of video frames depicting a scene captured by a video camera, the operation comprising:
- receiving a set of data inputs derived by a computer vision engine configured to analyze pixels depicting a plurality of foreground objects in the sequence of video frames; and
  
  modeling behavior of the foreground objects in the scene by passing the received sensory data inputs to a first cluster layer of a plurality of layers, wherein the plurality of layers alternate between cluster layers and sequence layers, wherein the cluster layers generate clusters of sequences and the sequence layers generate sequences of clusters, and wherein progressively higher levels of the plurality of layers correspond to progressively more complex patterns of behavior engaged in by the foreground objects depicted in the sequence of video frames.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The computer-readable storage medium of claim 8, wherein the first cluster layer generates clusters from the set of data inputs using a self organizing map (SOM) and an adaptive resonance theory (ART) network.
  - 10. The computer-readable storage medium of claim 9, wherein cluster layers subsequent to the first cluster layer generate clusters from sequences generated by a previous sequence layer and wherein the sequence layers generate sequences of clusters generated by a previous cluster layer.
  - 11. The computer-readable storage medium of claim 9, wherein the first cluster layer includes a dorsal side cluster layer and a ventral side cluster layer, and wherein the set of data inputs passed to the ventral side cluster layer includes a plurality of numerical kinematic data vectors characterizing a set of kinematics of the foreground objects in the scene and wherein the dorsal side cluster layer includes a plurality of numerical feature data vectors characterizing a set of micro features of the foreground objects.
  - 12. The computer-readable storage medium of claim 11, wherein the dorsal side cluster layer outputs a symbolic symbol stream passed to a first dorsal side sequence layer and wherein the ventral side cluster layer outputs a second symbolic symbol stream passed to a first ventral side sequence layer.
  - 13. The computer-readable storage medium of claim 12, wherein the first dorsal side sequence layer and the first ventral side sequence layer each include a voting experts component configured to induce one or more segments in the symbolic symbol stream passed to the respective first dorsal side sequence layer and the first ventral side sequence layer.
  - 14. The computer-readable storage medium of claim 11, wherein one of the layers of the plurality of layers combines the output of the dorsal side and the ventral side as the cross product of a symbolic symbol stream output by a sequence layer on the dorsal side and a symbolic symbol stream output by a sequence layer on the ventral side.
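
Claim 13 recites a voting experts component that induces segments in a symbol stream. The sketch below is a deliberately reduced, single-expert variant written for illustration only: it slides a window over the stream, lets one boundary-entropy expert vote for a cut position in each window, and cuts where the vote count peaks. The function names, the default `window`, `order`, and `min_votes` values, and the use of only one expert are assumptions; the claims do not specify which experts or thresholds are used.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def boundary_entropy(stream: Sequence[Hashable],
                     order: int) -> Dict[Tuple, float]:
    """Entropy of the symbol that follows each context of length `order`."""
    followers: Dict[Tuple, Counter] = defaultdict(Counter)
    for i in range(len(stream) - order):
        followers[tuple(stream[i:i + order])][stream[i + order]] += 1
    entropy: Dict[Tuple, float] = {}
    for context, counts in followers.items():
        total = sum(counts.values())
        entropy[context] = -sum((c / total) * math.log2(c / total)
                                for c in counts.values())
    return entropy


def segment_stream(stream: Sequence[Hashable], window: int = 3,
                   order: int = 1, min_votes: int = 1) -> List[list]:
    """Split `stream` where the boundary-entropy votes reach a local peak."""
    if window < 2:
        raise ValueError("window must be at least 2")
    entropy = boundary_entropy(stream, order)
    votes = [0] * (len(stream) + 1)  # votes[p]: cut between p-1 and p

    def score(p: int) -> float:
        # Uncertainty about what follows the context that ends just before p.
        return entropy.get(tuple(stream[max(0, p - order):p]), 0.0)

    for start in range(len(stream) - window + 1):
        best = max(range(start + 1, start + window), key=score)
        votes[best] += 1

    cuts = [p for p in range(1, len(stream))
            if votes[p] >= min_votes
            and votes[p] > votes[p - 1]
            and votes[p] >= votes[p + 1]]
    bounds = [0] + cuts + [len(stream)]
    return [list(stream[a:b]) for a, b in zip(bounds, bounds[1:]) if b > a]
```

Whatever cuts are chosen, concatenating the returned segments reproduces the input stream, so the routine only adds boundaries and never drops or reorders symbols.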

15. A system, comprising:
- a video input source configured to provide a sequence of video frames, each depicting a scene;
  
  a processor; and
  
  a memory containing a program, which, when executed on the processor is configured to perform an operation for analyzing the scene, as depicted by the sequence of video frames captured by the video input source, the operation comprising;
  
  receiving a set of data inputs derived by a computer vision engine configured to analyze pixels depicting a plurality of foreground objects in the sequence of video frames, and
  
  modeling behavior of the foreground objects in the scene by passing the received sensory data inputs to a first cluster layer of a plurality of layers, wherein the plurality of layers alternate between cluster layers and sequence layers, wherein the cluster layers generate clusters of sequences and the sequence layers generate sequences of clusters, and wherein progressively higher levels of the plurality of layers correspond to progressively more complex patterns of behavior engaged in by the foreground objects depicted in the sequence of video frames.
- Dependent claims (16, 17, 18, 19, 20, 21):
  - 16. The system of claim 15, wherein the first cluster layer generates clusters from the set of data inputs using a self organizing map (SOM) and an adaptive resonance theory (ART) network.
  - 17. The system of claim 16, wherein cluster layers subsequent to the first cluster layer generate clusters from sequences generated by a previous sequence layer and wherein the sequence layers generate sequences of clusters generated by a previous cluster layer.
  - 18. The system of claim 16, wherein the first cluster layer includes a dorsal side cluster layer and a ventral side cluster layer, and wherein the set of data inputs passed to the ventral side cluster layer includes a plurality of numerical kinematic data vectors characterizing a set of kinematics of the foreground objects in the scene and wherein the dorsal side cluster layer includes a plurality of numerical feature data vectors characterizing a set of micro features of the foreground objects.
  - 19. The system of claim 18, wherein the dorsal side cluster layer outputs a symbolic symbol stream passed to a first dorsal side sequence layer and wherein the ventral side cluster layer outputs a second symbolic symbol stream passed to a first ventral side sequence layer.
  - 20. The system of claim 19, wherein the first dorsal side sequence layer and the first ventral side sequence layer each include a voting experts component configured to induce one or more segments in the symbolic symbol stream passed to the respective first dorsal side sequence layer and the first ventral side sequence layer.
  - 21. The system of claim 18, wherein one of the layers of the plurality of layers combines the output of the dorsal side and the ventral side as the cross product of a symbolic symbol stream output by a sequence layer on the dorsal side and a symbolic symbol stream output by a sequence layer on the ventral side.
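
One plausible reading of the cross product in claim 21 is that time-aligned symbols from the dorsal and ventral streams are paired, so that the combined alphabet is the Cartesian product of the two input alphabets. The sketch below follows that reading; the function names and the requirement that both streams have equal length are assumptions, not statements from the claims.

```python
from itertools import product
from typing import Hashable, List, Sequence, Set, Tuple


def combine_streams(dorsal: Sequence[Hashable],
                    ventral: Sequence[Hashable]) -> List[Tuple]:
    """Pair time-aligned symbols from the two sides into joint symbols."""
    if len(dorsal) != len(ventral):
        raise ValueError("streams must be time-aligned and of equal length")
    return list(zip(dorsal, ventral))


def joint_alphabet(dorsal: Sequence[Hashable],
                   ventral: Sequence[Hashable]) -> Set[Tuple]:
    """All joint symbols that could occur: dorsal alphabet x ventral alphabet."""
    return set(product(set(dorsal), set(ventral)))
```

For example, a dorsal stream over two symbols and a ventral stream over two symbols yield a joint alphabet of four possible pairs, of which the combined stream uses only those pairs that actually co-occur.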

Current Assignee
Motorola Solutions, Inc.
Original Assignee
Behavioral Recognition Systems Incorporated
Inventors
Cobb, Wesley Kenneth; Friedlander, David; Saitwal, Kishor Adinath; Seow, Ming-Jung; Xu, Gang
Primary Examiner(s)
AZARIAN, SEYED H

Application Number

US12/561,977
Publication Number

US 20110064268A1
Time in Patent Office

957 Days
Field of Search

382/100, 382/103, 382/106, 382/107, 382/155, 382/162, 382/168, 382/173, 382/181, 382/193, 382/199, 382/209, 382/218, 382/219, 382/224, 382/232, 382/254, 382/274, 382/276, 382/282, 382/286-294, 382/305, 382/312, 707/791, 340/573.1, 340/948
US Class Current

382/103
CPC Class Codes

G06F 18/23211   with adaptive number of clu...

G06F 18/2433   Single-class perspective, e...

G06V 10/763   Non-hierarchical techniques...

G06V 10/764   using classification, e.g. ...

G06V 20/52   Surveillance or monitoring ...

G06V 40/20   Movements or behaviour, e.g...
