Video surveillance system configured to analyze complex behaviors using alternating layers of clustering and sequencing
Abstract
Techniques are disclosed for a video surveillance system to learn to recognize complex behaviors by analyzing pixel data using alternating layers of clustering and sequencing. A video surveillance system may be configured to observe a scene (as depicted in a sequence of video frames) and, over time, develop hierarchies of concepts including classes of objects, actions and behaviors. That is, the video surveillance system may develop models at progressively more complex levels of abstraction used to identify what events and behaviors are common and which are unusual. When the models have matured, the video surveillance system issues alerts on unusual events.
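The "learn what is common, then alert on what is unusual" idea in the abstract can be illustrated with a minimal sketch. This is not the patented method: the class name `RarityModel`, the `maturity` observation count, and the `rare_fraction` threshold are all hypothetical, and a simple frequency count stands in for whatever models the system actually builds.

```python
from collections import Counter


class RarityModel:
    """Hypothetical frequency model: alerts on rare events once it has matured."""

    def __init__(self, maturity=100, rare_fraction=0.01):
        self.counts = Counter()             # how often each event label was seen
        self.total = 0                      # total number of observations so far
        self.maturity = maturity            # assumed: observations needed before alerting
        self.rare_fraction = rare_fraction  # assumed: relative frequency below which an event is unusual

    def observe(self, event):
        """Return True if the event should raise an alert, then learn from it."""
        mature = self.total >= self.maturity
        freq = self.counts[event] / self.total if self.total else 0.0
        alert = mature and freq < self.rare_fraction
        # Update the model after the decision so every event is judged against past data only.
        self.counts[event] += 1
        self.total += 1
        return alert
```

An immature model stays silent no matter what it sees; once enough observations have accumulated, only events whose past frequency falls below the threshold trigger an alert.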
21 Claims
1. A computer-implemented method for analyzing a sequence of video frames depicting a scene captured by a video camera, the method comprising:
- receiving a set of data inputs derived by a computer vision engine configured to analyze pixels depicting a plurality of foreground objects in the sequence of video frames; and
- modeling behavior of the foreground objects in the scene by passing the received sensory data inputs to a first cluster layer of a plurality of layers, wherein the plurality of layers alternate between cluster layers and sequence layers, wherein the cluster layers generate clusters of sequences and the sequence layers generate sequences of clusters, and wherein progressively higher levels of the plurality of layers correspond to progressively more complex patterns of behavior engaged in by the foreground objects depicted in the sequence of video frames.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-readable storage medium containing a program which, when executed by a processor, performs an operation for analyzing a sequence of video frames depicting a scene captured by a video camera, the operation comprising:
- receiving a set of data inputs derived by a computer vision engine configured to analyze pixels depicting a plurality of foreground objects in the sequence of video frames; and
- modeling behavior of the foreground objects in the scene by passing the received sensory data inputs to a first cluster layer of a plurality of layers, wherein the plurality of layers alternate between cluster layers and sequence layers, wherein the cluster layers generate clusters of sequences and the sequence layers generate sequences of clusters, and wherein progressively higher levels of the plurality of layers correspond to progressively more complex patterns of behavior engaged in by the foreground objects depicted in the sequence of video frames.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system, comprising:
- a video input source configured to provide a sequence of video frames, each depicting a scene;
- a processor; and
- a memory containing a program, which, when executed on the processor is configured to perform an operation for analyzing the scene, as depicted by the sequence of video frames captured by the video input source, the operation comprising:
  - receiving a set of data inputs derived by a computer vision engine configured to analyze pixels depicting a plurality of foreground objects in the sequence of video frames, and
  - modeling behavior of the foreground objects in the scene by passing the received sensory data inputs to a first cluster layer of a plurality of layers, wherein the plurality of layers alternate between cluster layers and sequence layers, wherein the cluster layers generate clusters of sequences and the sequence layers generate sequences of clusters, and wherein progressively higher levels of the plurality of layers correspond to progressively more complex patterns of behavior engaged in by the foreground objects depicted in the sequence of video frames.
Dependent claims: 16, 17, 18, 19, 20, 21
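The alternation of cluster layers and sequence layers recited in the independent claims can be sketched as a small three-layer stack. Everything here is an illustrative assumption rather than the claimed implementation: the class names, the use of leader clustering with a fixed radius, the fixed-length sequence windows, and the Hamming distance between sequences are stand-ins chosen for brevity.

```python
import math


class ClusterLayer:
    """Assumed leader clustering: join the first cluster within `radius`, else start a new one."""

    def __init__(self, radius, distance):
        self.radius = radius
        self.distance = distance
        self.centroids = []  # one representative item per cluster

    def assign(self, item):
        for idx, centroid in enumerate(self.centroids):
            if self.distance(item, centroid) <= self.radius:
                return idx
        self.centroids.append(item)
        return len(self.centroids) - 1


class SequenceLayer:
    """Assumed fixed-length windowing: groups incoming cluster ids into tuples."""

    def __init__(self, length):
        self.length = length
        self.buffer = []

    def push(self, cluster_id):
        self.buffer.append(cluster_id)
        if len(self.buffer) == self.length:
            seq = tuple(self.buffer)
            self.buffer = []
            return seq  # a completed sequence of clusters
        return None     # sequence not complete yet


def euclidean(a, b):
    return math.dist(a, b)


def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


def run_stack(inputs):
    """Pass inputs through cluster -> sequence -> cluster and return the top-level cluster ids."""
    c1 = ClusterLayer(radius=1.0, distance=euclidean)  # clusters raw feature vectors
    s1 = SequenceLayer(length=3)                       # builds sequences of those clusters
    c2 = ClusterLayer(radius=0, distance=hamming)      # clusters the sequences
    out = []
    for x in inputs:
        seq = s1.push(c1.assign(x))
        if seq is not None:
            out.append(c2.assign(seq))
    return out
```

Each additional sequence and cluster pair would operate on the ids produced by the pair below it, which is one way to picture higher layers corresponding to longer and more complex patterns of behavior.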
Specification