Mapper component for multiple art networks in a video analysis system
Abstract
Techniques are disclosed for detecting the occurrence of unusual events in a sequence of video frames. Importantly, what is determined as unusual need not be defined in advance, but can be determined over time by observing a stream of primitive events and a stream of context events. A mapper component may be configured to parse the event streams and supply input data sets to multiple adaptive resonance theory (ART) networks. Each individual ART network may generate clusters from the set of input data supplied to that ART network. Each cluster represents an observed statistical distribution of a particular thing or event being observed by that ART network.
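The routing role that the abstract gives the mapper component can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ToyNetwork`, `Mapper`, `dispatch`, and the event fields `x`, `y`, and `speed` are hypothetical, and the stand-in network only records the inputs it receives rather than performing ART clustering. An event is assumed to "match an input layer" when it contains every field that the network expects.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNetwork:
    """Hypothetical stand-in for an ART network; it only records its inputs."""
    input_fields: tuple  # names of the event fields that form the input layer
    received: list = field(default_factory=list)

    def learn(self, vector):
        self.received.append(vector)


class Mapper:
    """Routes each parsed event to every network whose input layer it matches."""

    def __init__(self, networks):
        self.networks = networks  # dict: network name -> ToyNetwork

    def dispatch(self, event):
        """Pass the event to matching networks and return their names."""
        matched = []
        for name, net in self.networks.items():
            # Assumption: a match means the event carries all expected fields.
            if all(f in event for f in net.input_fields):
                net.learn(tuple(event[f] for f in net.input_fields))
                matched.append(name)
        return matched


mapper = Mapper({
    "position": ToyNetwork(("x", "y")),
    "position_speed": ToyNetwork(("x", "y", "speed")),
})
only_position = mapper.dispatch({"x": 10.0, "y": 4.0})
both = mapper.dispatch({"x": 1.0, "y": 2.0, "speed": 3.0})
```

In this sketch a single context event may feed more than one network, which is one plausible reading of "supply input data sets to multiple ART networks"; a one-to-one routing policy would be an equally valid assumption.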
21 Claims
1. A computer-implemented method for analyzing a sequence of video frames depicting a scene captured by a video camera, the method comprising:
receiving one or more data streams generated from the sequence of video frames, wherein a first one of the data streams provides a stream of context events generated by a computer vision engine, and wherein each context event provides kinematic data related to a foreground object observed by the computer vision engine in the sequence of video frames;
parsing the data streams to identify data inputs matching an input layer of one of a plurality of adaptive resonance theory (ART) networks, wherein each ART network is configured to generate clusters from the data inputs matching the input layer of a respective ART network, and wherein each cluster provides a statistical distribution of a characteristic of the scene derived from the data streams that has been observed to occur at a location in the scene corresponding to a location of the cluster;
passing the data inputs to the ART network with the matching input layer;
updating the generated clusters in the ART network with the matching input layer; and
evaluating the clusters of the ART network with the matching input layer to determine whether the data inputs passed to the ART network are indicative of an occurrence of a statistically relevant event, relative to the clusters in the ART network with the matching input layer; and
upon determining that a first one of the clusters in a first one of the plurality of ART networks has not been updated for a specified period of time, removing the first cluster from the first ART network.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
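The updating, evaluating, and removing steps of claim 1 can be sketched in Python under strong simplifying assumptions. This is not a real ART network: cluster matching is reduced to a Euclidean distance test against a hypothetical `vigilance_radius`, a "statistically relevant event" is approximated as an input that fits no existing cluster, and `max_idle` stands in for the claim's "specified period of time". All class, method, and parameter names are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    mean: list          # running mean of the inputs assigned to this cluster
    count: int          # number of inputs absorbed so far
    last_update: float  # timestamp of the most recent update


class SimpleClusterNetwork:
    """Simplified stand-in for one ART network (distance-based, not true ART)."""

    def __init__(self, vigilance_radius, max_idle):
        self.vigilance_radius = vigilance_radius
        self.max_idle = max_idle
        self.clusters = []

    def learn(self, vector, now):
        """Update the nearest cluster, or create one. Returns True if novel."""
        best = min(self.clusters,
                   key=lambda c: math.dist(c.mean, vector),
                   default=None)
        if best is not None and math.dist(best.mean, vector) <= self.vigilance_radius:
            best.count += 1
            # Incremental mean update: m <- m + (v - m) / n
            best.mean = [m + (v - m) / best.count
                         for m, v in zip(best.mean, vector)]
            best.last_update = now
            return False
        self.clusters.append(Cluster(list(vector), 1, now))
        return True

    def prune(self, now):
        """Remove clusters idle for longer than max_idle; return how many."""
        before = len(self.clusters)
        self.clusters = [c for c in self.clusters
                         if now - c.last_update <= self.max_idle]
        return before - len(self.clusters)


net = SimpleClusterNetwork(vigilance_radius=1.0, max_idle=10.0)
first_is_novel = net.learn((0.0, 0.0), now=0.0)   # no clusters yet, so novel
second_is_novel = net.learn((0.5, 0.0), now=1.0)  # absorbed by the first cluster
third_is_novel = net.learn((5.0, 5.0), now=8.0)   # too far away, new cluster
removed = net.prune(now=12.0)                     # first cluster idle for 11.0
```

Pruning idle clusters lets the learned model forget behavior that has stopped occurring in the scene, so that the same behavior would be flagged as novel if it reappeared later.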
10. A non-transitory computer-readable medium containing a program which, when executed by a processor, performs an operation for analyzing a sequence of video frames depicting a scene captured by a video camera, the operation comprising:
receiving one or more data streams generated from the sequence of video frames, wherein a first one of the data streams provides a stream of context events generated by a computer vision engine, and wherein each context event provides kinematic data related to a foreground object observed by the computer vision engine in the sequence of video frames;
parsing the data streams to identify data inputs matching an input layer of one of a plurality of adaptive resonance theory (ART) networks, wherein each ART network is configured to generate clusters from the data inputs matching the input layer of a respective ART network, and wherein each cluster provides a statistical distribution of a characteristic of the scene derived from the data streams that has been observed to occur at a location in the scene corresponding to a location of the cluster;
passing the data inputs to the ART network with the matching input layer;
updating the generated clusters in the ART network with the matching input layer;
evaluating the clusters of the ART network with the matching input layer to determine whether the data inputs passed to the ART network are indicative of an occurrence of a statistically relevant event, relative to the clusters in the ART network with the matching input layer; and
upon determining that a first one of the clusters in a first one of the plurality of ART networks has not been updated for a specified period of time, removing the first cluster from the first ART network.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A system, comprising:
a video input source configured to provide a sequence of video frames, each depicting a scene;
a processor; and
a memory containing a program, which, when executed on the processor is configured to perform an operation for analyzing the scene, as depicted by the sequence of video frames captured by the video input source, the operation comprising;
receiving one or more data streams generated from the sequence of video frames, wherein a first one of the data streams provides a stream of context events generated by a computer vision engine, and wherein each context event provides kinematic data related to a foreground object observed by the computer vision engine in the sequence of video frames,
parsing the data streams to identify data inputs matching an input layer of one of a plurality of adaptive resonance theory (ART) networks, wherein each ART network is configured to generate clusters from the data inputs matching the input layer of a respective ART network, and wherein each cluster provides a statistical distribution of a characteristic of the scene derived from the data streams that has been observed to occur at a location in the scene corresponding to a location of the cluster,
passing the data inputs to the ART network with the matching input layer,
updating the generated clusters in the ART network with the matching input layer,
evaluating the clusters of the ART network with the matching input layer to determine whether the data inputs passed to the ART network are indicative of an occurrence of a statistically relevant event, relative to the clusters in the ART network with the matching input layer, and
upon determining that a first one of the clusters in a first one of the plurality of ART networks has not been updated for a specified period of time, removing the first cluster from the first ART network.
Dependent claims: 18, 19, 20, 21
Specification