Inter-trajectory anomaly detection using adaptive voting experts in a video surveillance system
Abstract
A sequence layer in a machine-learning engine is configured to learn from the observations of a computer vision engine. In one embodiment, the machine-learning engine uses voting experts to segment adaptive resonance theory (ART) network label sequences for different objects observed in a scene. The sequence layer may be configured to observe the ART label sequences and incrementally build, update, trim, and reorganize an ngram trie for those label sequences. The sequence layer computes the entropies for the nodes in the ngram trie and determines a sliding window length and vote count parameters. Once these parameters are determined, the sequence layer may segment newly observed sequences to estimate the primitive events observed in the scene, and may issue alerts for inter-sequence and intra-sequence anomalies.
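The trie, entropy, and voting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `NgramTrie`, `node_entropy`, and `segment` are hypothetical, the two experts vote on raw (unstandardized) entropy and frequency scores, and the `window` and `min_votes` values are fixed by hand, whereas the abstract says the sequence layer determines them.

```python
import math


class TrieNode:
    def __init__(self):
        self.count = 0        # how often the n-gram ending here was seen
        self.children = {}    # next label -> TrieNode


class NgramTrie:
    """Stores counts for every n-gram of length 1..max_depth."""

    def __init__(self, max_depth):
        self.max_depth = max_depth
        self.root = TrieNode()

    def update(self, sequence):
        # Incrementally add all n-grams that start at each position.
        for i in range(len(sequence)):
            node = self.root
            for label in sequence[i:i + self.max_depth]:
                node = node.children.setdefault(label, TrieNode())
                node.count += 1

    def find(self, ngram):
        # Return the node for `ngram`, or None if it was never observed.
        node = self.root
        for label in ngram:
            node = node.children.get(label)
            if node is None:
                return None
        return node

    def count(self, ngram):
        node = self.find(ngram)
        return node.count if node is not None else 0


def node_entropy(node):
    """Entropy (bits) of the labels that follow the n-gram at `node`."""
    if node is None or not node.children:
        return 0.0
    total = sum(child.count for child in node.children.values())
    return -sum((c.count / total) * math.log2(c.count / total)
                for c in node.children.values())


def segment(sequence, trie, window=3, min_votes=2):
    """Simplified voting-experts segmentation (assumes 2 <= window <= trie.max_depth)."""
    votes = [0] * (len(sequence) + 1)  # votes[i]: boundary just before sequence[i]
    for start in range(len(sequence) - window + 1):
        win = tuple(sequence[start:start + window])
        splits = range(1, window)
        # Entropy expert: split where the following label is least predictable.
        by_entropy = max(splits, key=lambda k: node_entropy(trie.find(win[:k])))
        # Frequency expert: split so both parts are frequently seen n-grams.
        by_freq = max(splits, key=lambda k: trie.count(win[:k]) + trie.count(win[k:]))
        votes[start + by_entropy] += 1
        votes[start + by_freq] += 1
    # Keep positions that reach the vote threshold and are local peaks.
    cuts = [i for i in range(1, len(sequence))
            if votes[i] >= min_votes
            and votes[i] > votes[i - 1] and votes[i] >= votes[i + 1]]
    segments, prev = [], 0
    for cut in cuts + [len(sequence)]:
        segments.append(sequence[prev:cut])
        prev = cut
    return segments
```

For example, after `trie = NgramTrie(max_depth=3)` and `trie.update(labels)` for each observed ART label sequence, `segment(labels, trie)` returns a list of sub-lists that concatenate back to `labels`.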
24 Claims
1. A computer-implemented method for analyzing a scene depicted in an input stream of video frames captured by a video camera of a video surveillance system, the method comprising:
retrieving a first sequence and a second sequence, each providing an ordered string of labels, wherein each label corresponds to a cluster in an adaptive resonance theory (ART) network, wherein the strings of labels are generated by mapping kinematic data vectors generated for a first foreground object and a second foreground object detected in the input stream of video frames, respectively, to nodes of a self-organizing map (SOM) and clustering the nodes of the SOM using the ART network, and wherein the first sequence and the second sequence correspond to an observed interaction between the first foreground object and the second foreground object;
identifying one or more segments in each of the first and second sequences, wherein each segment includes a subsequence of the ordered string of labels in the first and second sequences;
determining a probability of observing the interaction between the first foreground object and the second foreground object, relative to a probability distribution generated from an ngram trie, wherein the ngram trie is generated from a plurality of previously observed sequences, each storing an ordered string of labels assigned to clusters in the ART network for objects detected in the input stream of video frames; and
upon determining the probability the observed interaction between the first foreground object and the second foreground object falls below a specified threshold, issuing an alert to a user of the video surveillance system.
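The probability and threshold steps of claim 1 can be sketched as follows. The claim does not say how the interaction probability is derived from the ngram trie, so this example makes loud assumptions: a flat table of n-gram counts stands in for the trie, each segment is scored by its relative frequency among observed n-grams of the same length, the two objects' segments are treated as independent, and unseen segments receive a small floor probability. All function names and the `floor` parameter are hypothetical.

```python
import math
from collections import Counter


def build_counts(sequences, max_len):
    """Count n-grams of length 1..max_len (flat stand-in for an ngram trie)."""
    counts, totals = Counter(), Counter()
    for seq in sequences:
        for n in range(1, max_len + 1):
            for i in range(len(seq) - n + 1):
                counts[tuple(seq[i:i + n])] += 1
                totals[n] += 1
    return counts, totals


def segment_probability(segment, counts, totals, floor=1e-6):
    """Relative frequency of `segment` among n-grams of equal length."""
    total = totals.get(len(segment), 0)
    if total == 0:
        return floor
    # Assumption: unseen segments get `floor` instead of zero probability.
    return max(counts.get(tuple(segment), 0) / total, floor)


def interaction_log_probability(segs_a, segs_b, counts, totals):
    """Sum of log-probabilities over both objects' segments (independence assumed)."""
    return sum(math.log(segment_probability(s, counts, totals))
               for s in list(segs_a) + list(segs_b))


def should_alert(segs_a, segs_b, counts, totals, threshold):
    """True when the interaction probability falls below `threshold`."""
    logp = interaction_log_probability(segs_a, segs_b, counts, totals)
    return logp < math.log(threshold)
```

Working in log space avoids underflow when many segment probabilities are multiplied; comparing against `math.log(threshold)` is equivalent to comparing the raw probability against `threshold`.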
9. A non-transitory computer-readable medium containing a program, which when executed on a processor, performs an operation for analyzing a scene depicted in an input stream of video frames captured by a video camera of a video surveillance system, the operation comprising:
retrieving a first sequence and a second sequence, each providing an ordered string of labels, wherein each label corresponds to a cluster in an adaptive resonance theory (ART) network, wherein the strings of labels are generated by mapping kinematic data vectors generated for a first foreground object and a second foreground object detected in the input stream of video frames, respectively, to nodes of a self-organizing map (SOM) and clustering the nodes of the SOM using the ART network, and wherein the first sequence and the second sequence correspond to an observed interaction between the first foreground object and the second foreground object;
identifying one or more segments in each of the first and second sequences, wherein each segment includes a subsequence of the ordered string of labels in the first and second sequences;
determining a probability of an observed interaction between the first foreground object and the second foreground object, relative to a probability distribution generated from an ngram trie, wherein the ngram trie is generated from a plurality of previously observed sequences, each storing an ordered string of labels assigned to clusters in the ART network for objects detected in the input stream of video frames; and
upon determining the probability the observed interaction between the first foreground object and the second foreground object falls below a specified threshold, issuing an alert to a user of the video surveillance system.
17. A video surveillance system, comprising:
a video input source configured to provide an input stream of video frames captured by a video camera, each depicting a scene;
a processor; and
a memory containing a program, which, when executed on the processor is configured to perform an operation for analyzing the scene depicted in the input stream of video frames, the operation comprising:
retrieving a first sequence and a second sequence, each providing an ordered string of labels, wherein each label corresponds to a cluster in an adaptive resonance theory (ART) network, wherein the strings of labels are generated by mapping kinematic data vectors generated for a first foreground object and a second foreground object detected in the input stream of video frames, respectively, to nodes of a self-organizing map (SOM) and clustering the nodes of the SOM using the ART network, and wherein the first sequence and the second sequence correspond to an observed interaction between the first foreground object and the second foreground object,
identifying one or more segments in each of the first and second sequences, wherein each segment includes a subsequence of the ordered string of labels in the first and second sequences,
determining a probability of an observed interaction between the first foreground object and the second foreground object, relative to a probability distribution generated from an ngram trie, wherein the ngram trie is generated from a plurality of previously observed sequences, each storing an ordered string of labels assigned to clusters in the ART network for objects detected in the input stream of video frames, and
upon determining the probability the observed interaction between the first foreground object and the second foreground object falls below a specified threshold, issuing an alert to a user of the video surveillance system.
Specification