Behavioral recognition system
First Claim
1. A method for processing a stream of video frames recording events within a scene, the method comprising:
receiving a first frame of the stream, wherein the first frame includes data for a plurality of pixels included in the frame;
identifying one or more groups of pixels in the first frame, wherein each group depicts an object within the scene;
generating a search model storing one or more features associated with each identified object;
classifying each of the objects using a trained classifier;
tracking, in a second frame, each of the objects identified in the first frame using the search model;
supplying the first frame, the second frame, and the object classifications to a machine learning engine; and
generating, by the machine learning engine, one or more semantic representations of behavior engaged in by the objects in the scene over a plurality of frames, wherein the machine learning engine is configured to learn patterns of behavior observed in the scene over the plurality of frames and to identify occurrences of the patterns of behavior engaged in by the classified objects.
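The identifying, search-model, and tracking steps of the claim can be illustrated with a minimal sketch. The claim does not say how pixel groups are found or which features the search model stores, so everything here is an assumption made for illustration: frames are binary foreground masks, groups are 4-connected components, the hypothetical `SearchModel` stores only a centroid and an area, and tracking is nearest-centroid matching.

```python
from collections import deque
from dataclasses import dataclass

# Assumption: a frame is a binary mask, 1 = foreground pixel, 0 = background.
Frame = list[list[int]]


@dataclass
class SearchModel:
    """Features stored for one identified object (assumed: centroid and area)."""
    object_id: int
    centroid: tuple[float, float]
    area: int


def find_pixel_groups(frame: Frame) -> list[list[tuple[int, int]]]:
    """Group 4-connected foreground pixels; each group stands for one object."""
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] == 0 or seen[r][c]:
                continue
            queue, group = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and frame[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            groups.append(group)
    return groups


def centroid(group: list[tuple[int, int]]) -> tuple[float, float]:
    """Mean (row, column) position of the pixels in a group."""
    n = len(group)
    return (sum(y for y, _ in group) / n, sum(x for _, x in group) / n)


def build_search_models(frame: Frame) -> list[SearchModel]:
    """Create one search model per pixel group found in the first frame."""
    groups = find_pixel_groups(frame)
    return [SearchModel(i, centroid(g), len(g)) for i, g in enumerate(groups)]


def track(models: list[SearchModel], second_frame: Frame) -> list[SearchModel]:
    """Match each model to the nearest group centroid in the second frame."""
    candidates = [(centroid(g), len(g)) for g in find_pixel_groups(second_frame)]
    updated = []
    for m in models:
        if not candidates:
            break  # nothing to match against in the second frame
        best = min(
            candidates,
            key=lambda cand: (cand[0][0] - m.centroid[0]) ** 2
            + (cand[0][1] - m.centroid[1]) ** 2,
        )
        updated.append(SearchModel(m.object_id, best[0], best[1]))
    return updated
```

Calling `build_search_models(first_frame)` and then `track(models, second_frame)` returns models that keep their `object_id` while their centroid and area are refreshed from the second frame. A practical tracker would likely store richer features (appearance, velocity) and handle occlusion, which this sketch does not attempt.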
Abstract
Embodiments of the present invention provide a method and a system for analyzing and learning behavior based on an acquired stream of video frames. Objects depicted in the stream are determined based on an analysis of the video frames. Each object may have a corresponding search model used to track an object's motion frame-to-frame. Classes of the objects are determined and semantic representations of the objects are generated. The semantic representations are used to determine objects' behaviors and to learn about behaviors occurring in an environment depicted by the acquired video streams. This way, the system learns rapidly and in real-time normal and abnormal behaviors for any environment by analyzing movements or activities or absence of such in the environment and identifies and predicts abnormal and suspicious behavior based on what has been learned.
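The idea of learning which behaviors are normal and then flagging abnormal ones can be sketched as a simple frequency model. This is only an illustration under stated assumptions, not the method the patent discloses: a semantic representation is reduced to an (object class, event) pair, and the class name `BehaviorLearner`, the parameters `min_observations` and `rare_fraction`, and the example event strings are all hypothetical.

```python
from collections import Counter


class BehaviorLearner:
    """Counts observed (object class, event) pairs and flags rare ones."""

    def __init__(self, min_observations: int = 20, rare_fraction: float = 0.05):
        self.counts = Counter()
        self.total = 0
        # Assumed thresholds: how much evidence is needed before judging,
        # and below which relative frequency a behavior counts as abnormal.
        self.min_observations = min_observations
        self.rare_fraction = rare_fraction

    def is_abnormal(self, key: tuple[str, str]) -> bool:
        """Return True if the pair has been seen rarely relative to all data."""
        if self.total < self.min_observations:
            return False  # too little evidence to call anything abnormal
        return self.counts[key] / self.total < self.rare_fraction

    def observe(self, object_class: str, event: str) -> bool:
        """Judge the observation against what was learned, then learn it."""
        key = (object_class, event)
        abnormal = self.is_abnormal(key)
        self.counts[key] += 1
        self.total += 1
        return abnormal


learner = BehaviorLearner(min_observations=10, rare_fraction=0.1)
for _ in range(30):
    learner.observe("vehicle", "moves along road")
flagged = learner.observe("vehicle", "stops on sidewalk")  # never seen before
```

Because each observation is judged before it is counted, the learner keeps adapting: a behavior that starts out flagged stops being flagged once it has been seen often enough.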
30 Claims
1. A method for processing a stream of video frames recording events within a scene, the method comprising:
receiving a first frame of the stream, wherein the first frame includes data for a plurality of pixels included in the frame;
identifying one or more groups of pixels in the first frame, wherein each group depicts an object within the scene;
generating a search model storing one or more features associated with each identified object;
classifying each of the objects using a trained classifier;
tracking, in a second frame, each of the objects identified in the first frame using the search model;
supplying the first frame, the second frame, and the object classifications to a machine learning engine; and
generating, by the machine learning engine, one or more semantic representations of behavior engaged in by the objects in the scene over a plurality of frames, wherein the machine learning engine is configured to learn patterns of behavior observed in the scene over the plurality of frames and to identify occurrences of the patterns of behavior engaged in by the classified objects.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A non-transitory computer-readable storage medium containing a program, which, when executed on a processor is configured to perform an operation, comprising:
receiving a first frame of the stream, wherein the first frame includes data for a plurality of pixels included in the frame;
identifying one or more groups of pixels in the first frame, wherein each group depicts an object within the scene;
generating a search model storing one or more features associated with each identified object;
classifying each of the objects using a trained classifier;
tracking, in a second frame, each of the objects identified in the first frame using the search model;
supplying the first frame, the second frame, and the object classifications to a machine learning engine; and
generating, by the machine learning engine, one or more semantic representations of behavior engaged in by the objects in the scene over a plurality of frames, wherein the machine learning engine is configured to learn patterns of behavior observed in the scene over the plurality of frames and to identify occurrences of the patterns of behavior engaged in by the classified objects.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A system, comprising:
a video input source;
a processor; and
a memory storing:
a computer vision engine, wherein the computer vision engine is configured to:
receive, from the video input source, a first frame of the stream, wherein the first frame includes data for a plurality of pixels included in the frame,
identify one or more groups of pixels in the first frame, wherein each group depicts an object within the scene,
generate a search model storing one or more features associated with each identified object,
classify each of the objects using a trained classifier,
track, in a second frame, each of the objects identified in the first frame using the search model, and
supply the first frame, the second frame, and the object classifications to a machine learning engine; and
the machine learning engine, wherein the machine learning engine is configured to generate one or more semantic representations of behavior engaged in by the objects in the scene over a plurality of frames and further configured to learn patterns of behavior observed in the scene over the plurality of frames and to identify occurrences of the patterns of behavior engaged in by the classified objects.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
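The division of labor in the system claim, a video input source feeding a computer vision engine that in turn supplies a machine learning engine, can be sketched as three small components. This shows only the data flow and is not a disclosed implementation: the class names, the `detect`, `classify`, and `track` callables, and the `received` list are hypothetical placeholders, and the learning engine here merely records what it is given.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable


@dataclass
class ComputerVisionEngine:
    # Hypothetical pluggable steps; real detectors/classifiers go here.
    detect: Callable[[Any], list]          # frame -> identified objects
    classify: Callable[[Any], str]         # object -> class label
    track: Callable[[list, Any], list]     # (objects, second frame) -> objects

    def process(self, first_frame: Any, second_frame: Any) -> tuple:
        """Run detection, classification, and tracking on a frame pair."""
        objects = self.detect(first_frame)
        labels = [self.classify(obj) for obj in objects]
        self.track(objects, second_frame)
        return first_frame, second_frame, labels


@dataclass
class MachineLearningEngine:
    received: list = field(default_factory=list)

    def consume(self, first_frame: Any, second_frame: Any, labels: list) -> None:
        """Placeholder: store the classifications supplied by the vision engine."""
        self.received.append(labels)


@dataclass
class BehaviorRecognitionSystem:
    video_source: Iterable
    vision: ComputerVisionEngine
    learning: MachineLearningEngine

    def run(self) -> None:
        """Feed each consecutive pair of frames through both engines."""
        previous = None
        for frame in self.video_source:
            if previous is not None:
                self.learning.consume(*self.vision.process(previous, frame))
            previous = frame


system = BehaviorRecognitionSystem(
    video_source=["frame0", "frame1", "frame2"],
    vision=ComputerVisionEngine(
        detect=lambda frame: [frame + "-object"],
        classify=lambda obj: "person",
        track=lambda objects, frame: objects,
    ),
    learning=MachineLearningEngine(),
)
system.run()
```

Keeping the two engines behind narrow interfaces mirrors the claim's structure: the vision side can be swapped out without touching the learning side, as long as it still supplies frames and classifications.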
Specification