Clustering-based object classification
First Claim
1. A method for identifying objects in video content comprising:
receiving video content of a scene captured by a video camera;
detecting an object in the video content;
identifying a track that the object follows over a series of frames of the video content;
extracting object features for the object from the video content; and
classifying the object based on the object features, wherein classifying the object further comprises:
determining a track-level classification for the object using spatially invariant object features including an aspect ratio and a directional aspect ratio associated with the object by
constructing directional clusters associated with the aspect ratio for the object,
constructing directional clusters associated with the directional aspect ratio for the object,
determining the track-level classification for the object based on the directional clusters associated with the aspect ratio and the directional clusters associated with directional aspect ratio, and
updating a histogram of track-level classification results for the tracked object based on the track-level classification;
determining a global-clustering classification for the object using spatially variant features; and
determining an object type for the object based on the track-level classification and the global-clustering classification for the object.
Abstract
An example of a method for identifying objects in video content according to the disclosure includes receiving video content of a scene captured by a video camera, detecting an object in the video content, identifying a track that the object follows over a series of frames of the video content, extracting object features for the object from the video content, and classifying the object based on the object features. Classifying the object further comprises: determining a track-level classification for the object using spatially invariant object features, determining a global-clustering classification for the object using spatially variant features, and determining an object type for the object based on the track-level classification and the global-clustering classification for the object.
44 Citations
20 Claims
1. A method for identifying objects in video content comprising:
receiving video content of a scene captured by a video camera;
detecting an object in the video content;
identifying a track that the object follows over a series of frames of the video content;
extracting object features for the object from the video content; and
classifying the object based on the object features, wherein classifying the object further comprises:
determining a track-level classification for the object using spatially invariant object features including an aspect ratio and a directional aspect ratio associated with the object by
constructing directional clusters associated with the aspect ratio for the object,
constructing directional clusters associated with the directional aspect ratio for the object,
determining the track-level classification for the object based on the directional clusters associated with the aspect ratio and the directional clusters associated with directional aspect ratio, and
updating a histogram of track-level classification results for the tracked object based on the track-level classification;
determining a global-clustering classification for the object using spatially variant features; and
determining an object type for the object based on the track-level classification and the global-clustering classification for the object.
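
The track-level steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how directional clusters are built or how a label is derived from them, so everything concrete here is an assumption made for illustration: motion direction is quantized into `NUM_BINS` bins, each directional cluster is a running mean per bin, aspect ratio is taken as height over width, directional aspect ratio as extent across the motion over extent along it, and the thresholds `PERSON_MIN_AR` and `VEHICLE_MAX_DAR` are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field

NUM_BINS = 8           # assumption: number of motion-direction bins
PERSON_MIN_AR = 1.5    # hypothetical threshold on aspect ratio (height / width)
VEHICLE_MAX_DAR = 1.0  # hypothetical threshold on directional aspect ratio


def direction_bin(dx, dy, num_bins=NUM_BINS):
    """Quantize a motion vector into one of num_bins direction bins."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    return int(angle / (2 * math.pi / num_bins)) % num_bins


@dataclass
class RunningMean:
    """One directional cluster, reduced here to a running mean."""
    count: int = 0
    mean: float = 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count


@dataclass
class TrackClassifier:
    aspect_clusters: dict = field(default_factory=dict)      # bin -> RunningMean
    dir_aspect_clusters: dict = field(default_factory=dict)  # bin -> RunningMean
    histogram: Counter = field(default_factory=Counter)      # label -> votes

    def observe(self, dx, dy, aspect_ratio, directional_aspect_ratio):
        """Update both sets of directional clusters, then vote in the histogram."""
        b = direction_bin(dx, dy)
        self.aspect_clusters.setdefault(b, RunningMean()).update(aspect_ratio)
        self.dir_aspect_clusters.setdefault(b, RunningMean()).update(
            directional_aspect_ratio)
        label = self._classify()
        self.histogram[label] += 1
        return label

    def _classify(self):
        """Hypothetical rule that uses both sets of directional clusters."""
        ars = [c.mean for c in self.aspect_clusters.values()]
        dars = [c.mean for c in self.dir_aspect_clusters.values()]
        mean_ar = sum(ars) / len(ars)
        mean_dar = sum(dars) / len(dars)
        if mean_ar >= PERSON_MIN_AR and mean_dar > VEHICLE_MAX_DAR:
            return "person"
        if mean_dar <= VEHICLE_MAX_DAR:
            return "vehicle"
        return "unknown"

    def track_label(self):
        """Most frequent per-frame result in the histogram."""
        return self.histogram.most_common(1)[0][0]
```

Keeping a histogram of per-frame results, rather than only the latest result, means a few noisy frames do not flip the label reported for the track.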

3. A method for identifying objects in video content comprising:
receiving video content of a scene captured by a video camera;
detecting an object in the video content;
identifying a track that the object follows over a series of frames of the video content;
extracting object features for the object from the video content; and
classifying the object based on the object features, wherein classifying the object further comprises:
determining a track-level classification for the object using spatially invariant object features;
determining a global-clustering classification for the object using spatially variant features including the size of the object; and
determining an object type for the object based on the track-level classification and the global-clustering classification for the object, wherein determining the global-clustering classification for the object further comprises:
updating local models of object size for locations visited by a persistently tracked object; and
updating global clusters by associating local models with the global clusters, the local models having an object size matching that associated with the global cluster and are visited by the persistently tracked object.
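
One way to picture the local-model and global-cluster updates of claim 3 is the Python sketch below. It is not taken from the specification. The grid cell size `CELL`, the relative size tolerance `TOLERANCE`, the running-mean local model, and the rule for choosing which cluster a track joins are all assumptions, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field

CELL = 32         # assumption: side of a square image cell, in pixels
TOLERANCE = 0.25  # assumption: relative tolerance for "matching" sizes


def cell_of(x, y):
    """Map an image position to the grid cell (location) that contains it."""
    return (int(x) // CELL, int(y) // CELL)


def sizes_match(a, b, tol=TOLERANCE):
    return abs(a - b) <= tol * max(a, b)


@dataclass
class LocalModel:
    """Running mean of object size observed at one location."""
    count: int = 0
    mean_size: float = 0.0

    def update(self, size):
        self.count += 1
        self.mean_size += (size - self.mean_size) / self.count


@dataclass
class GlobalCluster:
    """Local models, keyed by cell, that are taken to describe one object type."""
    models: dict = field(default_factory=dict)


class GlobalClustering:
    def __init__(self):
        self.clusters = []

    def classify(self, x, y, size):
        """Index of the cluster whose local model at (x, y) matches size, else None."""
        cell = cell_of(x, y)
        for idx, cluster in enumerate(self.clusters):
            model = cluster.models.get(cell)
            if model is not None and sizes_match(model.mean_size, size):
                return idx
        return None

    def update_with_track(self, observations):
        """observations: (x, y, size) samples of one persistently tracked object."""
        idx = None
        for x, y, size in observations:
            idx = self.classify(x, y, size)
            if idx is not None:
                break
        if idx is None:  # no existing cluster matches: start a new one
            self.clusters.append(GlobalCluster())
            idx = len(self.clusters) - 1
        cluster = self.clusters[idx]
        for x, y, size in observations:
            cell = cell_of(x, y)
            model = cluster.models.get(cell)
            if model is None:
                # Associate a new local model at a visited location.
                model = cluster.models[cell] = LocalModel()
                model.update(size)
            elif sizes_match(model.mean_size, size):
                model.update(size)
        return idx
```

Because size varies with position in the image, each cluster stores one size model per location, and a track that crosses several locations is what links those models into a single cluster.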

5. A method for identifying objects in video content comprising:
receiving video content of a scene captured by a video camera;
detecting an object in the video content;
identifying a track that the object follows over a series of frames of the video content;
extracting object features for the object from the video content; and
classifying the object based on the object features, wherein classifying the object further comprises:
determining a track-level classification for the object using spatially invariant object features;
determining a global-clustering classification for the object using spatially variant features;
determining whether the object has moved consistently in one direction for at least a predetermined threshold distance;
if the object has moved more than the predetermined threshold distance, determining an object type for the object based on the track-level classification and the global-clustering classification for the object, and
if the object has not moved more than the predetermined threshold distance, determining the object type for the object based on the global-clustering classification and not the track-level classification of the object.
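
The branching recited in claim 5 can be sketched in Python as follows. The claim fixes neither a test for consistent movement nor a way to combine the two classifications, so both are assumptions here: movement counts as consistent when the net displacement reaches the threshold and no step deviates more than `max_turn_deg` from the net heading, and the hypothetical `combine` helper returns a label only when both classifications agree.

```python
import math


def moved_consistently(points, threshold, max_turn_deg=45.0):
    """Assumed test: net displacement >= threshold, every step near the net heading."""
    if len(points) < 2:
        return False
    (x0, y0), (x1, y1) = points[0], points[-1]
    net = math.hypot(x1 - x0, y1 - y0)
    if net == 0 or net < threshold:
        return False
    ux, uy = (x1 - x0) / net, (y1 - y0) / net  # unit vector of the net heading
    cos_limit = math.cos(math.radians(max_turn_deg))
    for (ax, ay), (bx, by) in zip(points, points[1:]):
        step = math.hypot(bx - ax, by - ay)
        if step == 0:
            continue  # a stationary frame says nothing about direction
        if ((bx - ax) * ux + (by - ay) * uy) / step < cos_limit:
            return False
    return True


def combine(track_label, global_label):
    """Hypothetical fusion rule: accept a label only when both sources agree."""
    return track_label if track_label == global_label else "unknown"


def decide_object_type(track_label, global_label, points, threshold):
    """Use both classifications only after consistent movement past the threshold."""
    if moved_consistently(points, threshold):
        return combine(track_label, global_label)
    return global_label  # track-level result is not used
```

For example, `decide_object_type("person", "vehicle", [(0, 0), (10, 0), (20, 1)], 50)` falls back to the global-clustering label because the object has moved only about 20 pixels.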

6. A surveillance system comprising a server configured to identify objects in video content captured by a video camera, the system comprising:
means for receiving video content of a scene captured by a video camera;
means for detecting an object in the video content;
means for identifying a track that the object follows over a series of frames of the video content;
means for extracting object features for the object from the video content; and
means for classifying the object based on the object features, wherein classifying the object further comprises:
means for determining a track-level classification for the object using spatially invariant object features including an aspect ratio and a directional aspect ratio associated with the object, the means for determining the track-level classification comprising
means for constructing directional clusters associated with the aspect ratio for the object,
means for constructing directional clusters associated with the directional aspect ratio for the object,
means for determining the track-level classification for the object based on the directional clusters associated with the aspect ratio and the directional clusters associated with directional aspect ratio, and
means for updating a histogram of track-level classification results for the tracked object based on the track-level classification;
means for determining a global-clustering classification for the object using spatially variant features; and
means for determining an object type for the object based on the track-level classification and the global-clustering classification for the object.

8. A surveillance system comprising a server configured to identify objects in video content captured by a video camera, the system comprising:
means for receiving video content of a scene captured by a video camera;
means for detecting an object in the video content;
means for identifying a track that the object follows over a series of frames of the video content;
means for extracting object features for the object from the video content; and
means for classifying the object based on the object features, wherein classifying the object further comprises:
means for determining a track-level classification for the object using spatially invariant object features;
means for determining a global-clustering classification for the object using spatially variant features; and
means for determining an object type for the object based on the track-level classification and the global-clustering classification for the object, wherein the means for determining the global-clustering classification for the object further comprises:
means for updating local models of object size for locations visited by a persistently tracked object;
means for updating global clusters by associating local models with the global clusters, the local models having an object size matching that associated with the global cluster and are visited by the persistently tracked object.

10. A surveillance system comprising a server configured to identify objects in video content captured by a video camera, the system comprising:
means for receiving video content of a scene captured by a video camera;
means for detecting an object in the video content;
means for identifying a track that the object follows over a series of frames of the video content;
means for extracting object features for the object from the video content; and
means for classifying the object based on the object features, wherein classifying the object further comprises
means for determining a track-level classification for the object using spatially invariant object features,
means for determining a global-clustering classification for the object using spatially variant features,
means for determining whether the object has moved consistently in one direction for at least a predetermined threshold distance;
means for determining an object type for the object based on the track-level classification and the global-clustering classification for the object if the object has moved more than the predetermined threshold distance; and
means for determining the object type for the object based on the global-clustering classification and not the track-level classification of the object if the object has not moved more than the predetermined threshold distance.

11. A surveillance system for identifying objects in video content captured by a video camera, the system comprising:
a non-transitory computer-readable memory;
a plurality of modules comprising processor executable code stored in the memory;
a processor connected to the memory and configured to access the plurality of modules stored in the memory; and
a video processing module configured to:
receive video content of a scene captured by a video camera;
detect an object in the video content;
identify a track that the object follows over a series of frames of the video content;
extract object features for the object from the video content; and
classify the object based on the object features, wherein to classify the object the video processing module is further configured to:
determine a track-level classification for the object using spatially invariant object features including an aspect ratio and a directional aspect ratio associated with the object, the video processing module being configured to
construct directional clusters associated with the aspect ratio for the object,
construct directional clusters associated with the directional aspect ratio for the object,
determine the track-level classification for the object based on the directional clusters associated with the aspect ratio and the directional clusters associated with directional aspect ratio, and
update a histogram of track-level classification results for the tracked object based on the track-level classification;
determine a global-clustering classification for the object using spatially variant features; and
determine an object type for the object based on the track-level classification and the global-clustering classification for the object.

13. A surveillance system for identifying objects in video content captured by a video camera, the system comprising:
a non-transitory computer-readable memory;
a plurality of modules comprising processor executable code stored in the memory;
a processor connected to the memory and configured to access the plurality of modules stored in the memory; and
a video processing module configured to:
receive video content of a scene captured by a video camera;
detect an object in the video content;
identify a track that the object follows over a series of frames of the video content;
extract object features for the object from the video content; and
classify the object based on the object features, wherein to classify the object the video processing module is further configured to:
determine a track-level classification for the object using spatially invariant object features;
determine a global-clustering classification for the object using spatially variant features; and
determine an object type for the object based on the track-level classification and the global-clustering classification for the object, wherein the video processing module being configured to determine the global-clustering classification for the object is further configured to:
update local models of object size for locations visited by a persistently tracked object;
update global clusters by associating local models with the global clusters, the local models having an object size matching that associated with the global cluster and are visited by the persistently tracked object.

15. A surveillance system for identifying objects in video content captured by a video camera, the system comprising:
a non-transitory computer-readable memory;
a plurality of modules comprising processor executable code stored in the memory;
a processor connected to the memory and configured to access the plurality of modules stored in the memory; and
a video processing module configured to:
receive video content of a scene captured by a video camera;
detect an object in the video content;
identify a track that the object follows over a series of frames of the video content;
extract object features for the object from the video content; and
classify the object based on the object features, wherein to classify the object the video processing module is further configured to:
determine a track-level classification for the object using spatially invariant object features;
determine a global-clustering classification for the object using spatially variant features;
determine whether the object has moved consistently in one direction for at least a predetermined threshold distance;
determine an object type for the object based on the track-level classification and the global-clustering classification for the object if the object has moved more than the predetermined threshold distance; and
determine the object type for the object based on the global-clustering classification and not the track-level classification of the object if the object has not moved more than the predetermined threshold distance.

16. A non-transitory computer-readable medium, having stored thereon computer-readable instructions identifying objects in video content, comprising instructions configured to cause a computer to:
receive video content of a scene captured by a video camera;
detect an object in the video content;
identify a track that the object follows over a series of frames of the video content;
extract object features for the object from the video content; and
classify the object based on the object features, wherein the instruction to cause the computer to classify the object further comprise instructions to cause the computer to:
determine a track-level classification for the object using spatially invariant object features, the code to cause the computer to determine the track-level classification for the object further comprises code to cause the computer to
construct directional clusters associated with the aspect ratio for the tracked object,
construct directional clusters associated with directional aspect ratio for the tracked object,
determine the track-level classification for the object based on the directional clusters associated with the aspect ratio and the directional clusters associated with directional aspect ratio, and
update a histogram of track-level classification results for the tracked object based on the track-level classification;
determine a global-clustering classification for the object using spatially variant features; and
determine an object type for the object based on the track-level classification and the global-clustering classification for the object.

18. A non-transitory computer-readable medium, having stored thereon computer-readable instructions identifying objects in video content, comprising instructions configured to cause a computer to:
receive video content of a scene captured by a video camera;
detect an object in the video content;
identify a track that the object follows over a series of frames of the video content;
extract object features for the object from the video content; and
classify the object based on the object features, wherein the instruction to cause the computer to classify the object further comprise instructions to cause the computer to:
determine a track-level classification for the object using spatially invariant object features;
determine a global-clustering classification for the object using spatially variant features; and
determine an object type for the object based on the track-level classification and the global-clustering classification for the object, wherein the code to cause the computer to determine the global-clustering classification for the object further comprises code to cause the computer to:
update local models of object size for locations visited by a persistently tracked object; and
update global clusters by associating local models with the global clusters, the local models having an object size matching that associated with the global cluster and are visited by the persistently tracked object.

20. A non-transitory computer-readable medium, having stored thereon computer-readable instructions identifying objects in video content, comprising instructions configured to cause a computer to:
receive video content of a scene captured by a video camera;
detect an object in the video content;
identify a track that the object follows over a series of frames of the video content;
extract object features for the object from the video content; and
classify the object based on the object features, wherein the instruction to cause the computer to classify the object further comprise instructions to cause the computer to:
determine a track-level classification for the object using spatially invariant object features;
determine a global-clustering classification for the object using spatially variant features;
determine whether the object has moved consistently in one direction for at least a predetermined threshold distance;
determine an object type for the object based on the track-level classification and the global-clustering classification for the object; and
determine the object type for the object based on the global-clustering classification and not the track-level classification of the object if the object has not moved more than the predetermined threshold distance.
Specification