Real-time object analysis with occlusion handling
First Claim
1. A method, comprising the steps of:
- receiving a video sequence comprising detection results from one or more detectors, the detection results identifying one or more objects;
- applying a clustering process to the detection results to identify one or more clusters associated with the one or more objects, wherein the clustering process is applied to the video sequence on a frame-by-frame basis, and wherein applying the clustering process comprises:
  - detecting one or more non-associated detections, wherein the one or more non-associated detections have not been assigned to an existing cluster;
  - applying a non-maximum suppression process to the one or more non-associated detections to generate one or more results;
  - evaluating the one or more results with a confidence score for the one or more non-associated detections, wherein the confidence score comprises a depth-height map and a detection frequency; and
  - instantiating a new cluster or eliminating one or more existing clusters based on the confidence score;
- determining spatial and temporal information for each of the one or more clusters;
- associating the one or more clusters to the detection results based on the spatial and temporal information in consecutive frames of the video sequence to generate tracking information;
- generating one or more target tracks based on the tracking information for the one or more clusters; and
- consolidating the one or more target tracks to generate refined tracks for the one or more objects;
wherein the steps are performed by at least one processor device coupled to a memory.
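The clustering sub-steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim says only that the confidence score comprises a depth-height map and a detection frequency, so the greedy IoU-based suppression, the way the two terms are multiplied, and every name and threshold here (`Detection`, `expected_height_at`, `hits_for`, `min_conf`, `window`) are assumptions made for illustration. Only the instantiation branch is shown; eliminating existing clusters is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    x: float      # left edge of the bounding box
    y: float      # top edge of the bounding box
    w: float      # box width
    h: float      # box height
    score: float  # raw detector score


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(dets, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    kept = []
    for d in sorted(dets, key=lambda d: d.score, reverse=True):
        if all(iou(d, k) <= iou_thresh for k in kept):
            kept.append(d)
    return kept


def confidence(det, expected_height, hits, window):
    """Assumed scoring rule: height plausibility times detection frequency.

    expected_height stands in for a lookup in a depth-height map and must
    be positive; hits is how often the detection recurred in the last
    `window` frames.
    """
    height_term = max(0.0, 1.0 - abs(det.h - expected_height) / expected_height)
    freq_term = hits / window
    return height_term * freq_term


def new_clusters(unassociated, expected_height_at, hits_for,
                 window=10, min_conf=0.5):
    """Return the non-associated detections that would seed new clusters."""
    seeds = []
    for d in non_max_suppression(unassociated):
        c = confidence(d, expected_height_at(d), hits_for(d), window)
        if c >= min_conf:
            seeds.append(d)
    return seeds
```

In this sketch a detection seeds a new cluster only if it survives suppression and its assumed confidence reaches `min_conf`; a real system would obtain `expected_height_at` from scene calibration rather than from a constant.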
Abstract
A method includes the following steps. A video sequence including detection results from one or more detectors is received, the detection results identifying one or more objects. A clustering framework is applied to the detection results to identify one or more clusters associated with the one or more objects. The clustering framework is applied to the video sequence on a frame-by-frame basis. Spatial and temporal information for each of the one or more clusters is determined. The one or more clusters are associated to the detection results based on the spatial and temporal information in consecutive frames of the video sequence to generate tracking information. One or more target tracks are generated based on the tracking information for the one or more clusters. The one or more target tracks are consolidated to generate refined tracks for the one or more objects.
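The association step described in the abstract can be sketched as follows. The abstract does not say how spatial and temporal information is compared, so this example assumes a greedy nearest-centroid match with a distance gate (`max_dist`) and a limit on how many frames a cluster may go unseen (`max_gap`). The `Cluster` type and all parameter names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    cluster_id: int
    cx: float        # last known centroid x (spatial information)
    cy: float        # last known centroid y
    last_frame: int  # frame of the last match (temporal information)
    track: list = field(default_factory=list)  # (frame, cx, cy) history


def associate(clusters, detections, frame, max_dist=50.0, max_gap=5):
    """Match clusters to detection centroids for one frame.

    detections is a list of (cx, cy) tuples. Matched clusters are updated
    in place and their track history is extended. Returns the indices of
    detections left without a cluster.
    """
    candidates = []
    for ci, c in enumerate(clusters):
        if frame - c.last_frame > max_gap:
            continue  # cluster unseen for too long to be matched
        for di, (dx, dy) in enumerate(detections):
            dist = math.hypot(dx - c.cx, dy - c.cy)
            if dist <= max_dist:
                candidates.append((dist, ci, di))

    used_clusters, used_dets = set(), set()
    for dist, ci, di in sorted(candidates):  # closest pairs first
        if ci in used_clusters or di in used_dets:
            continue
        used_clusters.add(ci)
        used_dets.add(di)
        c = clusters[ci]
        c.cx, c.cy = detections[di]
        c.last_frame = frame
        c.track.append((frame, c.cx, c.cy))

    return [di for di in range(len(detections)) if di not in used_dets]
```

The returned indices correspond to the non-associated detections that the clustering process would go on to evaluate, and each cluster's `track` list plays the role of the tracking information from which target tracks are generated.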
20 Claims
1. A method, comprising the steps of:
- receiving a video sequence comprising detection results from one or more detectors, the detection results identifying one or more objects;
- applying a clustering process to the detection results to identify one or more clusters associated with the one or more objects, wherein the clustering process is applied to the video sequence on a frame-by-frame basis, and wherein applying the clustering process comprises:
  - detecting one or more non-associated detections, wherein the one or more non-associated detections have not been assigned to an existing cluster;
  - applying a non-maximum suppression process to the one or more non-associated detections to generate one or more results;
  - evaluating the one or more results with a confidence score for the one or more non-associated detections, wherein the confidence score comprises a depth-height map and a detection frequency; and
  - instantiating a new cluster or eliminating one or more existing clusters based on the confidence score;
- determining spatial and temporal information for each of the one or more clusters;
- associating the one or more clusters to the detection results based on the spatial and temporal information in consecutive frames of the video sequence to generate tracking information;
- generating one or more target tracks based on the tracking information for the one or more clusters; and
- consolidating the one or more target tracks to generate refined tracks for the one or more objects;
wherein the steps are performed by at least one processor device coupled to a memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.
15. An apparatus, comprising:
- a memory; and
- a processor operatively coupled to the memory and configured to:
  - receive a video sequence comprising detection results from one or more detectors, the detection results identifying one or more objects;
  - apply a clustering process to the detection results to identify one or more clusters associated with the one or more objects, wherein the clustering process is applied to the video sequence on a frame-by-frame basis, and wherein the application of the clustering process comprises:
    - a detection of one or more non-associated detections, wherein the one or more non-associated detections have not been assigned to an existing cluster;
    - an application of a non-maximum suppression process to the one or more non-associated detections to generate one or more results;
    - an evaluation of the one or more results with a confidence score for the one or more non-associated detections, wherein the confidence score comprises a depth-height map and a detection frequency; and
    - instantiate a new cluster or eliminate one or more existing clusters based on the confidence score;
  - determine spatial and temporal information for each of the one or more clusters;
  - associate the one or more clusters to the detection results based on the spatial and temporal information in consecutive frames of the video sequence to generate tracking information;
  - generate one or more target tracks based on the tracking information for the one or more clusters; and
  - consolidate the one or more target tracks to generate refined tracks for the one or more objects.
Dependent claims: 16, 17, 18, 19.
20. An article of manufacture comprising a computer readable storage medium for storing computer readable program code which, when executed, causes a computer to:
- receive a video sequence comprising detection results from one or more detectors, the detection results identifying one or more objects;
- apply a clustering process to the detection results to identify one or more clusters associated with the one or more objects, wherein the clustering process is applied to the video sequence on a frame-by-frame basis, and wherein the application of the clustering process comprises:
  - a detection of one or more non-associated detections, wherein the one or more non-associated detections have not been assigned to an existing cluster;
  - an application of a non-maximum suppression process to the one or more non-associated detections to generate one or more results;
  - an evaluation of the one or more results with a confidence score for the one or more non-associated detections, wherein the confidence score comprises a depth-height map and a detection frequency; and
  - instantiate a new cluster or eliminate one or more existing clusters based on the confidence score; and
  - eliminate one or more existing clusters;
- determine spatial and temporal information for each of the one or more clusters;
- associate the one or more clusters to the detection results based on the spatial and temporal information in consecutive frames of the video sequence to generate tracking information;
- generate one or more target tracks based on the tracking information for the one or more clusters; and
- consolidate the one or more target tracks to generate refined tracks for the one or more objects.
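The final step in each independent claim, consolidating target tracks into refined tracks, is not spelled out in the claims. One plausible reading, sketched below in Python, is to join track fragments that were split when an object was briefly occluded: a fragment is appended to an earlier one if it starts shortly after that one ends and close to where it ended. The merge rule, the `(frame, x, y)` track layout, and the `max_gap` and `max_dist` thresholds are all assumptions for illustration.

```python
import math


def consolidate(tracks, max_gap=10, max_dist=30.0):
    """Merge track fragments into refined tracks.

    tracks is a list of lists of (frame, x, y) tuples, each sorted by
    frame. A fragment is joined to an earlier refined track when it
    begins within max_gap frames after that track ends and within
    max_dist of its last position.
    """
    # Process fragments in order of their first frame.
    fragments = sorted((list(t) for t in tracks if t), key=lambda t: t[0][0])
    refined = []
    for frag in fragments:
        start_f, start_x, start_y = frag[0]
        for r in refined:
            end_f, end_x, end_y = r[-1]
            gap = start_f - end_f
            close = math.hypot(start_x - end_x, start_y - end_y) <= max_dist
            if 0 < gap <= max_gap and close:
                r.extend(frag)  # same object, track resumed after a gap
                break
        else:
            refined.append(frag)  # no compatible track: start a new one
    return refined
```

For example, two fragments that end at frame 1 and resume at frame 4 a few pixels away would be merged into one refined track, while a fragment on the far side of the image stays separate.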
Specification