MULTI-VIEW OBJECT DETECTION USING APPEARANCE MODEL TRANSFER FROM SIMILAR SCENES
First Claim
1. A method for learning a plurality of view-specific object detectors as a function of scene geometry and object motion patterns, the method comprising:
determining via a processing unit motion directions for each of a plurality of object images that are extracted from a source training video dataset input and that each have size and motion dimension values that meet an expected criterion of an object of interest, wherein the object images are collected from each of a plurality of different camera scene viewpoints;
categorizing via the processing unit the plurality of object images into a plurality of clusters as a function of similarities of their determined motion directions, wherein the object images in each of the clusters are also acquired from one of the different camera scene viewpoints;
estimating via the processing unit zenith angles for poses of the object images in each of the clusters relative to a position of a horizon in the camera scene viewpoint from which the clustered object images are acquired, and azimuth angles of the poses as a function of a relation of the determined motion directions of the clustered object images to the camera scene viewpoint from which the clustered object images are acquired; and
building via the processing unit a plurality of detectors for recognizing objects in input video, one for each of the clusters of the object images, and associating each of the built detectors with the estimated zenith angles and azimuth angles of the poses of the cluster for which the detectors are built.
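The first step of the claim can be sketched as follows. This is only an illustration under stated assumptions: the `Track` structure, the use of net centroid displacement as the motion direction, and the area and travel thresholds in `meets_criterion` are all hypothetical and do not come from the patent text, which does not specify how the directions or the "expected criterion" are computed.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """A tracked foreground blob (hypothetical structure, not from the patent)."""
    viewpoint: str    # identifier of the camera scene viewpoint
    centroids: list   # [(x, y), ...] image positions over time
    width: float      # bounding-box width in pixels
    height: float     # bounding-box height in pixels


def motion_direction(track):
    """Direction of net displacement in degrees, in [0, 360), image coordinates."""
    (x0, y0), (x1, y1) = track.centroids[0], track.centroids[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0


def meets_criterion(track, min_area=400.0, max_area=40000.0, min_travel=20.0):
    """Assumed size and motion test; the threshold values are placeholders."""
    area = track.width * track.height
    (x0, y0), (x1, y1) = track.centroids[0], track.centroids[-1]
    travel = math.hypot(x1 - x0, y1 - y0)
    return min_area <= area <= max_area and travel >= min_travel


# Keep only tracks that look like the object of interest, then record
# the viewpoint and motion direction of each.
tracks = [
    Track("camA", [(0, 0), (30, 30)], 40, 30),
    Track("camA", [(5, 5), (6, 5)], 40, 30),  # barely moves, so it is rejected
]
directions = [(t.viewpoint, motion_direction(t)) for t in tracks if meets_criterion(t)]
```

Here image coordinates are assumed to have the y axis pointing down, so an angle of 90 degrees means motion toward the bottom of the frame.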
Abstract
View-specific object detectors are learned as a function of scene geometry and object motion patterns. Motion directions are determined for object images extracted from a training dataset and collected from different camera scene viewpoints. The object images are categorized into clusters as a function of similarities of their determined motion directions, with the object images in each cluster acquired from the same camera scene viewpoint. Zenith angles are estimated for object image poses in the clusters relative to a position of a horizon in the cluster camera scene viewpoint, and azimuth angles of the poses are estimated as a function of a relation of the determined motion directions of the clustered images to the cluster camera scene viewpoint. Detectors are thus built for recognizing objects in input video, one for each of the clusters, and associated with the estimated zenith angles and azimuth angles of the poses of the respective clusters.
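The clustering step described in the abstract might look like the sketch below. It assumes a simple fixed-width angular binning within each viewpoint as the similarity measure; the abstract does not name a clustering technique, so the binning, the `bin_width_deg` parameter, and the `(viewpoint, direction, image_id)` tuple layout are illustrative assumptions only.

```python
from collections import defaultdict


def cluster_by_direction(samples, bin_width_deg=45.0):
    """Group object images by camera viewpoint and quantized motion direction.

    samples: iterable of (viewpoint, direction_deg, image_id) tuples.
    Returns a dict mapping (viewpoint, bin_index) to a list of image ids.
    """
    n_bins = int(round(360.0 / bin_width_deg))
    clusters = defaultdict(list)
    for viewpoint, direction, image_id in samples:
        # Center bins on multiples of bin_width_deg so that directions on
        # either side of 0 degrees (e.g. 359 and 1) share the same bin.
        shifted = (direction % 360.0) + bin_width_deg / 2.0
        idx = int(shifted // bin_width_deg) % n_bins
        clusters[(viewpoint, idx)].append(image_id)
    return dict(clusters)


samples = [
    ("camA", 359.0, "a"),
    ("camA", 1.0, "b"),
    ("camA", 92.0, "c"),
    ("camB", 1.0, "d"),
]
clusters = cluster_by_direction(samples)
```

Because the viewpoint is part of the cluster key, every cluster contains images from a single camera scene viewpoint, which matches the constraint stated in the abstract. One detector would then be trained per key.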
25 Claims
1. A method for learning a plurality of view-specific object detectors as a function of scene geometry and object motion patterns, the method comprising:
determining via a processing unit motion directions for each of a plurality of object images that are extracted from a source training video dataset input and that each have size and motion dimension values that meet an expected criterion of an object of interest, wherein the object images are collected from each of a plurality of different camera scene viewpoints;
categorizing via the processing unit the plurality of object images into a plurality of clusters as a function of similarities of their determined motion directions, wherein the object images in each of the clusters are also acquired from one of the different camera scene viewpoints;
estimating via the processing unit zenith angles for poses of the object images in each of the clusters relative to a position of a horizon in the camera scene viewpoint from which the clustered object images are acquired, and azimuth angles of the poses as a function of a relation of the determined motion directions of the clustered object images to the camera scene viewpoint from which the clustered object images are acquired; and
building via the processing unit a plurality of detectors for recognizing objects in input video, one for each of the clusters of the object images, and associating each of the built detectors with the estimated zenith angles and azimuth angles of the poses of the cluster for which the detectors are built.
Dependent claims: 2, 3, 4, 5, 6, 7, 14
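The pose-estimation step of claim 1 can be illustrated with the rough sketch below. It is an assumption-heavy example, not the claimed method: it assumes a pinhole camera with no roll, a known focal length in pixels (`focal_px`), and a known principal-point row (`principal_row`), none of which appear in the claim. The zenith angle is taken as 90 degrees minus the angle by which the ray to the object dips below the ray to the horizon, and the circular mean of a cluster's image-plane motion directions is used only as a stand-in input from which an azimuth angle could be derived.

```python
import math


def zenith_angle_deg(object_row, horizon_row, principal_row, focal_px):
    """Assumed pinhole model: 90 degrees at the horizon, smaller further below it."""
    # Angle of each viewing ray below the optical axis.
    ray_obj = math.atan2(object_row - principal_row, focal_px)
    ray_hor = math.atan2(horizon_row - principal_row, focal_px)
    depression = math.degrees(ray_obj - ray_hor)
    return 90.0 - depression


def circular_mean_deg(directions_deg):
    """Mean of angles in degrees that respects wrap-around at 360."""
    s = sum(math.sin(math.radians(d)) for d in directions_deg)
    c = sum(math.cos(math.radians(d)) for d in directions_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


# Hypothetical cluster: objects seen at image row 740, horizon at row 240.
zenith = zenith_angle_deg(object_row=740, horizon_row=240,
                          principal_row=240, focal_px=500)
mean_direction = circular_mean_deg([80.0, 100.0])

# Each built detector would be stored together with its pose angles.
detector_record = {"zenith_deg": zenith, "mean_direction_deg": mean_direction}
```

A circular mean is used instead of an arithmetic mean so that directions such as 350 and 10 degrees average to a value near 0 rather than 180.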
8. A method of providing a service for learning a plurality of view-specific object detectors as a function of scene geometry and object motion patterns, the method comprising providing:
a motion direction determiner that determines motion directions for each of a plurality of object images that are extracted from a source training video dataset input and that each have size and motion dimension values that meet an expected criterion of an object of interest, wherein the object images are collected from each of a plurality of different camera scene viewpoints;
an object classifier that categorizes the plurality of object images into a plurality of clusters as a function of similarities of their determined motion directions, wherein the object images in each of the clusters are also acquired from one of the different camera scene viewpoints;
a pose parameterizer that estimates zenith angles for poses of the object images in each of the clusters relative to a position of a horizon in the camera scene viewpoint from which the clustered object images are acquired, and azimuth angles of the poses as a function of a relation of the determined motion directions of the clustered object images to the camera scene viewpoint from which the clustered object images are acquired; and
an object detector modeler that builds a plurality of detectors for recognizing objects, one for each of the clusters of the object images, and associates each of the built detectors with the estimated zenith angles and azimuth angles of the poses of the cluster for which the detectors are built.
Dependent claims: 9, 10, 11, 12, 13
15. A system, comprising:
a processing unit, a computer readable memory and a computer-readable storage medium;
wherein the processing unit, when executing program instructions stored on the computer-readable storage medium via the computer readable memory:
determines motion directions for each of a plurality of object images that are extracted from a source training video dataset input and that each have size and motion dimension values that meet an expected criterion of an object of interest, wherein the object images are collected from each of a plurality of different camera scene viewpoints;
categorizes the plurality of object images into a plurality of clusters as a function of similarities of their determined motion directions, wherein the object images in each of the clusters are also acquired from one of the different camera scene viewpoints;
estimates zenith angles for poses of the object images in each of the clusters relative to the position of the horizon in the camera scene viewpoint from which the clustered object images are acquired, and azimuth angles of the poses as a function of the determined motion directions of the clustered object images; and
builds a plurality of detectors for recognizing objects in input video, one for each of the clusters of the object images, and associates each of the built detectors with the estimated zenith angles and azimuth angles of the poses of the cluster for which the detectors are built.
Dependent claims: 16, 17, 18, 19
20. An article of manufacture, comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising instructions that, when executed by a computer processor, cause the computer processor to:
determine motion directions for each of a plurality of object images that are extracted from a source training video dataset input and that each have size and motion dimension values that meet an expected criterion of an object of interest, wherein the object images are collected from each of a plurality of different camera scene viewpoints;
categorize the plurality of object images into a plurality of clusters as a function of similarities of their determined motion directions, wherein the object images in each of the clusters are also acquired from one of the different camera scene viewpoints;
estimate zenith angles for poses of the object images in each of the clusters relative to the position of a horizon in the camera scene viewpoint from which the clustered object images are acquired, and azimuth angles of the poses as a function of the determined motion directions of the clustered object images to the camera scene viewpoint from which the clustered object images are acquired; and
build a plurality of detectors for recognizing objects in input video, one for each of the clusters of the object images, and associate each of the built detectors with the estimated zenith angles and azimuth angles of the poses of the cluster for which the detectors are built.
Dependent claims: 21, 22, 23, 24, 25
Specification