Object instance identification using three-dimensional spatial configuration
Abstract
A system for identifying specific instances of objects in a three-dimensional (3D) scene, comprising: a camera for capturing an image of multiple objects at a site; at least one processor executable to: use a location and orientation of the camera to create a 3D model of the site including multiple instances of objects expected to be in proximity to the camera, and generate multiple candidate clusters each representing a different projection of the 3D model, detect at least two objects in the image, and determine a spatial configuration for each detected object; and match the detected image objects to one of the multiple candidate clusters using the spatial configurations, associate the detected objects with the expected object instances of the matched cluster, and retrieve information of one of the detected objects that is stored with the associated expected object instance; and a head-wearable display configured to display the information.
16 Claims
1. A system for identifying object instances in a three-dimensional (3D) scene, comprising:
a camera configured to capture an image of multiple objects at a site;
at least one hardware processor; and
a non-transitory memory device having embodied thereon program code executable by said at least one hardware processor to:
receive, from said camera, a captured image that depicts multiple objects that are physically present at the site,
detect at least two objects in the image,
retrieve 3D information of the site, wherein the 3D information comprises location and orientation of objects that have been previously determined to be located at the site,
generate, based on the 3D information of the site, multiple candidate clusters of objects that have been previously determined to be located at the site and are of the same type as the detected objects, wherein each of the candidate clusters represents a different relative spatial configuration between the objects in the respective candidate cluster,
determine a spatial configuration of the objects detected in the image, with respect to each other and to said camera,
match the objects detected in the image to one of the multiple candidate clusters, by:
(a) calculating a 3D transform error between the spatial configuration of (i) the objects detected in the image and (ii) the objects on the respective candidate cluster, and
(b) selecting a candidate cluster with a minimal 3D transform error as a most probable cluster,
associate the objects detected in the image with the objects of the most probable cluster, and
retrieve information of at least one of the objects of the most probable cluster.
Dependent claims: 2, 3, 4, 5, 6
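As an illustration only, the candidate-cluster generation step recited in claim 1 might be sketched as follows. The claim does not prescribe a data structure or an enumeration strategy, so the `SiteObject` record, the `candidate_clusters` function, and the exhaustive enumeration are all hypothetical choices made for this sketch. Each candidate cluster is an ordered selection of distinct, previously recorded site objects whose types match the types of the detected objects.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SiteObject:
    """Hypothetical record of an object previously located at the site."""
    object_id: str
    object_type: str
    position: tuple[float, float, float]


def candidate_clusters(site_objects, detected_types):
    """Enumerate ordered selections of distinct site objects, one per
    detected object, where each selected object has the detected type."""
    by_type = {}
    for obj in site_objects:
        by_type.setdefault(obj.object_type, []).append(obj)

    # One pool of same-type site objects per detected object.
    pools = [by_type.get(t, []) for t in detected_types]

    clusters = []
    for combo in product(*pools):
        # A physical object can fill only one slot in a cluster.
        if len({o.object_id for o in combo}) == len(combo):
            clusters.append(combo)
    return clusters


site = [
    SiteObject("v1", "valve", (0.0, 0.0, 0.0)),
    SiteObject("v2", "valve", (4.0, 0.0, 0.0)),
    SiteObject("p1", "pump", (1.0, 2.0, 0.0)),
]
clusters = candidate_clusters(site, ["valve", "pump"])
```

Exhaustive enumeration grows quickly with the number of objects; a practical system would presumably restrict the pools, for example to objects expected to be near the camera, as the abstract suggests.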
7. A method for identifying multiple object instances, comprising:
capturing, by a camera, an image depicting multiple objects that are physically present at a site;
detecting at least two objects in the image;
retrieving three-dimensional (3D) information of the site, wherein the 3D information comprises location and orientation of objects that have been previously determined to be located at the site;
generating, based on the 3D information of the site, multiple candidate clusters of objects that have been previously determined to be located at the site and are of the same type as the detected objects, wherein each of the candidate clusters represents a different relative spatial configuration between the objects in the respective candidate cluster;
determining a spatial configuration of the objects detected in the image, with respect to each other and to said camera;
matching the objects detected in the image to one of the multiple candidate clusters, by:
(a) calculating a 3D transform error between the spatial configuration of (i) the objects detected in the image and (ii) the objects in the respective candidate cluster, and
(b) selecting a candidate cluster with a minimal 3D transform error as a most probable cluster;
associating the objects detected in the image with the objects of the most probable cluster; and
retrieving information of at least one of the objects of the most probable cluster.
Dependent claims: 8, 9, 13, 14
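The matching steps (a) and (b) of claim 7 can be illustrated with a simplified sketch. The claim does not define how the 3D transform error is computed, so everything here is an assumption: the sketch fits only a translation plus a rotation about the vertical (z) axis, which has a closed-form least-squares solution, and uses the root-mean-square residual as the error. A full 3D rotation fit would typically use an SVD-based method instead. The names `transform_error` and `most_probable_cluster` are hypothetical, and detected and candidate points are assumed to be listed in corresponding order.

```python
import math


def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def transform_error(detected, candidate):
    """RMS residual after the best yaw-plus-translation fit of the
    detected points onto the candidate points (simplifying assumption)."""
    ca, cb = centroid(detected), centroid(candidate)
    # Centering both sets removes the translation component.
    a = [tuple(p[i] - ca[i] for i in range(3)) for p in detected]
    b = [tuple(q[i] - cb[i] for i in range(3)) for q in candidate]

    # Closed-form optimal rotation angle about the z axis.
    c = sum(p[0] * q[0] + p[1] * q[1] for p, q in zip(a, b))
    s = sum(p[0] * q[1] - p[1] * q[0] for p, q in zip(a, b))
    theta = math.atan2(s, c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    squared = 0.0
    for p, q in zip(a, b):
        rx = cos_t * p[0] - sin_t * p[1]
        ry = sin_t * p[0] + cos_t * p[1]
        squared += (rx - q[0]) ** 2 + (ry - q[1]) ** 2 + (p[2] - q[2]) ** 2
    return math.sqrt(squared / len(a))


def most_probable_cluster(detected, candidates):
    """Return the key of the candidate cluster with the minimal error."""
    return min(candidates, key=lambda k: transform_error(detected, candidates[k]))


# Hypothetical candidate clusters, as lists of 3D positions.
candidates = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
    "B": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 1.0, 0.0)],
}
# Cluster A rotated 90 degrees about z and shifted by (5, 5, 1).
detected = [(5.0, 5.0, 1.0), (5.0, 6.0, 1.0), (3.0, 6.0, 1.0)]
best = most_probable_cluster(detected, candidates)
```

In this example the detected configuration is a rigidly moved copy of cluster A, so its error against A is numerically zero and A is selected.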
10. A computer program product comprising a non-transitory computer-readable storage medium having program code embodied thereon, the program code executable by at least one hardware processor to:
receive, from a camera, a captured image that depicts multiple objects that are physically present at a site;
detect at least two objects in the image;
retrieve 3D information of the site, wherein the 3D information comprises location and orientation of objects that have been previously determined to be located at the site;
generate, based on the 3D information of the site, multiple candidate clusters of objects that have been previously determined to be located at the site and are of the same type as the detected objects, wherein each of the candidate clusters represents a different relative spatial configuration between the objects in the respective candidate cluster;
determine a spatial configuration of the objects detected in the image, with respect to each other and to the camera;
match the objects detected in the image to one of the multiple candidate clusters, by:
(a) calculating a 3D transform error between the spatial configuration of (i) the objects detected in the image and (ii) the objects in the respective candidate cluster, and
(b) selecting a candidate cluster with a minimal 3D transform error as a most probable cluster;
associate the objects detected in the image with the objects of the most probable cluster; and
retrieve information of at least one of the objects of the most probable cluster.
Dependent claims: 11, 12, 15, 16
Specification