System and method for three-dimensional (3D) object detection
First Claim
1. A system comprising:
- a data processor; and
- a 3D image processing system, executable by the data processor, the image processing system being configured to:
  - receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
  - use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
  - use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
  - obtain geological information related to a particular environment associated with the image frame;
  - obtain camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
  - determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
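The claim's camera calibration matrices (an intrinsic matrix and an extrinsic matrix) are the standard pinhole-camera machinery for mapping 3D points to pixel coordinates, which is what ties the 3D bounding box back to the image frame. The sketch below illustrates that projection step only; all numeric values (focal lengths, camera height, box dimensions) are hypothetical placeholders, not taken from the patent.

```python
import numpy as np

# Hypothetical intrinsic matrix K: focal lengths fx, fy and principal
# point (cx, cy). Real values come from the camera's calibration.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])

# Hypothetical extrinsic matrix [R | t]: rotation and translation from
# vehicle/world coordinates to camera coordinates. Identity rotation and a
# 1.5 m vertical offset are used purely for illustration.
R = np.eye(3)
t = np.array([[0.0], [1.5], [0.0]])
extrinsic = np.hstack([R, t])  # 3x4

def project_points(points_world: np.ndarray) -> np.ndarray:
    """Project an Nx3 array of world points to Nx2 pixel coordinates
    via the pinhole model: pixel ~ K [R | t] [X Y Z 1]^T."""
    homog = np.hstack([points_world, np.ones((points_world.shape[0], 1))])
    cam = (K @ extrinsic @ homog.T).T   # Nx3 homogeneous image coordinates
    return cam[:, :2] / cam[:, 2:3]     # divide by depth to get pixels

# Eight vertices of an illustrative 3D bounding box (4 m x 2 m x 1.5 m)
# roughly 10 m in front of the camera.
l, w, h = 4.0, 2.0, 1.5
corners = np.array([[x, y, z]
                    for x in (-l / 2, l / 2)
                    for y in (-h, 0.0)
                    for z in (10.0 - w / 2, 10.0 + w / 2)])
pixels = project_points(corners)  # 8x2 pixel coordinates
```

With all eight projected vertices in hand, the tightest rectangle enclosing them is one simple way to relate the 3D box to a 2D bounding box in the same frame.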
Abstract
A system and method for three-dimensional (3D) object detection is disclosed. A particular embodiment can be configured to: receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame; use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame; use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object; use a fitting module to obtain geological information related to a particular environment associated with the image frame and to obtain camera calibration information associated with the at least one camera; and use the fitting module to determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information.
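The final step described in the abstract, deriving a length, height, width, 3D spatial location, and heading from the 3D bounding box, can be sketched directly from the box's eight vertices. The patent does not fix a vertex ordering, so the layout assumed below (bottom face 0-3 counterclockwise with edge 0→1 along the length axis, top face 4-7 directly above) is a common convention, not the patent's own.

```python
import math
import numpy as np

def box_attributes(v: np.ndarray):
    """Derive (length, width, height, location, heading) from an 8x3
    array of 3D bounding-box vertices, assuming the vertex ordering
    described above (an illustrative convention, not from the patent)."""
    length = float(np.linalg.norm(v[1] - v[0]))  # edge along the long axis
    width = float(np.linalg.norm(v[3] - v[0]))   # perpendicular bottom edge
    height = float(np.linalg.norm(v[4] - v[0]))  # vertical edge to top face
    location = v.mean(axis=0)                    # 3D spatial location (centroid)
    edge = v[1] - v[0]
    heading = math.atan2(edge[1], edge[0])       # yaw of the length axis, radians
    return length, width, height, location, heading

# Example: a 4 m x 2 m x 1.5 m box centered at (10, 0, 0), yawed 30 degrees.
yaw = math.radians(30.0)
R = np.array([[math.cos(yaw), -math.sin(yaw), 0.0],
              [math.sin(yaw),  math.cos(yaw), 0.0],
              [0.0,            0.0,           1.0]])
local = np.array([[x, y, z]
                  for z in (-0.75, 0.75)  # bottom face first, then top face
                  for x, y in [(-2, -1), (2, -1), (2, 1), (-2, 1)]])
vertices = local @ R.T + np.array([10.0, 0.0, 0.0])
l, w, h, loc, hd = box_attributes(vertices)
```

Note this recovers the dimensions from edge lengths rather than axis-aligned extents, so it remains correct for rotated (yawed) boxes.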
108 Citations
18 Claims
1. A system comprising:
- a data processor; and
- a 3D image processing system, executable by the data processor, the image processing system being configured to:
  - receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
  - use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
  - use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
  - obtain geological information related to a particular environment associated with the image frame;
  - obtain camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
  - determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
- View Dependent Claims (2, 3, 4, 5)
6. A method comprising:
- receiving image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
- using a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
- using the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
- obtaining geological information related to a particular environment associated with the image frame;
- obtaining camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
- determining 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
- View Dependent Claims (7, 8, 9, 10, 11, 12)
13. A non-transitory machine-useable storage medium embodying instructions which, when executed by a machine, cause the machine to:
- receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
- use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
- use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
- obtain geological information related to a particular environment associated with the image frame;
- obtain camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
- determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
- View Dependent Claims (14, 15, 16, 17, 18)
Specification