System and method for three-dimensional (3D) object detection
First Claim
1. A system comprising:
- a data processor; and
- a 3D image processing system, executable by the data processor, the image processing system being configured to:
  - receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
  - use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
  - use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
  - obtain geological information related to a particular environment associated with the image frame;
  - obtain camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
  - determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
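The claim's camera calibration matrices (an intrinsic matrix and an extrinsic matrix) are the standard pinhole-camera machinery for mapping 3D points to pixel coordinates, which is what ties the 3D bounding box back to the image frame. The sketch below illustrates that projection step only; all numeric values (focal lengths, camera height, box dimensions) are hypothetical placeholders, not taken from the patent.

```python
import numpy as np

# Hypothetical intrinsic matrix K: focal lengths fx, fy and principal
# point (cx, cy). Real values come from the camera's calibration.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])

# Hypothetical extrinsic matrix [R | t]: rotation and translation from
# vehicle/world coordinates to camera coordinates. Identity rotation and a
# 1.5 m vertical offset are used purely for illustration.
R = np.eye(3)
t = np.array([[0.0], [1.5], [0.0]])
extrinsic = np.hstack([R, t])  # 3x4

def project_points(points_world: np.ndarray) -> np.ndarray:
    """Project an Nx3 array of world points to Nx2 pixel coordinates
    via the pinhole model: pixel ~ K [R | t] [X Y Z 1]^T."""
    homog = np.hstack([points_world, np.ones((points_world.shape[0], 1))])
    cam = (K @ extrinsic @ homog.T).T   # Nx3 homogeneous image coordinates
    return cam[:, :2] / cam[:, 2:3]     # divide by depth to get pixels

# Eight vertices of an illustrative 3D bounding box (4 m x 2 m x 1.5 m)
# roughly 10 m in front of the camera.
l, w, h = 4.0, 2.0, 1.5
corners = np.array([[x, y, z]
                    for x in (-l / 2, l / 2)
                    for y in (-h, 0.0)
                    for z in (10.0 - w / 2, 10.0 + w / 2)])
pixels = project_points(corners)  # 8x2 pixel coordinates
```

With all eight projected vertices in hand, the tightest rectangle enclosing them is one simple way to relate the 3D box to a 2D bounding box in the same frame.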
Abstract
A system and method for three-dimensional (3D) object detection is disclosed. A particular embodiment can be configured to: receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame; use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame; use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object; use a fitting module to obtain geological information related to a particular environment associated with the image frame and to obtain camera calibration information associated with the at least one camera; and use the fitting module to determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information.
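The final step described in the abstract, deriving a length, height, width, 3D spatial location, and heading from the 3D bounding box, can be sketched directly from the box's eight vertices. The patent does not fix a vertex ordering, so the layout assumed below (bottom face 0-3 counterclockwise with edge 0→1 along the length axis, top face 4-7 directly above) is a common convention, not the patent's own.

```python
import math
import numpy as np

def box_attributes(v: np.ndarray):
    """Derive (length, width, height, location, heading) from an 8x3
    array of 3D bounding-box vertices, assuming the vertex ordering
    described above (an illustrative convention, not from the patent)."""
    length = float(np.linalg.norm(v[1] - v[0]))  # edge along the long axis
    width = float(np.linalg.norm(v[3] - v[0]))   # perpendicular bottom edge
    height = float(np.linalg.norm(v[4] - v[0]))  # vertical edge to top face
    location = v.mean(axis=0)                    # 3D spatial location (centroid)
    edge = v[1] - v[0]
    heading = math.atan2(edge[1], edge[0])       # yaw of the length axis, radians
    return length, width, height, location, heading

# Example: a 4 m x 2 m x 1.5 m box centered at (10, 0, 0), yawed 30 degrees.
yaw = math.radians(30.0)
R = np.array([[math.cos(yaw), -math.sin(yaw), 0.0],
              [math.sin(yaw),  math.cos(yaw), 0.0],
              [0.0,            0.0,           1.0]])
local = np.array([[x, y, z]
                  for z in (-0.75, 0.75)  # bottom face first, then top face
                  for x, y in [(-2, -1), (2, -1), (2, 1), (-2, 1)]])
vertices = local @ R.T + np.array([10.0, 0.0, 0.0])
l, w, h, loc, hd = box_attributes(vertices)
```

Note this recovers the dimensions from edge lengths rather than axis-aligned extents, so it remains correct for rotated (yawed) boxes.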
108 Citations
18 Claims
1. A system comprising:
- a data processor; and
- a 3D image processing system, executable by the data processor, the image processing system being configured to:
  - receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
  - use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
  - use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
  - obtain geological information related to a particular environment associated with the image frame;
  - obtain camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
  - determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
- View Dependent Claims (2, 3, 4, 5)
6. A method comprising:
- receiving image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
- using a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
- using the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
- obtaining geological information related to a particular environment associated with the image frame;
- obtaining camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
- determining 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
- View Dependent Claims (7, 8, 9, 10, 11, 12)
13. A non-transitory machine-useable storage medium embodying instructions which, when executed by a machine, cause the machine to:
- receive image data from at least one camera associated with an autonomous vehicle, the image data representing at least one image frame;
- use a trained deep learning module to determine pixel coordinates of a two-dimensional (2D) bounding box around an object detected in the image frame;
- use the trained deep learning module to determine vertices of a three-dimensional (3D) bounding box around the object;
- obtain geological information related to a particular environment associated with the image frame;
- obtain camera calibration information associated with the at least one camera, wherein the camera calibration information comprises camera calibration matrices with a camera extrinsic matrix and a camera intrinsic matrix; and
- determine 3D attributes of the object using the 3D bounding box, the geological information, and the camera calibration information, wherein the 3D attributes of the object comprise a length, height, width, 3D spatial location, and heading of the object.
- View Dependent Claims (14, 15, 16, 17, 18)
Specification