Object identification and labeling tool for training autonomous vehicle controllers
First Claim
1. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the method comprising:
presenting, on a user interface of one or more computing devices, (i) a first frame comprising a three-dimensional (3-D) image of an environment, at a first time, in which vehicles operate, the first frame depicting one or more physical objects located in the environment, and (ii) a first graphical representation indicating a boundary of a particular object located in the environment as depicted in the first frame at the first time, wherein an association of data indicative of the boundary of the particular object as depicted within the first frame at the first time and a particular label that uniquely identifies the particular object (i) distinguishes a 3-D image of the particular object within the first frame and (ii) is stored in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
presenting, on the user interface, a second frame comprising a 3-D image of the environment at a second time different than the first time, the second frame depicting at least a portion of the particular object;
automatically generating an interim graphical representation of the boundary of the particular object as depicted within the second frame by inputting data indicative of the first graphical representation of the boundary of the particular object as depicted in the first frame and uniquely identified by the particular label into a boundary prediction model that has been trained based on objects that have been distinguished within a plurality of 3-D historical images of one or more environments in which vehicles operate, the plurality of 3-D historical images including time-sequenced frames;
receiving, via the user interface, an indication of a user modification to the interim graphical representation;
altering, based on the received user modification, the interim graphical representation to thereby generate a second graphical representation of the boundary of the particular object as depicted in the second frame at the second time;
generating data indicative of the second graphical representation of the boundary of the particular object as depicted within the second frame; and
storing, in the one or more tangible, non-transitory memories, an association of the data indicative of the boundary of the particular object as depicted in the second frame at the second time and the particular label uniquely identifying the particular object as another part of the training data set.
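The claimed workflow (present a labeled boundary in one frame, auto-propagate it to the next frame via a prediction model, let the user correct it, and store both associations as training data) can be sketched as follows. This is a minimal illustration, not the patent's implementation: `Box3D`, `predict_boundary`, and the constant-velocity guess standing in for the trained boundary prediction model are all hypothetical names and simplifications.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Box3D:
    """Hypothetical 3-D boundary: center (x, y, z), size (l, w, h), heading yaw."""
    x: float; y: float; z: float
    l: float; w: float; h: float
    yaw: float

def predict_boundary(prev_box: Box3D, dt: float) -> Box3D:
    """Stand-in for the trained boundary prediction model. Here, a trivial
    constant-velocity guess (1 m/s along x); the claimed model is instead
    trained on objects distinguished in time-sequenced historical 3-D frames."""
    return replace(prev_box, x=prev_box.x + 1.0 * dt)

def apply_user_modification(interim: Box3D, delta: dict) -> Box3D:
    """Apply the user's correction to the interim boundary, yielding the
    second graphical representation."""
    return replace(interim, **delta)

# Training data set: associations of (frame id, unique object label) -> boundary data.
training_set: dict[tuple[int, str], Box3D] = {}

label = "vehicle-017"                        # label uniquely identifying the object
first = Box3D(10.0, 2.0, 0.0, 4.5, 1.8, 1.5, 0.0)
training_set[(0, label)] = first             # association stored for the first frame

interim = predict_boundary(first, dt=0.1)    # auto-generated interim boundary, frame 2
second = apply_user_modification(interim, {"x": 10.12})  # user refines the prediction
training_set[(1, label)] = second            # stored as another part of the training set
```

Note the key property of the loop: every frame's stored association reuses the same unique label, which is what lets the boundary be tracked, and the training set grown, across time-sequenced frames.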
Abstract
Techniques for identifying and labeling distinct objects within 3-D images of environments in which vehicles operate, to thereby generate training data used to train models that autonomously control and/or operate vehicles, are disclosed. A 3-D image may be presented from various perspective views (in some cases, dynamically), and/or may be presented with a corresponding 2-D environment image in a side-by-side and/or a layered manner, thereby allowing a user to more accurately identify groups/clusters of data points within the 3-D image that represent distinct objects. Automatic identification/delineation of various types of objects depicted within 3-D images, automatic labeling of identified/delineated objects, and automatic tracking of objects across various frames of a 3-D video are disclosed. A user may modify and/or refine any automatically generated information. Further, at least some of the techniques described herein are equally applicable to 2-D images.
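The abstract's notion of identifying groups/clusters of data points within a 3-D image that represent distinct objects can be illustrated with a naive Euclidean clustering sketch. This is an assumption-laden toy (O(n²), fixed radius, pure stdlib); production labeling tools would use spatial indexes or learned segmentation instead.

```python
from collections import deque

def euclidean_clusters(points, radius=1.0):
    """Group 3-D points so that points within `radius` of one another
    (transitively) form one cluster -- a crude proxy for proposing
    distinct objects in a lidar frame."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            # Neighbors of point i among the not-yet-assigned points.
            near = [j for j in unvisited
                    if sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                    <= radius ** 2]
            for j in near:
                unvisited.discard(j)
            queue.extend(near)
            cluster.extend(near)
        clusters.append(sorted(cluster))
    return clusters

# Two well-separated groups of points -> two proposed distinct objects.
pts = [(0, 0, 0), (0.5, 0, 0), (10, 10, 0), (10.4, 10, 0)]
print(euclidean_clusters(pts))  # clusters cover indices {0, 1} and {2, 3}
```

In the tool described above, each such proposed cluster would be shown to the user (from various perspective views, or alongside a 2-D image) for confirmation, labeling, or refinement.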
20 Claims
1. (Claim 1, set forth in full above under "First Claim".) - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A system for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the system comprising:
-
a communication module;
one or more processors; and
one or more non-transitory, tangible memories coupled to the one or more processors and storing computer-executable instructions thereon that, when executed by the one or more processors, cause the system to:
present, on a user interface of one or more computing devices, (i) a first frame comprising a three-dimensional (3-D) image of an environment, at a first time, in which vehicles operate, the first frame depicting one or more physical objects located in the environment, and (ii) a first graphical representation indicating a boundary of a particular object located in the environment as depicted in the first frame at the first time, wherein an association of data indicative of the boundary of the particular object as depicted within the first frame at the first time and a particular label that uniquely identifies the particular object (i) distinguishes a 3-D image of the particular object within the first frame and (ii) is stored in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
present, on the user interface, a second frame comprising a 3-D image of the environment at a second time different than the first time, the second frame depicting at least a portion of the particular object;
automatically generate an interim graphical representation of the boundary of the particular object as depicted within the second frame by inputting data indicative of the first graphical representation of the boundary of the particular object as depicted in the first frame and uniquely identified by the particular label into a boundary prediction model that has been trained based on objects that have been distinguished within a plurality of 3-D historical images of one or more environments in which vehicles operate, the plurality of 3-D historical images including time-sequenced frames;
present the interim graphical representation within the second frame;
receive, via the communication module, an indication of a user modification to the interim graphical representation;
alter, based on the received user modification, the interim graphical representation to thereby generate a second graphical representation of the boundary of the particular object as depicted in the second frame at the second time;
generate data indicative of the second graphical representation of the boundary of the particular object as depicted within the second frame; and
store, in the one or more tangible, non-transitory memories, an association of the data indicative of the boundary of the particular object as depicted in the second frame at the second time and the particular label uniquely identifying the particular object as another part of the training data set. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)