Object identification and labeling tool for training autonomous vehicle controllers
First Claim
1. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously control vehicles, the method comprising:
displaying, on a user interface, a three-dimensional (3-D) image of an environment in which vehicles operate, the 3-D environment image depicting one or more physical objects located in the environment, and the 3-D environment image presented on the user interface from a first perspective view;
receiving, via one or more user controls provided by the user interface and displayed on the user interface in conjunction with the 3-D environment image from the first perspective view, an indication of a graphical representation of a boundary of a particular object as depicted within the 3-D environment image from the first perspective view;
generating, based on the graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view;
obtaining an indication of a particular label for the particular object, the particular label uniquely identifying the particular object;
generating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view, an association between the particular label uniquely identifying the particular object and a 3-D image of the particular object within the 3-D environment image, thereby distinguishing the 3-D image of the particular object within the 3-D environment image;
storing an indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
receiving, via the one or more user controls, an instruction to present the 3-D environment image on the user interface from a second perspective view different than the first perspective view, and based on the received view perspective instruction, adjusting a presentation of the 3-D environment image on the user interface to be from the second perspective view so that the 3-D environment image from the second perspective view and the graphical representation of the boundary of the particular object are displayed on the user interface;
receiving, via the one or more user controls, an indication of a refinement to the graphical representation of the boundary of the particular object as depicted within the 3-D environment image from the second perspective view;
generating, based on the refined graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view; and
updating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view, the stored indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image, thereby refining the distinguishing of the 3-D image of the particular object within the 3-D environment image.
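The workflow of claim 1 can be sketched as a simple data model: a boundary drawn in a first perspective view is associated with a unique label, stored as part of a training data set, and later updated with boundary data refined from a second perspective view. The sketch below is illustrative only; all class and function names (`Boundary`, `LabeledObject`, `TrainingDataSet`) are assumptions and do not come from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Boundary:
    view: str       # perspective view in which the boundary was drawn
    vertices: list  # (x, y, z) vertices of the drawn boundary

@dataclass
class LabeledObject:
    label: str      # label uniquely identifying the object
    boundaries: list = field(default_factory=list)

    def refine(self, boundary: Boundary) -> None:
        """Record boundary data captured from an additional perspective view."""
        self.boundaries.append(boundary)

class TrainingDataSet:
    """Stores label/boundary associations used to train control models."""
    def __init__(self):
        self._objects = {}

    def store(self, obj: LabeledObject) -> None:
        self._objects[obj.label] = obj

    def update(self, label: str, boundary: Boundary) -> None:
        self._objects[label].refine(boundary)

# Usage mirroring the claim: draw a boundary in a first view, label and
# store the object, then refine the stored boundary from a second view.
data = TrainingDataSet()
car = LabeledObject("vehicle-001")
car.refine(Boundary("first_view", [(0, 0, 0), (2, 0, 0), (2, 1, 1)]))
data.store(car)
data.update("vehicle-001",
            Boundary("second_view", [(0, 0, 0), (2, 0.2, 0), (2, 1.1, 1)]))
```

Keeping one boundary record per perspective view, rather than overwriting the first, is one plausible way to realize the claim's "refining" step while preserving the original annotation.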
Abstract
Techniques for identifying and labeling distinct objects within 3-D images of environments in which vehicles operate, to thereby generate training data used to train models that autonomously control and/or operate vehicles, are disclosed. A 3-D image may be presented from various perspective views (in some cases, dynamically), and/or may be presented with a corresponding 2-D environment image in a side-by-side and/or a layered manner, thereby allowing a user to more accurately identify groups/clusters of data points within the 3-D image that represent distinct objects. Automatic identification/delineation of various types of objects depicted within 3-D images, automatic labeling of identified/delineated objects, and automatic tracking of objects across various frames of a 3-D video are disclosed. A user may modify and/or refine any automatically generated information. Further, at least some of the techniques described herein are equally applicable to 2-D images.
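The abstract refers to identifying groups/clusters of data points within a 3-D image that represent distinct objects. As a rough illustration of that idea (not the patent's method), the toy sketch below groups 3-D points by a greedy single-link distance threshold; production tools would operate on full LiDAR point clouds with more robust clustering.

```python
def cluster_points(points, max_dist=1.0):
    """Greedy single-link clustering of (x, y, z) points.

    Each point joins the first existing cluster containing a point
    within max_dist of it; otherwise it starts a new cluster.
    """
    clusters = []
    for p in points:
        placed = False
        for c in clusters:
            if any(sum((a - b) ** 2 for a, b in zip(p, q)) <= max_dist ** 2
                   for q in c):
                c.append(p)
                placed = True
                break
        if not placed:
            clusters.append([p])
    return clusters

# Two well-separated groups of points resolve into two clusters.
points = [(0, 0, 0), (0.5, 0, 0), (10, 10, 0), (10.4, 10, 0)]
print(len(cluster_points(points)))  # → 2
```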
99 Citations
29 Claims
1. (Independent claim; set forth in full as the First Claim above.) - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24)
20. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously control vehicles, the method comprising:
displaying, on a user interface, a three-dimensional (3-D) image of an environment in which vehicles operate, the 3-D environment image depicting one or more physical objects located in the environment, and the 3-D environment image presented on the user interface from a first perspective view;
receiving, via the user interface, an instruction to hide respective 3-D images of one or more selected objects depicted within the 3-D environment image;
based on the received hiding instruction, greying out or rendering non-visible the respective 3-D images of the one or more selected objects depicted within the 3-D environment image while maintaining respective levels of visibility of respective 3-D images of other objects depicted within the 3-D environment image;
receiving, via one or more user controls provided by the user interface, an indication of a first graphical representation of a boundary of a particular object as depicted within the 3-D environment image from the first perspective view, the particular object excluded from the one or more selected objects that are greyed out or rendered non-visible;
generating, based on the first graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view;
obtaining an indication of a particular label for the particular object;
generating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view, an association between the particular label and a 3-D image of the particular object within the 3-D environment image, thereby distinguishing the 3-D image of the particular object within the 3-D environment image;
storing an indication of the association between the particular label and the 3-D image of the particular object within the 3-D environment image in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
receiving, via the one or more user controls, an instruction to present the 3-D environment image on the user interface from a second perspective view different than the first perspective view, and based on the received view perspective instruction, adjusting a presentation of the 3-D environment image on the user interface to be from the second perspective view;
receiving, via the one or more user controls, an indication of a second graphical representation of the boundary of the particular object as depicted within the 3-D environment image from the second perspective view;
generating, based on the second graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view; and
updating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view, the stored indication of the association between the particular label and the 3-D image of the particular object within the 3-D environment image, thereby refining the distinguishing of the 3-D image of the particular object within the 3-D environment image.
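Claim 20's distinguishing step is the hiding instruction: selected objects are greyed out or rendered non-visible while every other object keeps its visibility, so the annotator can draw boundaries on the remaining objects without occlusion. A minimal sketch of that filtering, with all names (`apply_hiding`, the alpha values) being illustrative assumptions rather than anything specified in the patent:

```python
def apply_hiding(points_with_labels, hidden_labels, mode="grey"):
    """Return (point, alpha) pairs ready for display.

    Points whose object label is in hidden_labels are greyed out
    (rendered at reduced alpha) or omitted entirely; all other
    points are returned at full visibility.
    """
    out = []
    for point, label in points_with_labels:
        if label in hidden_labels:
            if mode == "grey":
                out.append((point, 0.2))  # greyed out but still rendered
            # mode == "hide": omit the point entirely
        else:
            out.append((point, 1.0))      # full visibility maintained
    return out

# Hide the already-labeled vehicle so the pedestrian can be annotated.
scene = [((0, 0, 0), "vehicle-001"), ((5, 1, 0), "pedestrian-002")]
display = apply_hiding(scene, {"vehicle-001"})
```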
25. A system for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the system comprising:
a user interface;
one or more processors; and
one or more non-transitory, tangible memories coupled to the one or more processors and storing computer executable instructions thereon that, when executed by the one or more processors, cause the system to:
display, on the user interface, a three-dimensional (3-D) image of an environment in which vehicles operate, the 3-D environment image depicting one or more physical objects located in the environment, and the 3-D environment image presented on the user interface from a first perspective view;
receive, via the user interface, an indication of a graphical representation of a boundary of a particular object as depicted within the 3-D environment image from the first perspective view, the graphical representation generated via one or more user controls provided by the user interface and displayed on the user interface in conjunction with the 3-D environment image from the first perspective view;
generate, based on the graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view;
obtain an indication of a particular label for the particular object, the particular label uniquely identifying the particular object;
generate, based on the data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view, an association between the particular label uniquely identifying the particular object and a 3-D image of the particular object within the 3-D environment image, thereby distinguishing the 3-D image of the particular object within the 3-D environment image;
store an indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image in the one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
receive, via the user interface, a user instruction to present the 3-D environment image on the user interface from a second perspective view different than the first perspective view;
based on the received view perspective user instruction, adjust a presentation of the 3-D environment image on the user interface to be from the second perspective view so that the 3-D environment image from the second perspective view and the graphical representation of the boundary of the particular object are displayed on the user interface;
receive, via the user interface, an indication of a refinement to the graphical representation of the boundary of the particular object as depicted within the 3-D environment image from the second perspective view, the refinement generated via the one or more user controls provided by the user interface;
generate, based on the refined graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view; and
update, based on the data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view, the stored indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image, thereby refining the distinguishing of the 3-D image of the particular object within the 3-D environment image.
- View Dependent Claims (26, 27, 28, 29)
Specification