Object identification and labeling tool for training autonomous vehicle controllers
First Claim
1. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the method comprising:
presenting, on a user interface of one or more computing devices, (i) a first frame comprising a three-dimensional (3-D) image of an environment, at a first time, in which vehicles operate, the first frame depicting one or more physical objects located in the environment, and (ii) a first graphical representation indicating a boundary of a particular object located in the environment as depicted in the first frame at the first time, wherein an association of data indicative of the boundary of the particular object as depicted within the first frame at the first time and a particular label that uniquely identifies the particular object (i) distinguishes a 3-D image of the particular object within the first frame and (ii) is stored in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
presenting, on the user interface, a second frame comprising a 3-D image of the environment at a second time different than the first time, the second frame depicting at least a portion of the particular object;
automatically generating an interim graphical representation of the boundary of the particular object as depicted within the second frame by inputting data indicative of the first graphical representation of the boundary of the particular object as depicted in the first frame and uniquely identified by the particular label into a boundary prediction model that has been trained based on objects that have been distinguished within a plurality of 3-D historical images of one or more environments in which vehicles operate, the plurality of 3-D historical images including time-sequenced frames;
receiving, via the user interface, an indication of a user modification to the interim graphical representation;
altering, based on the received user modification, the interim graphical representation to thereby generate a second graphical representation of the boundary of the particular object as depicted in the second frame at the second time;
generating data indicative of the second graphical representation of the boundary of the particular object as depicted within the second frame; and
storing, in the one or more tangible, non-transitory memories, an association of the data indicative of the boundary of the particular object as depicted in the second frame at the second time and the particular label uniquely identifying the particular object as another part of the training data set.
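The claimed workflow (present a labeled boundary in one frame, auto-propagate it to the next frame via a prediction model, let the user correct it, and store both associations as training data) can be sketched as follows. This is a minimal illustration, not the patent's implementation: `Box3D`, `predict_boundary`, and the constant-velocity guess standing in for the trained boundary prediction model are all hypothetical names and simplifications.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Box3D:
    """Hypothetical 3-D boundary: center (x, y, z), size (l, w, h), heading yaw."""
    x: float; y: float; z: float
    l: float; w: float; h: float
    yaw: float

def predict_boundary(prev_box: Box3D, dt: float) -> Box3D:
    """Stand-in for the trained boundary prediction model. Here, a trivial
    constant-velocity guess (1 m/s along x); the claimed model is instead
    trained on objects distinguished in time-sequenced historical 3-D frames."""
    return replace(prev_box, x=prev_box.x + 1.0 * dt)

def apply_user_modification(interim: Box3D, delta: dict) -> Box3D:
    """Apply the user's correction to the interim boundary, yielding the
    second graphical representation."""
    return replace(interim, **delta)

# Training data set: associations of (frame id, unique object label) -> boundary data.
training_set: dict[tuple[int, str], Box3D] = {}

label = "vehicle-017"                        # label uniquely identifying the object
first = Box3D(10.0, 2.0, 0.0, 4.5, 1.8, 1.5, 0.0)
training_set[(0, label)] = first             # association stored for the first frame

interim = predict_boundary(first, dt=0.1)    # auto-generated interim boundary, frame 2
second = apply_user_modification(interim, {"x": 10.12})  # user refines the prediction
training_set[(1, label)] = second            # stored as another part of the training set
```

Note the key property of the loop: every frame's stored association reuses the same unique label, which is what lets the boundary be tracked, and the training set grown, across time-sequenced frames.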
Abstract
Techniques for identifying and labeling distinct objects within 3-D images of environments in which vehicles operate, to thereby generate training data used to train models that autonomously control and/or operate vehicles, are disclosed. A 3-D image may be presented from various perspective views (in some cases, dynamically), and/or may be presented with a corresponding 2-D environment image in a side-by-side and/or a layered manner, thereby allowing a user to more accurately identify groups/clusters of data points within the 3-D image that represent distinct objects. Automatic identification/delineation of various types of objects depicted within 3-D images, automatic labeling of identified/delineated objects, and automatic tracking of objects across various frames of a 3-D video are disclosed. A user may modify and/or refine any automatically generated information. Further, at least some of the techniques described herein are equally applicable to 2-D images.
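The abstract's notion of identifying groups/clusters of data points within a 3-D image that represent distinct objects can be illustrated with a naive Euclidean clustering sketch. This is an assumption-laden toy (O(n²), fixed radius, pure stdlib); production labeling tools would use spatial indexes or learned segmentation instead.

```python
from collections import deque

def euclidean_clusters(points, radius=1.0):
    """Group 3-D points so that points within `radius` of one another
    (transitively) form one cluster -- a crude proxy for proposing
    distinct objects in a lidar frame."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            # Neighbors of point i among the not-yet-assigned points.
            near = [j for j in unvisited
                    if sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                    <= radius ** 2]
            for j in near:
                unvisited.discard(j)
            queue.extend(near)
            cluster.extend(near)
        clusters.append(sorted(cluster))
    return clusters

# Two well-separated groups of points -> two proposed distinct objects.
pts = [(0, 0, 0), (0.5, 0, 0), (10, 10, 0), (10.4, 10, 0)]
print(euclidean_clusters(pts))  # clusters cover indices {0, 1} and {2, 3}
```

In the tool described above, each such proposed cluster would be shown to the user (from various perspective views, or alongside a 2-D image) for confirmation, labeling, or refinement.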
20 Claims
1. (Claim 1, set forth in full above under "First Claim".) - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A system for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the system comprising:
-
a communication module;
one or more processors; and
one or more non-transitory, tangible memories coupled to the one or more processors and storing computer-executable instructions thereon that, when executed by the one or more processors, cause the system to:
present, on a user interface of one or more computing devices, (i) a first frame comprising a three-dimensional (3-D) image of an environment, at a first time, in which vehicles operate, the first frame depicting one or more physical objects located in the environment, and (ii) a first graphical representation indicating a boundary of a particular object located in the environment as depicted in the first frame at the first time, wherein an association of data indicative of the boundary of the particular object as depicted within the first frame at the first time and a particular label that uniquely identifies the particular object (i) distinguishes a 3-D image of the particular object within the first frame and (ii) is stored in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
present, on the user interface, a second frame comprising a 3-D image of the environment at a second time different than the first time, the second frame depicting at least a portion of the particular object;
automatically generate an interim graphical representation of the boundary of the particular object as depicted within the second frame by inputting data indicative of the first graphical representation of the boundary of the particular object as depicted in the first frame and uniquely identified by the particular label into a boundary prediction model that has been trained based on objects that have been distinguished within a plurality of 3-D historical images of one or more environments in which vehicles operate, the plurality of 3-D historical images including time-sequenced frames;
present the interim graphical representation within the second frame;
receive, via the communication module, an indication of a user modification to the interim graphical representation;
alter, based on the received user modification, the interim graphical representation to thereby generate a second graphical representation of the boundary of the particular object as depicted in the second frame at the second time;
generate data indicative of the second graphical representation of the boundary of the particular object as depicted within the second frame; and
store, in the one or more tangible, non-transitory memories, an association of the data indicative of the boundary of the particular object as depicted in the second frame at the second time and the particular label uniquely identifying the particular object as another part of the training data set. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)