Object identification and labeling tool for training autonomous vehicle controllers
First Claim
1. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously control vehicles, the method comprising:
displaying, on a user interface, a three-dimensional (3-D) image of an environment in which vehicles operate, the 3-D environment image depicting one or more physical objects located in the environment, and the 3-D environment image presented on the user interface from a first perspective view;
receiving, via one or more user controls provided by the user interface and displayed on the user interface in conjunction with the 3-D environment image from the first perspective view, an indication of a graphical representation of a boundary of a particular object as depicted within the 3-D environment image from the first perspective view;
generating, based on the graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view;
obtaining an indication of a particular label for the particular object, the particular label uniquely identifying the particular object;
generating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view, an association between the particular label uniquely identifying the particular object and a 3-D image of the particular object within the 3-D environment image, thereby distinguishing the 3-D image of the particular object within the 3-D environment image;
storing an indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
receiving, via the one or more user controls, an instruction to present the 3-D environment image on the user interface from a second perspective view different than the first perspective view, and based on the received view perspective instruction, adjusting a presentation of the 3-D environment image on the user interface to be from the second perspective view so that the 3-D environment image from the second perspective view and the graphical representation of the boundary of the particular object are displayed on the user interface;
receiving, via the one or more user controls, an indication of a refinement to the graphical representation of the boundary of the particular object as depicted within the 3-D environment image from the second perspective view;
generating, based on the refined graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view; and
updating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view, the stored indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image, thereby refining the distinguishing of the 3-D image of the particular object within the 3-D environment image.
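The workflow of claim 1 can be sketched as a simple data model: a boundary drawn in a first perspective view is associated with a unique label, stored as part of a training data set, and later updated with boundary data refined from a second perspective view. The sketch below is illustrative only; all class and function names (`Boundary`, `LabeledObject`, `TrainingDataSet`) are assumptions and do not come from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Boundary:
    view: str       # perspective view in which the boundary was drawn
    vertices: list  # (x, y, z) vertices of the drawn boundary

@dataclass
class LabeledObject:
    label: str      # label uniquely identifying the object
    boundaries: list = field(default_factory=list)

    def refine(self, boundary: Boundary) -> None:
        """Record boundary data captured from an additional perspective view."""
        self.boundaries.append(boundary)

class TrainingDataSet:
    """Stores label/boundary associations used to train control models."""
    def __init__(self):
        self._objects = {}

    def store(self, obj: LabeledObject) -> None:
        self._objects[obj.label] = obj

    def update(self, label: str, boundary: Boundary) -> None:
        self._objects[label].refine(boundary)

# Usage mirroring the claim: draw a boundary in a first view, label and
# store the object, then refine the stored boundary from a second view.
data = TrainingDataSet()
car = LabeledObject("vehicle-001")
car.refine(Boundary("first_view", [(0, 0, 0), (2, 0, 0), (2, 1, 1)]))
data.store(car)
data.update("vehicle-001",
            Boundary("second_view", [(0, 0, 0), (2, 0.2, 0), (2, 1.1, 1)]))
```

Keeping one boundary record per perspective view, rather than overwriting the first, is one plausible way to realize the claim's "refining" step while preserving the original annotation.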
Abstract
Techniques for identifying and labeling distinct objects within 3-D images of environments in which vehicles operate, to thereby generate training data used to train models that autonomously control and/or operate vehicles, are disclosed. A 3-D image may be presented from various perspective views (in some cases, dynamically), and/or may be presented with a corresponding 2-D environment image in a side-by-side and/or a layered manner, thereby allowing a user to more accurately identify groups/clusters of data points within the 3-D image that represent distinct objects. Automatic identification/delineation of various types of objects depicted within 3-D images, automatic labeling of identified/delineated objects, and automatic tracking of objects across various frames of a 3-D video are disclosed. A user may modify and/or refine any automatically generated information. Further, at least some of the techniques described herein are equally applicable to 2-D images.
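The abstract refers to identifying groups/clusters of data points within a 3-D image that represent distinct objects. As a rough illustration of that idea (not the patent's method), the toy sketch below groups 3-D points by a greedy single-link distance threshold; production tools would operate on full LiDAR point clouds with more robust clustering.

```python
def cluster_points(points, max_dist=1.0):
    """Greedy single-link clustering of (x, y, z) points.

    Each point joins the first existing cluster containing a point
    within max_dist of it; otherwise it starts a new cluster.
    """
    clusters = []
    for p in points:
        placed = False
        for c in clusters:
            if any(sum((a - b) ** 2 for a, b in zip(p, q)) <= max_dist ** 2
                   for q in c):
                c.append(p)
                placed = True
                break
        if not placed:
            clusters.append([p])
    return clusters

# Two well-separated groups of points resolve into two clusters.
points = [(0, 0, 0), (0.5, 0, 0), (10, 10, 0), (10.4, 10, 0)]
print(len(cluster_points(points)))  # → 2
```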
99 Citations
29 Claims
1. (Independent claim; set forth in full as the First Claim above.) - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24)
20. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously control vehicles, the method comprising:
displaying, on a user interface, a three-dimensional (3-D) image of an environment in which vehicles operate, the 3-D environment image depicting one or more physical objects located in the environment, and the 3-D environment image presented on the user interface from a first perspective view;
receiving, via the user interface, an instruction to hide respective 3-D images of one or more selected objects depicted within the 3-D environment image;
based on the received hiding instruction, greying out or rendering non-visible the respective 3-D images of the one or more selected objects depicted within the 3-D environment image while maintaining respective levels of visibility of respective 3-D images of other objects depicted within the 3-D environment image;
receiving, via one or more user controls provided by the user interface, an indication of a first graphical representation of a boundary of a particular object as depicted within the 3-D environment image from the first perspective view, the particular object excluded from the one or more selected objects that are greyed out or rendered non-visible;
generating, based on the first graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view;
obtaining an indication of a particular label for the particular object;
generating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view, an association between the particular label and a 3-D image of the particular object within the 3-D environment image, thereby distinguishing the 3-D image of the particular object within the 3-D environment image;
storing an indication of the association between the particular label and the 3-D image of the particular object within the 3-D environment image in one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
receiving, via the one or more user controls, an instruction to present the 3-D environment image on the user interface from a second perspective view different than the first perspective view, and based on the received view perspective instruction, adjusting a presentation of the 3-D environment image on the user interface to be from the second perspective view;
receiving, via the one or more user controls, an indication of a second graphical representation of the boundary of the particular object as depicted within the 3-D environment image from the second perspective view;
generating, based on the second graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view; and
updating, based on the data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view, the stored indication of the association between the particular label and the 3-D image of the particular object within the 3-D environment image, thereby refining the distinguishing of the 3-D image of the particular object within the 3-D environment image.
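Claim 20's distinguishing step is the hiding instruction: selected objects are greyed out or rendered non-visible while every other object keeps its visibility, so the annotator can draw boundaries on the remaining objects without occlusion. A minimal sketch of that filtering, with all names (`apply_hiding`, the alpha values) being illustrative assumptions rather than anything specified in the patent:

```python
def apply_hiding(points_with_labels, hidden_labels, mode="grey"):
    """Return (point, alpha) pairs ready for display.

    Points whose object label is in hidden_labels are greyed out
    (rendered at reduced alpha) or omitted entirely; all other
    points are returned at full visibility.
    """
    out = []
    for point, label in points_with_labels:
        if label in hidden_labels:
            if mode == "grey":
                out.append((point, 0.2))  # greyed out but still rendered
            # mode == "hide": omit the point entirely
        else:
            out.append((point, 1.0))      # full visibility maintained
    return out

# Hide the already-labeled vehicle so the pedestrian can be annotated.
scene = [((0, 0, 0), "vehicle-001"), ((5, 1, 0), "pedestrian-002")]
display = apply_hiding(scene, {"vehicle-001"})
```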
25. A system for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the system comprising:
a user interface;
one or more processors; and
one or more non-transitory, tangible memories coupled to the one or more processors and storing computer executable instructions thereon that, when executed by the one or more processors, cause the system to:
display, on the user interface, a three-dimensional (3-D) image of an environment in which vehicles operate, the 3-D environment image depicting one or more physical objects located in the environment, and the 3-D environment image presented on the user interface from a first perspective view;
receive, via the user interface, an indication of a graphical representation of a boundary of a particular object as depicted within the 3-D environment image from the first perspective view, the graphical representation generated via one or more user controls provided by the user interface and displayed on the user interface in conjunction with the 3-D environment image from the first perspective view;
generate, based on the graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view;
obtain an indication of a particular label for the particular object, the particular label uniquely identifying the particular object;
generate, based on the data indicative of the boundary of the particular object within the 3-D environment image from the first perspective view, an association between the particular label uniquely identifying the particular object and a 3-D image of the particular object within the 3-D environment image, thereby distinguishing the 3-D image of the particular object within the 3-D environment image;
store an indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image in the one or more tangible, non-transitory memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously control vehicles;
receive, via the user interface, a user instruction to present the 3-D environment image on the user interface from a second perspective view different than the first perspective view;
based on the received view perspective user instruction, adjust a presentation of the 3-D environment image on the user interface to be from the second perspective view so that the 3-D environment image from the second perspective view and the graphical representation of the boundary of the particular object are displayed on the user interface;
receive, via the user interface, an indication of a refinement to the graphical representation of the boundary of the particular object as depicted within the 3-D environment image from the second perspective view, the refinement generated via the one or more user controls provided by the user interface;
generate, based on the refined graphical representation, data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view; and
update, based on the data indicative of the boundary of the particular object within the 3-D environment image from the second perspective view, the stored indication of the association between the particular label uniquely identifying the particular object and the 3-D image of the particular object within the 3-D environment image, thereby refining the distinguishing of the 3-D image of the particular object within the 3-D environment image.
- View Dependent Claims (26, 27, 28, 29)
Specification