Object identification and labeling tool for training autonomous vehicle controllers
First Claim
1. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the method comprising:
displaying, on a first display area of a user interface, a three-dimensional (3-D) video of an environment in which vehicles operate, the 3-D environment video including respective 3-D images of one or more physical objects located in the environment;
displaying, on a second display area of the user interface and in a time-synchronized manner with the 3-D video, a two-dimensional (2-D) video of at least a portion of the environment depicted in the 3-D environment video so that an image of the environment that was obtained at a particular time and that is included in the 2-D video is presented simultaneously on the user interface with an image of the environment that was obtained at the particular time and that is included in the 3-D video;
receiving, via one or more user controls provided by the user interface, an indication of a boundary of a particular physical object depicted within a 3-D environment image included in the 3-D video;
generating data indicative of the boundary of the particular physical object within the 3-D environment image;
receiving an indication of a particular label for the particular physical object;
associating the particular label for the particular physical object with the data indicative of the boundary of the particular physical object within the 3-D environment image, thereby distinguishing a set of data points that are representative of the particular physical object within the 3-D environment image from other data points included in the 3-D environment image; and
storing an indication of the association between the particular label and the data indicative of the boundary of the particular physical object within the 3-D environment image in one or more tangible memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously operate vehicles.
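The label/boundary association and storage steps recited in the claim above can be illustrated with a short sketch. The class name, field names, and box representation below are illustrative assumptions, not drawn from the patent itself:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class LabeledObject:
    """An object boundary within one 3-D environment image, plus its label."""
    frame_id: int   # which 3-D image (frame) of the video the boundary belongs to
    label: str      # the particular label, e.g. "vehicle" or "pedestrian"
    # Axis-aligned 3-D bounding box: (x_min, y_min, z_min, x_max, y_max, z_max)
    boundary: tuple

def points_in_boundary(points, boundary):
    """Distinguish the data points representative of the object from the
    other data points included in the 3-D environment image."""
    x0, y0, z0, x1, y1, z1 = boundary
    return [p for p in points
            if x0 <= p[0] <= x1 and y0 <= p[1] <= y1 and z0 <= p[2] <= z1]

def store_training_example(obj, path):
    """Persist the label/boundary association as one line of a training data set."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(obj)) + "\n")
```

Storing one JSON record per labeled object is just one possible serialization of the claimed "indication of the association"; any format that keeps the label tied to the boundary data would fit the claim language equally well.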
Abstract
Techniques for identifying and labeling distinct objects within 3-D images of environments in which vehicles operate, to thereby generate training data used to train models that autonomously control and/or operate vehicles, are disclosed. A 3-D image may be presented from various perspective views (in some cases, dynamically), and/or may be presented with a corresponding 2-D environment image in a side-by-side and/or a layered manner, thereby allowing a user to more accurately identify groups/clusters of data points within the 3-D image that represent distinct objects. Automatic identification/delineation of various types of objects depicted within 3-D images, automatic labeling of identified/delineated objects, and automatic tracking of objects across various frames of a 3-D video are disclosed. A user may modify and/or refine any automatically generated information. Further, at least some of the techniques described herein are equally applicable to 2-D images.
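The time-synchronized, side-by-side presentation described in the abstract reduces to pairing each 3-D frame with the 2-D frame captured at (approximately) the same time. A minimal frame-pairing sketch follows; the function name, timestamp layout, and tolerance value are assumptions for illustration:

```python
from bisect import bisect_left

def pair_frames(ts_2d, ts_3d, tolerance=0.05):
    """Pair each 3-D frame timestamp with the nearest 2-D frame timestamp,
    so images obtained at roughly the same time are presented together.
    Both timestamp lists are assumed to be sorted ascending (seconds)."""
    pairs = []
    for t in ts_3d:
        i = bisect_left(ts_2d, t)
        # Candidate neighbors: the 2-D timestamps just before and just after t.
        candidates = [c for c in (i - 1, i) if 0 <= c < len(ts_2d)]
        best = min(candidates, key=lambda c: abs(ts_2d[c] - t))
        if abs(ts_2d[best] - t) <= tolerance:
            pairs.append((t, ts_2d[best]))
    return pairs
```

A 3-D frame with no 2-D frame inside the tolerance window is simply skipped, which matches the abstract's allowance that the 2-D video may cover only a portion of the environment depicted in the 3-D video.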
30 Claims
1. (Independent; set forth in full above as the First Claim.)
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. A computer-implemented method for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the method comprising:
displaying, on a user interface, a three-dimensional (3-D) video of an environment in which vehicles operate, the 3-D environment video including respective 3-D images of one or more physical objects located in the environment;
dynamically varying a perspective view from which the 3-D video is displayed on the user interface across multiple perspective views during a presentation of the 3-D video responsive to one or more user instructions that are received, via the user interface, during the presentation of the 3-D video;
receiving, via one or more user controls provided by the user interface, an indication of a boundary of a particular physical object depicted within a 3-D environment image included in the 3-D video;
generating data indicative of the boundary of the particular physical object within the 3-D environment image;
receiving an indication of a particular label for the particular physical object;
associating the particular label for the particular physical object with the data indicative of the boundary of the particular physical object within the 3-D environment image, thereby distinguishing a set of data points that are representative of the particular physical object within the 3-D environment image from other data points included in the 3-D environment image; and
storing an indication of the association between the particular label and the data indicative of the boundary of the particular physical object within the 3-D environment image in one or more tangible memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously operate vehicles.
- View Dependent Claims (10, 11, 12, 13, 14)
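Dynamically varying the perspective view of a 3-D image, as claim 9 recites, amounts to applying a view transform to the point cloud before it is rendered. A minimal sketch of one such transform, a yaw rotation about the vertical axis, is shown below; the function name and the choice of z as the vertical axis are assumptions for illustration:

```python
import math

def rotate_yaw(points, angle_rad):
    """Rotate a 3-D point cloud about the vertical (z) axis, changing the
    perspective view from which the points would be displayed."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for (x, y, z) in points]
```

Re-running such a transform with angles taken from live user input (mouse drag, keyboard) during playback is one way to realize the claimed "dynamically varying" of the perspective view across multiple views.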
15. A system for identifying and labeling objects within images for training machine-learning based models that are used to operate vehicles, the system comprising:
a user interface;
one or more processors; and
one or more non-transitory, tangible memories coupled to the one or more processors and storing computer-executable instructions thereon that, when executed by the one or more processors, cause the system to:
display, on a first display area of the user interface, a three-dimensional (3-D) video of an environment in which vehicles operate, the 3-D environment video including respective 3-D images of one or more physical objects located in the environment;
display, on a second display area of the user interface and in a time-synchronized manner with the 3-D video, a two-dimensional (2-D) video of the environment depicted in the 3-D environment video so that an image of the environment that was obtained at a particular time and that is included in the 2-D video is presented simultaneously on the user interface with an image of the environment that was obtained at the particular time and that is included in the 3-D video;
receive, via the user interface, an indication of a boundary of a particular physical object depicted within a 3-D environment image included in the 3-D video, the indication of the boundary of the particular physical object provided by a user via the user interface;
generate data indicative of the boundary of the particular physical object within the 3-D environment image based on the received indication of the boundary of the particular physical object depicted within the 3-D environment image; and
store, in the one or more non-transitory, tangible memories, data indicative of a particular label descriptive of the particular physical object in association with the data indicative of the boundary of the particular physical object within the 3-D environment image, thereby distinguishing a set of data points that are representative of the particular physical object within the 3-D environment image from other data points included in the 3-D environment image.
- View Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23, 24)
25. A system for identifying and labeling objects within images for training machine-learning based models that are used to autonomously operate vehicles, the system comprising:
a user interface;
one or more processors; and
one or more non-transitory, tangible memories coupled to the one or more processors and storing computer-executable instructions thereon that, when executed by the one or more processors, cause the system to:
display, on the user interface, a three-dimensional (3-D) video of an environment in which vehicles operate, the 3-D environment video including respective 3-D images of one or more physical objects located in the environment;
dynamically vary a perspective view from which the 3-D video is displayed on the user interface across multiple perspective views during a presentation of the 3-D video responsive to one or more user instructions that are received, via the user interface, during the presentation of the 3-D video;
receive, via one or more user controls provided by the user interface, an indication of a boundary of a particular physical object depicted within a 3-D environment image included in the 3-D video;
generate data indicative of the boundary of the particular physical object within the 3-D environment image;
receive an indication of a particular label for the particular physical object;
associate the particular label for the particular physical object with the data indicative of the boundary of the particular physical object within the 3-D environment image, thereby distinguishing a set of data points that are representative of the particular physical object within the 3-D environment image from other data points included in the 3-D environment image; and
store an indication of the association between the particular label and the data indicative of the boundary of the particular physical object within the 3-D environment image in the one or more tangible memories or in another one or more tangible memories as a part of a training data set utilized to train one or more machine-learning based models, the one or more machine-learning based models used to autonomously operate vehicles.
- View Dependent Claims (26, 27, 28, 29, 30)
Specification