Generating a model for an object encountered by a robot
Abstract
Methods and apparatus related to generating a model for an object encountered by a robot in its environment, where the object is one that the robot is unable to recognize utilizing existing models associated with the robot. The model is generated based on vision sensor data that captures the object from multiple vantages and that is captured by a vision sensor associated with the robot, such as a vision sensor coupled to the robot. The model may be provided for use by the robot in detecting the object and/or for use in estimating the pose of the object.
26 Citations
20 Claims
1. A method, comprising:

receiving vision sensor data generated by a vision sensor associated with a robot, the vision sensor data capturing an object in an environment of the robot;
generating an object model of the object based on the vision sensor data;
generating a plurality of rendered images based on the object model, wherein the rendered images capture the object model at a plurality of different poses relative to viewpoints of the rendered images, and wherein generating the rendered images based on the object model comprises:
    rendering a first image that renders the object model and that includes first additional content, wherein rendering the first image with the first additional content comprises:
        rendering the first image based on a scene that includes the object model and a first additional object model of a first additional object; and
    rendering a second image that renders the object model and that includes second additional content that is distinct from the first additional content, wherein rendering the second image with the second additional content comprises:
        rendering the second image based on an additional scene that includes the object model and one or both of:
            the first additional object model at a pose relative to the object model that is different from that of the scene, and
            a second additional object model that is not present in the scene;
generating training examples that each include a corresponding one of the rendered images as training example input and that each include an indication of the object as training example output; and
training a machine learning model based on the training examples.

View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
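The render-and-train pipeline recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `Scene`, `render`, and `generate_training_examples` are hypothetical names, and a plain dictionary stands in for an actual rendered image.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scene:
    object_pose: tuple          # pose of the target object model relative to the viewpoint
    extra_objects: frozenset    # labels of additional object models placed in the scene

def render(scene: Scene) -> dict:
    """Stand-in for a real renderer: returns a description of the rendered image."""
    return {"pose": scene.object_pose, "content": scene.extra_objects}

def generate_training_examples(object_label, poses, distractors):
    """Render the object model at several poses, varying the additional content
    between renders, and pair each rendered image with the object label."""
    examples = []
    for i, pose in enumerate(poses):
        # Alternate the additional object models so consecutive renders differ
        # in their "additional content", as the claim requires.
        extras = frozenset([distractors[i % len(distractors)]])
        image = render(Scene(object_pose=pose, extra_objects=extras))
        examples.append({"input": image, "output": object_label})
    return examples

examples = generate_training_examples(
    object_label="mug",
    poses=[(0, 0, 0), (0, 90, 0), (45, 0, 0)],
    distractors=["bowl", "plate"],
)
```

Each example pairs a rendered view (the training example input) with an indication of the object (the training example output), ready to be fed to whatever machine learning model is being trained.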
12. A method, comprising:

identifying vision sensor data generated by a vision sensor coupled to a robot, the vision sensor data capturing a portion of an environment of the robot;
determining, based on application of the vision sensor data to one or more object models or machine learning models, that an object in the environment is not recognizable based on the object models or the machine learning models;
in response to determining the object is not recognizable, capturing additional object vision sensor data with the vision sensor, the capturing comprising moving at least one of the vision sensor and the object to capture, in the additional object vision sensor data, the object from a plurality of vantages;
providing the additional object vision sensor data to a model generation system;
receiving a model of the object in response to providing the additional object vision sensor data, the model being an additional object model, or an additional machine learning model trained based on the additional object vision sensor data;
using, by the robot, the received model to perform one or both of:
    detecting the object based on the received model and further vision sensor data generated by the vision sensor coupled to the robot, and
    estimating the pose of the object based on the received model and the further vision sensor data.

View Dependent Claims (13, 14, 15)
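The robot-side fallback of claim 12 can be sketched as a short control flow: try to recognize the object, and only on failure capture multi-vantage data and request a new model. All names here (`recognize`, `capture_from_vantages`, `acquire_model`) and the callables passed in are hypothetical stand-ins, not the patent's API.

```python
def recognize(vision_data, known_models):
    """Return the label of a matching known model, or None if unrecognized."""
    label = vision_data.get("label")
    return label if label in known_models else None

def capture_from_vantages(move_and_sense, n_vantages=8):
    """Move the vision sensor (or the object) to capture it from several vantages."""
    return [move_and_sense(k) for k in range(n_vantages)]

def acquire_model(vision_data, known_models, move_and_sense, model_generation_system):
    label = recognize(vision_data, known_models)
    if label is not None:
        return known_models[label]                  # object already recognizable
    views = capture_from_vantages(move_and_sense)   # multi-vantage capture
    return model_generation_system(views)           # request and receive a new model

known = {"cup": "cup-model"}
generate = lambda views: {"model_from_views": len(views)}
sense = lambda k: f"view-{k}"

familiar = acquire_model({"label": "cup"}, known, sense, generate)
novel = acquire_model({"label": "widget"}, known, sense, generate)
```

The received model can then be used with further vision sensor data for detection and/or pose estimation, per the final clause of the claim.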
16. A system, comprising:

a robot that includes:
    a vision sensor capturing a portion of an environment of the robot;
    one or more robot processors configured to:
        determine that an object in the environment is not recognizable,
        in response to determining the object is not recognizable, cause the vision sensor to capture vision sensor data that captures the object from a plurality of vantages, and
        submit a request for a model for the object, the request including the vision sensor data that captures the object from the plurality of vantages;
a model generation system implemented by one or more processors, wherein the model generation system is configured to:
    receive the request, and
    in response to the request, generate the model for the object based on the vision sensor data that captures the object from the plurality of vantages, wherein the model enables one or both of:
        detection of the object based on further vision sensor data and
        estimation of a pose of the object based on the further vision sensor data;
wherein one or more of the robot processors are further configured to use the generated model with further vision sensor data from the vision sensor to detect the object and/or estimate the pose of the object.

View Dependent Claims (17, 18, 19)
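The two-party system of claim 16 — a robot that submits a request with multi-vantage sensor data, and a model generation system that answers it — can be sketched with two classes. `ModelGenerationSystem`, `Robot`, and their methods are illustrative names only, and the "model" is a placeholder dictionary rather than real geometry.

```python
class ModelGenerationSystem:
    """Builds an object model from multi-vantage vision sensor data."""

    def handle_request(self, views):
        # A real system would reconstruct geometry and/or train a model from
        # the views; here we just summarize them into a placeholder model.
        return {"kind": "object_model", "n_views": len(views)}

class Robot:
    def __init__(self, model_service):
        self.model_service = model_service
        self.models = {}                    # models available for recognition

    def sense(self, k):
        return f"view-{k}"                  # stand-in for one vision sensor frame

    def encounter(self, object_id, recognizable):
        if recognizable:
            return self.models.get(object_id)
        # Not recognizable: capture the object from several vantages and
        # submit a request for a model, including the captured data.
        views = [self.sense(k) for k in range(6)]
        model = self.model_service.handle_request(views)
        self.models[object_id] = model      # reuse for later detection / pose estimation
        return model

robot = Robot(ModelGenerationSystem())
model = robot.encounter("widget", recognizable=False)
```

After the exchange, the robot holds the generated model locally, matching the claim's final clause that the robot processors use it with further vision sensor data.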
20. A method, comprising:

receiving vision sensor data generated by a vision sensor associated with a robot, the vision sensor data comprising a plurality of images of an object in an environment of the robot;
generating an object model of the object based on the vision sensor data;
generating a plurality of rendered images based on the object model, wherein the rendered images capture the object model at a plurality of different poses relative to viewpoints of the rendered images, and wherein generating the rendered images based on the object model comprises:
    rendering a first image that renders the object model and that includes first additional content and rendering a second image that renders the object model and that includes second additional content that is distinct from the first additional content;
generating training examples that each include a corresponding one of the rendered images as training example input and that each include an indication of the object as training example output;
generating a plurality of additional training examples that each include additional training example input based on a corresponding one of the images of the object in the environment and that each include the indication of the object as training example output; and
training a machine learning model based on the training examples and based on one or more of the images of the object in the environment.
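Claim 20's distinguishing step is mixing the two example sources: examples built from rendered images and additional examples built from the real images captured by the robot's vision sensor, both labeled with the same indication of the object. A minimal sketch, with `make_examples` and the toy `train` function as hypothetical stand-ins:

```python
def make_examples(images, label):
    """Pair each image (rendered or real) with the object label."""
    return [{"input": img, "output": label} for img in images]

real_images = ["cam-frame-0", "cam-frame-1"]                 # from the robot's vision sensor
rendered_images = ["render-pose-A", "render-pose-B", "render-pose-C"]

# Training examples from rendered images plus additional training
# examples from the captured images of the object in the environment.
training_set = make_examples(rendered_images, "mug") + make_examples(real_images, "mug")

def train(model_params, examples):
    """Toy stand-in for training: record how many examples of each label were seen."""
    for ex in examples:
        model_params[ex["output"]] = model_params.get(ex["output"], 0) + 1
    return model_params

model = train({}, training_set)
```

The rendered images let the training set cover many poses and varied additional content cheaply, while the real images anchor the model to the sensor's actual imaging conditions.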
Specification