Generating a model for an object encountered by a robot
Abstract
Methods and apparatus related to generating a model for an object encountered by a robot in its environment, where the object is one that the robot is unable to recognize utilizing existing models associated with the robot. The model is generated based on vision sensor data that captures the object from multiple vantages and that is captured by a vision sensor associated with the robot, such as a vision sensor coupled to the robot. The model may be provided for use by the robot in detecting the object and/or for use in estimating the pose of the object.
26 Citations
20 Claims
1. A method, comprising:

receiving vision sensor data generated by a vision sensor associated with a robot, the vision sensor data capturing an object in an environment of the robot;
generating an object model of the object based on the vision sensor data;
generating a plurality of rendered images based on the object model, wherein the rendered images capture the object model at a plurality of different poses relative to viewpoints of the rendered images, and wherein generating the rendered images based on the object model comprises:
    rendering a first image that renders the object model and that includes first additional content, wherein rendering the first image with the first additional content comprises:
        rendering the first image based on a scene that includes the object model and a first additional object model of a first additional object; and
    rendering a second image that renders the object model and that includes second additional content that is distinct from the first additional content, wherein rendering the second image with the second additional content comprises:
        rendering the second image based on an additional scene that includes the object model and one or both of:
            the first additional object model at a pose relative to the object model that is different from that of the scene, and
            a second additional object model that is not present in the scene;
generating training examples that each include a corresponding one of the rendered images as training example input and that each include an indication of the object as training example output; and
training a machine learning model based on the training examples.

View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
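The render-and-train pipeline recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `Scene`, `render`, and `generate_training_examples` are hypothetical names, and a plain dictionary stands in for an actual rendered image.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scene:
    object_pose: tuple          # pose of the target object model relative to the viewpoint
    extra_objects: frozenset    # labels of additional object models placed in the scene

def render(scene: Scene) -> dict:
    """Stand-in for a real renderer: returns a description of the rendered image."""
    return {"pose": scene.object_pose, "content": scene.extra_objects}

def generate_training_examples(object_label, poses, distractors):
    """Render the object model at several poses, varying the additional content
    between renders, and pair each rendered image with the object label."""
    examples = []
    for i, pose in enumerate(poses):
        # Alternate the additional object models so consecutive renders differ
        # in their "additional content", as the claim requires.
        extras = frozenset([distractors[i % len(distractors)]])
        image = render(Scene(object_pose=pose, extra_objects=extras))
        examples.append({"input": image, "output": object_label})
    return examples

examples = generate_training_examples(
    object_label="mug",
    poses=[(0, 0, 0), (0, 90, 0), (45, 0, 0)],
    distractors=["bowl", "plate"],
)
```

Each example pairs a rendered view (the training example input) with an indication of the object (the training example output), ready to be fed to whatever machine learning model is being trained.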
12. A method, comprising:

identifying vision sensor data generated by a vision sensor coupled to a robot, the vision sensor data capturing a portion of an environment of the robot;
determining, based on application of the vision sensor data to one or more object models or machine learning models, that an object in the environment is not recognizable based on the object models or the machine learning models;
in response to determining the object is not recognizable, capturing additional object vision sensor data with the vision sensor, the capturing comprising moving at least one of the vision sensor and the object to capture, in the additional object vision sensor data, the object from a plurality of vantages;
providing the additional object vision sensor data to a model generation system;
receiving a model of the object in response to providing the additional object vision sensor data, the model being an additional object model, or an additional machine learning model trained based on the additional object vision sensor data;
using, by the robot, the received model to perform one or both of:
    detecting the object based on the received model and further vision sensor data generated by the vision sensor coupled to the robot, and
    estimating the pose of the object based on the received model and the further vision sensor data.

View Dependent Claims (13, 14, 15)
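The robot-side fallback of claim 12 can be sketched as a short control flow: try to recognize the object, and only on failure capture multi-vantage data and request a new model. All names here (`recognize`, `capture_from_vantages`, `acquire_model`) and the callables passed in are hypothetical stand-ins, not the patent's API.

```python
def recognize(vision_data, known_models):
    """Return the label of a matching known model, or None if unrecognized."""
    label = vision_data.get("label")
    return label if label in known_models else None

def capture_from_vantages(move_and_sense, n_vantages=8):
    """Move the vision sensor (or the object) to capture it from several vantages."""
    return [move_and_sense(k) for k in range(n_vantages)]

def acquire_model(vision_data, known_models, move_and_sense, model_generation_system):
    label = recognize(vision_data, known_models)
    if label is not None:
        return known_models[label]                  # object already recognizable
    views = capture_from_vantages(move_and_sense)   # multi-vantage capture
    return model_generation_system(views)           # request and receive a new model

known = {"cup": "cup-model"}
generate = lambda views: {"model_from_views": len(views)}
sense = lambda k: f"view-{k}"

familiar = acquire_model({"label": "cup"}, known, sense, generate)
novel = acquire_model({"label": "widget"}, known, sense, generate)
```

The received model can then be used with further vision sensor data for detection and/or pose estimation, per the final clause of the claim.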
16. A system, comprising:

a robot that includes:
    a vision sensor capturing a portion of an environment of the robot;
    one or more robot processors configured to:
        determine that an object in the environment is not recognizable,
        in response to determining the object is not recognizable, cause the vision sensor to capture vision sensor data that captures the object from a plurality of vantages, and
        submit a request for a model for the object, the request including the vision sensor data that captures the object from the plurality of vantages;
a model generation system implemented by one or more processors, wherein the model generation system is configured to:
    receive the request, and
    in response to the request, generate the model for the object based on the vision sensor data that captures the object from the plurality of vantages, wherein the model enables one or both of:
        detection of the object based on further vision sensor data and
        estimation of a pose of the object based on the further vision sensor data;
wherein one or more of the robot processors are further configured to use the generated model with further vision sensor data from the vision sensor to detect the object and/or estimate the pose of the object.

View Dependent Claims (17, 18, 19)
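The two-party system of claim 16 — a robot that submits a request with multi-vantage sensor data, and a model generation system that answers it — can be sketched with two classes. `ModelGenerationSystem`, `Robot`, and their methods are illustrative names only, and the "model" is a placeholder dictionary rather than real geometry.

```python
class ModelGenerationSystem:
    """Builds an object model from multi-vantage vision sensor data."""

    def handle_request(self, views):
        # A real system would reconstruct geometry and/or train a model from
        # the views; here we just summarize them into a placeholder model.
        return {"kind": "object_model", "n_views": len(views)}

class Robot:
    def __init__(self, model_service):
        self.model_service = model_service
        self.models = {}                    # models available for recognition

    def sense(self, k):
        return f"view-{k}"                  # stand-in for one vision sensor frame

    def encounter(self, object_id, recognizable):
        if recognizable:
            return self.models.get(object_id)
        # Not recognizable: capture the object from several vantages and
        # submit a request for a model, including the captured data.
        views = [self.sense(k) for k in range(6)]
        model = self.model_service.handle_request(views)
        self.models[object_id] = model      # reuse for later detection / pose estimation
        return model

robot = Robot(ModelGenerationSystem())
model = robot.encounter("widget", recognizable=False)
```

After the exchange, the robot holds the generated model locally, matching the claim's final clause that the robot processors use it with further vision sensor data.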
20. A method, comprising:

receiving vision sensor data generated by a vision sensor associated with a robot, the vision sensor data comprising a plurality of images of an object in an environment of the robot;
generating an object model of the object based on the vision sensor data;
generating a plurality of rendered images based on the object model, wherein the rendered images capture the object model at a plurality of different poses relative to viewpoints of the rendered images, and wherein generating the rendered images based on the object model comprises:
    rendering a first image that renders the object model and that includes first additional content and rendering a second image that renders the object model and that includes second additional content that is distinct from the first additional content;
generating training examples that each include a corresponding one of the rendered images as training example input and that each include an indication of the object as training example output;
generating a plurality of additional training examples that each include additional training example input based on a corresponding one of the images of the object in the environment and that each include the indication of the object as training example output; and
training a machine learning model based on the training examples and based on one or more of the images of the object in the environment.
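Claim 20's distinguishing step is mixing the two example sources: examples built from rendered images and additional examples built from the real images captured by the robot's vision sensor, both labeled with the same indication of the object. A minimal sketch, with `make_examples` and the toy `train` function as hypothetical stand-ins:

```python
def make_examples(images, label):
    """Pair each image (rendered or real) with the object label."""
    return [{"input": img, "output": label} for img in images]

real_images = ["cam-frame-0", "cam-frame-1"]                 # from the robot's vision sensor
rendered_images = ["render-pose-A", "render-pose-B", "render-pose-C"]

# Training examples from rendered images plus additional training
# examples from the captured images of the object in the environment.
training_set = make_examples(rendered_images, "mug") + make_examples(real_images, "mug")

def train(model_params, examples):
    """Toy stand-in for training: record how many examples of each label were seen."""
    for ex in examples:
        model_params[ex["output"]] = model_params.get(ex["output"], 0) + 1
    return model_params

model = train({}, training_set)
```

The rendered images let the training set cover many poses and varied additional content cheaply, while the real images anchor the model to the sensor's actual imaging conditions.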
Specification