Deep machine learning methods and apparatus for robotic grasping
Abstract
Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
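As a rough, non-authoritative sketch of the two-branch semantic grasping model the abstract describes, the PyTorch-style module below pairs a grasp-success prediction (conditioned on the current image and a candidate end effector motion vector) with a semantic-feature prediction over a transformed image crop. All layer sizes, the 5-dimensional motion vector, and the module names are assumptions for illustration, not taken from the patent.

```python
import torch
import torch.nn as nn

class SemanticGraspingModel(nn.Module):
    """Illustrative two-branch model: one branch scores grasp success for a
    candidate end effector motion, the other scores whether a desired
    semantic feature is present. Architecture and sizes are hypothetical."""

    def __init__(self, num_semantic_features: int = 16, motion_dim: int = 5):
        super().__init__()
        # Grasp branch: convolutional trunk over the current image.
        self.grasp_trunk = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # The candidate end effector motion vector is fused with image features.
        self.grasp_head = nn.Sequential(
            nn.Linear(64 + motion_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),  # measure of successful grasp
        )
        # Semantic branch: classifies a (spatially transformed) image crop.
        self.semantic_net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_semantic_features),  # per-feature logits
        )

    def forward(self, image, motion_vector, transformed_crop):
        feats = self.grasp_trunk(image)
        grasp_measure = self.grasp_head(torch.cat([feats, motion_vector], dim=1))
        semantic_logits = self.semantic_net(transformed_crop)
        return grasp_measure, semantic_logits
```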
18 Claims
1. A method implemented by one or more processors, comprising:
generating a candidate end effector motion vector defining motion to move a grasping end effector of a robot from a current pose to an additional pose;
identifying a current image captured by a vision sensor associated with the robot, the current image capturing the grasping end effector and at least one object in an environment of the robot;
applying the current image and the candidate end effector motion vector as input to a trained grasp convolutional neural network;
generating, over the trained grasp convolutional neural network, a measure of successful grasp of the object with application of the motion, the measure being generated based on the application of the image and the end effector motion vector to the trained grasp convolutional neural network;
identifying a desired object semantic feature;
applying, as input to a semantic convolutional neural network, a spatial transformation of the current image or of an additional image captured by the vision sensor;
generating, over the semantic convolutional neural network based on the spatial transformation, an additional measure that indicates whether the desired object semantic feature is present in the spatial transformation;
generating an end effector command based on the measure of successful grasp and the additional measure that indicates whether the desired object semantic feature is present; and
providing the end effector command to one or more actuators of the robot.
Dependent claims: 2–14.
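A minimal sketch of how the two measures recited in claim 1 might be combined into an end effector command. The thresholds and the command vocabulary below are hypothetical; the claim itself does not prescribe them.

```python
def generate_end_effector_command(grasp_measure: float,
                                  semantic_measure: float,
                                  candidate_motion,
                                  grasp_threshold: float = 0.8,
                                  semantic_threshold: float = 0.5):
    """Combine the grasp-success measure and the semantic-feature measure
    into a command. Thresholds and command names are assumptions."""
    if grasp_measure >= grasp_threshold and semantic_measure >= semantic_threshold:
        # Likely a successful grasp of an object with the desired feature:
        # execute the candidate motion and close the gripper.
        return {"motion": candidate_motion, "gripper": "close"}
    if semantic_measure < semantic_threshold:
        # The object in view likely lacks the desired feature: keep the
        # gripper open and continue searching.
        return {"motion": candidate_motion, "gripper": "open"}
    # Right object, but grasp not yet promising: keep servoing toward
    # a better grasp pose.
    return {"motion": candidate_motion, "gripper": "hold"}
```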
15. A method implemented by one or more processors, comprising:
identifying a current image captured by a vision sensor associated with a robot;
generating, over a grasp convolutional neural network based on application of the current image to the grasp convolutional neural network: a measure of successful grasp, by a grasping end effector of the robot, of an object captured in the current image, and spatial transformation parameters;
generating, over a spatial transformer network, a spatial transformation based on the spatial transformation parameters, the spatial transformation being of the current image or an additional image captured by the vision sensor;
applying the spatial transformation as input to a semantic convolutional neural network;
generating, over the semantic convolutional neural network based on the spatial transformation, an additional measure that indicates whether a desired object semantic feature is present in the spatial transformation;
generating an end effector command based on the measure and the additional measure; and
providing the end effector command to one or more actuators of the robot.
Dependent claims: 16–18.
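Claim 15 differs from claim 1 in that the grasp network also predicts spatial transformation parameters, which a spatial transformer network uses to warp the image before the semantic network sees it. Below is a hedged sketch of that step using the standard affine spatial transformer formulation; the 2x3 affine parameterization, output size, and helper names are assumptions, not taken from the claim.

```python
import torch
import torch.nn.functional as F

def spatial_transform(image: torch.Tensor, theta: torch.Tensor,
                      out_size=(64, 64)) -> torch.Tensor:
    """Spatial transformer step: `theta` holds per-example 2x3 affine
    matrices (e.g., predicted alongside the grasp measure); the result is
    a warped crop suitable as input to the semantic network."""
    n = image.shape[0]
    grid = F.affine_grid(theta.view(n, 2, 3),
                         (n, image.shape[1], *out_size),
                         align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)

# Usage sketch (names are hypothetical):
# grasp_measure, theta = grasp_cnn(current_image)
# crop = spatial_transform(current_image, theta)
# semantic_measure = semantic_cnn(crop)
```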
Specification