Deep machine learning methods and apparatus for robotic grasping
Abstract
Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
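As a concrete illustration of the two predictions described above, the following is a minimal training-step sketch, assuming a two-headed model whose grasp head is supervised by observed grasp success and whose semantic head is supervised by object labels recorded for successful grasps. The model signature, batch fields, and loss masking are assumptions for illustration (written in PyTorch); the patent does not disclose this implementation.

```python
# Illustrative only: the model interface and batch fields below are
# hypothetical, not the patent's disclosed implementation.
import torch
import torch.nn.functional as F

def training_step(model, batch, optimizer):
    """One joint update over a grasp-success head and a semantic head."""
    # batch["image"]:   pre-grasp camera image, shape (N, 3, H, W)
    # batch["motion"]:  candidate end effector motion vector, shape (N, D)
    # batch["grasped"]: 1.0 if the grasp attempt succeeded, else 0.0, shape (N,)
    # batch["label"]:   class index of the grasped object, shape (N,)
    grasp_logit, semantic_logits = model(batch["image"], batch["motion"])

    # Grasp measure: binary cross-entropy against observed grasp success.
    grasp_loss = F.binary_cross_entropy_with_logits(
        grasp_logit.squeeze(-1), batch["grasped"])

    # Semantic measure: cross-entropy against the object label, masked
    # (an assumption) so that failed grasps, which yield no reliable
    # label for the grasped object, contribute nothing.
    per_example = F.cross_entropy(
        semantic_logits, batch["label"], reduction="none")
    semantic_loss = (per_example * batch["grasped"]).mean()

    loss = grasp_loss + semantic_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```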
Claims (12 in total; the two independent claims are reproduced below)
1. A system, comprising:
    a vision sensor viewing an environment of a robot;
    a semantic grasping model stored in one or more non-transitory computer readable media;
    at least one processor configured to:
        identify a current image captured by the vision sensor;
        generate, over a portion of the semantic grasping model based on application of the current image to the portion:
            a measure of successful grasp, by a grasping end effector of the robot, of an object captured in the current image, wherein the measure of successful grasp indicates, directly or indirectly, a probability, and
            spatial transformation parameters that indicate a location;
        generate a spatial transformation, of the current image or of an additional image captured by the vision sensor, based on the spatial transformation parameters;
        apply the spatial transformation as input to an additional portion of the semantic grasping model, wherein the additional portion is a deep neural network;
        generate, over the additional portion and based on the spatial transformation, an additional measure that indicates whether a desired object semantic feature is present in the spatial transformation;
        generate an end effector command based on the measure and the additional measure; and
        provide the end effector command to one or more actuators of the robot.

(Dependent claims 2-9 not reproduced here.)
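Purely as a sketch of the data flow recited in claim 1, the fragment below wires a "portion" that emits a grasp measure plus spatial transformation parameters to an "additional portion" that scores a semantic feature on the spatially transformed image. The layer shapes, the affine-crop spatial transformer, and all names are assumptions, not the patented architecture.

```python
# Hypothetical sketch of the claimed two-stage flow; layer sizes and
# names are assumptions, not the patent's disclosed architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGraspModel(nn.Module):
    """A grasp portion that scores grasp success and emits spatial
    transformation parameters, plus a semantic portion (a deep network)
    applied to the spatially transformed image."""

    def __init__(self, num_classes: int = 16):
        super().__init__()
        # Grasp portion: small CNN over the current image.
        self.grasp_portion = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.grasp_head = nn.Linear(64, 1)      # grasp success logit
        self.transform_head = nn.Linear(64, 6)  # 2x3 affine parameters

        # STN convention: start the transform at the identity mapping.
        nn.init.zeros_(self.transform_head.weight)
        self.transform_head.bias.data.copy_(
            torch.tensor([1., 0., 0., 0., 1., 0.]))

        # Semantic portion: deep network over the transformed image.
        self.semantic_portion = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_classes))

    def forward(self, image: torch.Tensor):
        feats = self.grasp_portion(image)
        grasp_prob = torch.sigmoid(self.grasp_head(feats))  # the "measure"

        # Spatial transformation parameters indicating a location, applied
        # to the current image as a differentiable affine crop.
        theta = self.transform_head(feats).view(-1, 2, 3)
        grid = F.affine_grid(theta, image.size(), align_corners=False)
        transformed = F.grid_sample(image, grid, align_corners=False)

        # The "additional measure": semantic prediction on the crop.
        semantic_logits = self.semantic_portion(transformed)
        return grasp_prob, semantic_logits, transformed
```

An end effector command would then be derived from `grasp_prob` together with the semantic logits, for example commanding a grasp only when both measures satisfy their criteria, consistent with the final two claim elements.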
10. A method implemented by one or more processors, comprising:
    identifying a desired object semantic feature for a grasp attempt;
    generating a candidate end effector motion vector defining motion to move a grasping end effector of a robot from a current pose to an additional pose;
    identifying a current image captured by a vision sensor associated with the robot, the current image capturing the grasping end effector and an object in an environment of the robot;
    applying the current image and the candidate end effector motion vector as input to a trained semantic grasping model;
    generating, based on processing of the current image and the candidate end effector motion vector using the trained semantic grasping model:
        a measure of successful grasp of the object with application of the motion, and
        an additional measure that indicates whether the object has the desired object semantic feature;
    generating a grasp command based on determining that the measure of successful grasp satisfies one or more criteria and that the additional measure indicates that the object has the desired object semantic feature; and
    providing the grasp command to one or more actuators of the robot to cause the end effector to attempt a grasp of the object.

(Dependent claims 11 and 12 not reproduced here.)
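A hedged sketch of how the claimed method might run as a servo loop follows: sample candidate end effector motion vectors, score each with the trained model, and issue the grasp command once both measures satisfy their criteria. The uniform random sampling, thresholds, and model signature are assumptions; they stand in for whatever candidate-generation scheme an implementation would use.

```python
# Illustrative servoing loop; model interface and thresholds are assumed.
import numpy as np

def choose_motion(model, image, desired_class,
                  num_candidates=64, grasp_threshold=0.9,
                  semantic_threshold=0.5, motion_dim=5):
    """Scores candidate end effector motion vectors with the trained
    semantic grasping model and returns either a grasp command or the
    best motion found so far for continued servoing."""
    candidates = np.random.uniform(-1.0, 1.0,
                                   size=(num_candidates, motion_dim))

    best, best_score = None, -np.inf
    for motion in candidates:
        # model(...) is assumed to return (grasp success probability,
        # per-class semantic probabilities) for one image/motion pair.
        grasp_prob, class_probs = model(image, motion)
        score = grasp_prob * class_probs[desired_class]
        if score > best_score:
            best, best_score = (motion, grasp_prob, class_probs), score

    motion, grasp_prob, class_probs = best
    if (grasp_prob >= grasp_threshold
            and class_probs[desired_class] >= semantic_threshold):
        return ("grasp", motion)  # both criteria met: issue grasp command
    return ("move", motion)       # otherwise keep servoing the end effector
```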
Specification