Apparatus and methods for control of robot actions based on corrective user inputs
First Claim
1. A method for performing robot actions by a robot, the method comprising:
defining a policy comprising a plurality of parameters for determining robot actions based at least in part on sensory-data inputs, the defining of the policy comprising mapping the sensory-data inputs to robot actions;
receiving a first sensory-data input from a sensor;
performing a first robot action at a first action time, wherein the first robot action is determined based at least in part on the first sensory-data input and application of the policy;
determining that a user input was received at an input time corresponding to the first action time, wherein a corrective command at least partially derived from the user input specifies a corrective robot action for physical performance, the user input being indicative of at least partial dissatisfaction with the first robot action; and
modifying the policy based on the corrective command and the first sensory-data input.
Abstract
Robots have the capacity to perform a broad range of useful tasks, such as factory automation, cleaning, delivery, assistive care, environmental monitoring and entertainment. Enabling a robot to perform a new task in a new environment typically requires a large amount of new software to be written, often by a team of experts. It would be valuable if future technology could empower people, who may have limited or no understanding of software coding, to train robots to perform custom tasks. Some implementations of the present invention provide methods and systems that respond to users' corrective commands to generate and refine a policy for determining appropriate actions based on sensor-data input. Upon completion of learning, the system can generate control commands by deriving them from the sensory data. Using the learned control policy, the robot can behave autonomously.
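The learning loop described in the abstract can be illustrated with a minimal sketch. It assumes a linear policy and a normalized least-mean-squares update, neither of which is stated in the source; the names `LinearPolicy`, `act`, `correct`, `run_step` and `learning_rate` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LinearPolicy:
    """Hypothetical policy: each action dimension is a weighted sum of the sensory input."""
    n_inputs: int
    n_actions: int
    learning_rate: float = 0.5  # assumed value, not taken from the source
    weights: List[List[float]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The policy parameters start at zero: one weight row per action dimension.
        if not self.weights:
            self.weights = [[0.0] * self.n_inputs for _ in range(self.n_actions)]

    def act(self, sensory_input: List[float]) -> List[float]:
        """Map a sensory-data input to a robot action by applying the policy."""
        return [sum(w * x for w, x in zip(row, sensory_input)) for row in self.weights]

    def correct(self, sensory_input: List[float], corrective_action: List[float]) -> None:
        """Modify the policy using the corrective command and the sensory input it applies to."""
        predicted = self.act(sensory_input)
        norm = sum(x * x for x in sensory_input) or 1.0  # avoid dividing by zero
        for row, target, pred in zip(self.weights, corrective_action, predicted):
            step = self.learning_rate * (target - pred) / norm
            for j, x in enumerate(sensory_input):
                row[j] += step * x


def run_step(policy: LinearPolicy,
             sensory_input: List[float],
             corrective_action: Optional[List[float]] = None) -> List[float]:
    """Choose an action from the policy; if the user corrected it, update the policy."""
    action = policy.act(sensory_input)
    if corrective_action is not None:
        policy.correct(sensory_input, corrective_action)
    return action
```

In this sketch, setting `learning_rate=1.0` makes a single correction sufficient for the policy to reproduce the corrective action on that same input, while smaller values move the policy only part of the way per correction.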
21 Claims
1. A method for performing robot actions by a robot, the method comprising:
defining a policy comprising a plurality of parameters for determining robot actions based at least in part on sensory-data inputs, the defining of the policy comprising mapping the sensory-data inputs to robot actions;
receiving a first sensory-data input from a sensor;
performing a first robot action at a first action time, wherein the first robot action is determined based at least in part on the first sensory-data input and application of the policy;
determining that a user input was received at an input time corresponding to the first action time, wherein a corrective command at least partially derived from the user input specifies a corrective robot action for physical performance, the user input being indicative of at least partial dissatisfaction with the first robot action; and
modifying the policy based on the corrective command and the first sensory-data input.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A robot, comprising:
an actuator configured to perform robot actions for robotic tasks;
a sensor configured to detect an environmental context of the robot and generate sensory-data inputs; and
a processor apparatus configured to:
define a policy comprising a plurality of parameters configured to determine robot actions based at least in part on sensory-data inputs;
determine that a user input was received at an input time corresponding to a performance of a first robot action corresponding to a detection of a first sensory-data input;
generate a corrective command at least partially derived from the user input, the user input being indicative of at least partial dissatisfaction with the first robot action; and
modify the policy based on the corrective command and the first sensory-data input.
Dependent claims: 9, 10, 11, 12, 13
14. A non-transitory computer-readable storage medium having a plurality of instructions stored thereon, the instructions being executable by a processing apparatus to operate a robot, the instructions configured to, when executed by the processing apparatus, cause the processing apparatus to:
define a policy comprising a plurality of parameters configured to determine robot actions based at least in part on sensory-data inputs, wherein the policy maps the sensory-data inputs to robot actions;
receive a first sensory-data input;
perform a first robot action at a first action time, wherein the first action is determined based at least in part on the first sensory-data input and application of the policy;
determine that a user input was received at an input time corresponding to the first action time, wherein a corrective command at least partially derived from the user input specifies a corrective robot action for physical performance, the user input being indicative of at least partial dissatisfaction with the first robot action; and
modify the policy based on the corrective command and the first sensory-data input.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
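The independent claims each recite determining that a user input was received at an input time corresponding to the time of a robot action. The claims do not say how that correspondence is established. One possible reading, sketched below purely as an assumption, is to keep a short history of performed actions and attribute a user input to the most recent action that preceded it within a fixed delay window. The names `ActionRecord`, `ActionHistory`, `match` and the `max_delay_s` parameter are hypothetical.

```python
from collections import deque
from typing import Deque, List, NamedTuple, Optional


class ActionRecord(NamedTuple):
    """One performed action together with the sensory input that produced it."""
    action_time: float          # seconds, on any monotonic clock
    sensory_input: List[float]
    action: List[float]


class ActionHistory:
    """Hypothetical buffer used to attribute a user input to a recent robot action."""

    def __init__(self, max_delay_s: float = 1.0, capacity: int = 100) -> None:
        # max_delay_s is an assumed allowance for human reaction time.
        self.max_delay_s = max_delay_s
        self._records: Deque[ActionRecord] = deque(maxlen=capacity)

    def record(self, rec: ActionRecord) -> None:
        """Store an action as it is performed; the oldest entries are dropped when full."""
        self._records.append(rec)

    def match(self, input_time: float) -> Optional[ActionRecord]:
        """Return the latest action performed at or before input_time and within the window."""
        for rec in reversed(self._records):
            delay = input_time - rec.action_time
            if 0.0 <= delay <= self.max_delay_s:
                return rec
        return None
```

The record returned by `match` carries the sensory input that the policy update would then be based on; when no action falls inside the window, the user input is simply not treated as a correction in this sketch.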
Specification