Apparatus and methods for training of robotic control arbitration
First Claim
1. A processor-implemented method of learning arbitration for two physical tasks by a controller of a robot, the method being performed by one or more processors configured to execute computer program modules, the method comprising:
- during a given training trial of a plurality of trials:
receiving a control signal configured to indicate a simultaneous execution of two physical tasks by the robot;
selecting one of the two physical tasks;
evaluating an error measure determined based on a target physical task and an execution of the selected one of the two physical tasks by the robot, the two physical tasks comprising a first physical task and a second physical task;
based on the error measure being within a target range from a previous error measure obtained during a previous training trial of the plurality of trials and prior to the given training trial, receiving a reinforcement signal comprising information associated with the target physical task, and associating the target physical task to the selected one of the two physical tasks; and
- during a subsequent training trial of a plurality of trials:
based on the reinforcement signal, determining an association between a sensory context and the target physical task, and when the association is determined, executing the target physical task via the robot based on (1) an occurrence of the sensory context after the given training trial during the subsequent training trial of the plurality of trials, and (2) an absence of receiving the reinforcement signal during the subsequent training trial.
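The flow recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the class `ArbitrationLearner`, its method names, the use of a plain task name as the reinforcement signal, and the reading of "within a target range" as an absolute difference between consecutive error measures are all assumptions made for illustration.

```python
class ArbitrationLearner:
    """Hypothetical sketch of the claimed training flow (all names are assumptions)."""

    def __init__(self, target_range):
        self.target_range = target_range  # allowed change in error between trials
        self.previous_error = None        # error measure from the previous trial
        self.associations = {}            # sensory context -> target task

    def training_trial(self, context, tasks, select, error_of, reinforcement=None):
        # Both tasks are requested at once; pick one of them.
        selected = select(tasks)
        # Error measure for the executed task relative to the target task.
        error = error_of(selected)
        # Assumed reading of "within a target range from a previous error measure".
        settled = (
            self.previous_error is not None
            and abs(error - self.previous_error) <= self.target_range
        )
        # The reinforcement signal (here simply the target task name) is used
        # only once the error has settled; it ties the context to the target task.
        if settled and reinforcement is not None:
            self.associations[context] = reinforcement
        self.previous_error = error
        return selected

    def act(self, context, tasks, select, reinforcement=None):
        # Subsequent trial: with no reinforcement present, a learned association
        # decides which task runs when the sensory context occurs.
        if reinforcement is None and context in self.associations:
            return self.associations[context]
        return select(tasks)


learner = ArbitrationLearner(target_range=0.1)
tasks = ("approach", "avoid")
pick_first = lambda ts: ts[0]          # stand-in selection rule
errors = iter([0.50, 0.45])            # made-up error measures for two trials
error_of = lambda task: next(errors)

learner.training_trial("obstacle_ahead", tasks, pick_first, error_of, reinforcement="avoid")
learner.training_trial("obstacle_ahead", tasks, pick_first, error_of, reinforcement="avoid")
chosen = learner.act("obstacle_ahead", tasks, pick_first)
fallback = learner.act("open_floor", tasks, pick_first)
```

In this sketch the first trial only records an error measure, the second trial forms the association because the error changed by less than the target range, and the later call to `act` executes the target task without any reinforcement signal.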
Abstract
Apparatus and methods for arbitration of control signals for robotic devices. A robotic device may comprise an adaptive controller comprising a plurality of predictors configured to provide multiple predicted control signals based on one or more of a teaching input, a sensory input, and/or performance. The predicted control signals may be configured to cause two or more actions that may be in conflict with one another and/or utilize a shared resource. An arbitrator may be employed to select one of the actions. The selection process may utilize winner-take-all (WTA), reinforcement, and/or supervisory mechanisms in order to inhibit one or more predicted signals. The arbitrator output may comprise target state information that may be provided to the predictor block. Prior to arbitration, the predicted control signals may be combined with inputs provided by an external control entity in order to reduce learning time.
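As a rough illustration of the winner-take-all selection mentioned in the abstract, the sketch below keeps the strongest predicted control signal and inhibits the others by setting them to zero. The function name `wta_arbitrate`, the dictionary representation of predicted signals, and the example values are assumptions for illustration only; the abstract does not specify how the signals are encoded or compared.

```python
from typing import Dict, Tuple


def wta_arbitrate(predicted: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Winner-take-all: keep the strongest predicted signal, zero out the rest."""
    if not predicted:
        raise ValueError("no predicted control signals to arbitrate")
    # The action with the largest predicted signal wins (first one on ties).
    winner = max(predicted, key=predicted.get)
    # Inhibit every non-winning signal.
    inhibited = {
        name: (value if name == winner else 0.0)
        for name, value in predicted.items()
    }
    return winner, inhibited


# Two conflicting actions competing for a shared resource (made-up values).
winner, output = wta_arbitrate({"approach": 0.7, "avoid": 0.4})
```

Here `winner` names the selected action and `output` holds the signals after inhibition.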
20 Claims
20. A computerized system for learning task arbitration by a robot, the system comprising:
an interface configured to detect a reinforcement signal;
a processing component; and
a non-transitory memory configured to store a plurality of computer instructions that when executed by the processing component, are configured to cause the computerized system to:
- during a given training trial of a plurality of training trials:
receive a control signal configured to indicate a simultaneous execution of two physical tasks by the robot;
select one task of the two physical tasks based on a selection signal associated with the selected one task;
determine an error measure based on a target physical task and an execution of the selected one task of the two physical tasks by the robot, the two physical tasks comprising a first physical task and a second physical task;
based on the error measure being within a desired range from a previous error measure obtained during another training trial of the plurality of training trials and prior to the given training trial, evaluate the reinforcement signal comprising information associated with the target physical task, the target physical task being associated with one of the two physical tasks; and
responsive to the evaluation of the reinforcement signal, determine an association between a sensory context and the target physical task, and execute the target physical task via the robot based on (1) an occurrence of the sensory context after the given training trial during a subsequent training trial of the plurality of training trials, (2) an absence of a receipt of the reinforcement signal during the subsequent training trial, and (3) the determined association.
Specification