Automated Action-Selection System and Method, and Application Thereof to Training Prediction Machines and Driving the Development of Self-Developing Devices
First Claim
1. An automated action-selection system adapted to generate signals specifying values for a set of one or more action variables defining an action that can be taken whereby to affect a setup S, the automated action-selection system comprising:
input means for receiving signals indicative of the value, at a time t, of a set of zero or more system-state/context parameters (SC(t)) describing the state and/or context of the setup S;
a region definer adapted to define a set of regions in a multi-dimensional system-state/context/action space, each dimension of the system-state/context/action space being defined by a respective different parameter or variable of the sets of system-state/context parameters and action variables;
means for determining a set of candidate actions, each candidate action consisting of a possible set of values for the action variables;
a region identifier for identifying the region in system-state/context/action space containing the combination of a given candidate action with values of any system-state/context parameters at time t;
a prediction unit adapted to predict the value of a set of one or more predicted variables (VAR) a predetermined interval after time t, wherein a prediction function applied by the prediction unit depends upon the region in system-state/context/action space containing the combination of this given candidate action with any system-state/context parameters at time t;
calculator means adapted to calculate, for selected candidate actions, a respective indicator of the actual error in the prediction made by the prediction unit for said selected candidate action;
memory means for storing indicators of actual prediction errors made by the prediction unit for respective candidate actions selected on one or more previous occasions;
assessment means adapted to evaluate the expected improvement in the performance of the prediction unit if a given candidate action is performed, wherein an assessment performed by the assessment means depends upon the region R in system-state/context/action space containing the combination of this given candidate action with the values, at time t, of any system-state/context parameters, and the assessment means is further adapted to evaluate said expected improvement by comparing an indicator of the actual prediction error that existed on one or more occasions, previous to time t, when the setup S had a combination of system-state/context parameters and action variables located in the same region R of the system-state/context/action space; and
means for generating a signal indicating the desirability of selecting a given candidate action for performance, said signal being dependent on the expected improvement in the performance of the prediction unit evaluated by the assessment unit for said given candidate action.
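The elements of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the fixed grid used as the region definer, the running-mean prediction function, the window length, and all names (`region_of`, `RegionExpert`, `desirability`, `select_action`) are assumptions made for illustration only. The claim does not prescribe how regions are defined, which prediction function is used, or how past errors are compared.

```python
import math
from collections import defaultdict, deque
from statistics import mean


def region_of(state, action, cell=0.5):
    """Region identifier: map a (state, action) point to a grid cell.

    A uniform grid is an assumption; the claim only requires that some
    set of regions be defined over the state/context/action space.
    """
    return tuple(math.floor(v / cell) for v in (*state, *action))


class RegionExpert:
    """Per-region prediction function plus a memory of its past errors."""

    def __init__(self, window=5):
        self.window = window
        self.total = 0.0
        self.count = 0
        # Memory means: keep only the last 2 * window prediction errors.
        self.errors = deque(maxlen=2 * window)

    def predict(self):
        # Hypothetical prediction function: running mean of past outcomes.
        return self.total / self.count if self.count else 0.0

    def update(self, observed):
        # Calculator means: actual error of the prediction just made.
        error = abs(observed - self.predict())
        self.errors.append(error)
        self.total += observed
        self.count += 1
        return error

    def expected_improvement(self):
        # Assessment means: compare older errors with recent errors in
        # this region. Regions with too little history are treated as
        # maximally promising (an assumption, not stated in the claim).
        if len(self.errors) < 2 * self.window:
            return float("inf")
        errs = list(self.errors)
        return mean(errs[: self.window]) - mean(errs[self.window:])


def desirability(experts, state, action, cell=0.5):
    """Signal indicating how desirable a candidate action is."""
    return experts[region_of(state, action, cell)].expected_improvement()


def select_action(experts, state, candidates, cell=0.5):
    """Pick the candidate whose region promises the largest improvement."""
    return max(candidates, key=lambda a: desirability(experts, state, a, cell))
```

In use, `experts` would be something like `defaultdict(RegionExpert)`. After the selected action is performed and the predicted variable is observed, calling `update` on the expert for that region records the new prediction error, so later assessments in the same region draw on it.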
Abstract
In order to promote efficient learning of relationships inherent in a system or setup S described by system-state and context parameters, the next action to take, affecting the setup, is determined based on the knowledge gain expected to result from this action. Knowledge gain is assessed “locally” by comparing the value of a knowledge-indicator parameter after the action with the value of this indicator on one or more previous occasions when the system-state/context parameter(s) and action variable(s) had similar values to the current ones. Preferably the “level of knowledge” is assessed based on the accuracy of predictions made by a prediction module. This technique can be applied to train a prediction machine by causing it to participate in the selection of a sequence of actions. This technique can also be applied for managing development of a self-developing device or system, the self-developing device or system performing a sequence of actions selected according to the action-selection technique.
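One possible way to write the "local" knowledge gain as a formula is given here as an illustration only. The symbols are assumptions, not notation taken from the patent: e_R(i) is the i-th prediction error recorded in region R, n is the number of errors recorded there so far, and w is a hypothetical window length.

```latex
G_R \;=\; \frac{1}{w}\sum_{i=n-2w+1}^{n-w} e_R(i) \;-\; \frac{1}{w}\sum_{i=n-w+1}^{n} e_R(i)
```

Under this reading, a positive G_R means that prediction errors in region R have been decreasing, so actions falling in R are still yielding knowledge. A value near zero means the region is either already well predicted or not learnable with the current predictor.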
25 Claims
Specification