Controller for partially observable systems
Abstract
A controller is provided, operable to control a system on the basis of measurement data received from a plurality of sensors indicative of a state of the system, with at least partial autonomy, even in environments in which it is not possible to fully determine the state of the system on the basis of such sensor measurement data. The controller includes: a system model, defining at least a set of probabilities for the dynamical evolution of the system and corresponding measurement models for the plurality of sensors of the system; a stochastic estimator operable to receive measurement data from the sensors and, with reference to the system model, to generate a plurality of samples each representative of the state of the system; a rule set corresponding to the system model, defining, for each of a plurality of possible samples representing possible states of the system, information defining an action to be carried out in the system; and an action selector, operable to receive an output of the stochastic estimator and to select, with reference to the rule set, information defining one or more corresponding actions to be performed in the system.
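The four components named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the bootstrap-resampling estimator, and the majority-vote choice of a representative sample state are all layered on top of the abstract's description.

```python
import random

# Minimal sketch of the claimed architecture: a particle-filter style
# stochastic estimator plus a rule-set-driven action selector.
# All names and the resampling scheme are illustrative assumptions.

class ParticleEstimator:
    """Stochastic estimator: holds samples ("particles") of the system state."""

    def __init__(self, transition_model, measurement_model, n_particles=50):
        self.transition = transition_model    # system model: next state from (state, action)
        self.measure = measurement_model      # measurement model: likelihood of obs given state
        self.particles = [0] * n_particles    # crude initial belief: all mass on state 0

    def update(self, action, observation):
        # Propagate every particle through the probabilistic system model,
        # then resample in proportion to the measurement likelihood.
        proposed = [self.transition(p, action) for p in self.particles]
        weights = [self.measure(observation, p) for p in proposed]
        self.particles = random.choices(proposed, weights=weights, k=len(proposed))
        return self.particles

class ActionSelector:
    """Looks up the rule set for a representative sample of the state."""

    def __init__(self, rule_set):
        self.rule_set = rule_set              # maps sample state -> action

    def select(self, particles):
        # Take the most common particle as the representative sample state.
        representative = max(set(particles), key=particles.count)
        return self.rule_set[representative]
```

As a deterministic walk-through: with a system model `lambda s, a: s + a` and a measurement model that strongly favours the observed state, one update step concentrates the particles on the consistent state, and the selector then reads the corresponding action out of the rule set.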
9 Claims
1. A controller, for controlling a system on the basis of measurement data received from a plurality of sensors indicative of a state of the system, wherein the controller comprises:
a system model and corresponding measurement models for the plurality of sensors of the system;

a stochastic estimator for receiving measurement data from the plurality of sensors and for generating, with reference to the system model, a plurality of samples each representative of the state of the system;

a rule set corresponding to the system model, defining, for each of a plurality of possible samples representing possible states of the system, information defining an action to be carried out in the system; and

an action selector, for receiving an output of the stochastic estimator and for selecting, with reference to the rule set, information defining one or more corresponding actions to be performed in the system;

wherein the controller is configured to:

(i) construct an initial partially observable Markov decision process (POMDP) model representing the dynamics of the system to be controlled, wherein the POMDP model comprises a representation of the states of the system, a measurement model, one or more control actions, and measures of benefit likely to arise from the selection of particular control actions;

(ii) transform the initial POMDP model into a subsidiary Markov decision process (MDP) model, comprising generating a sample state space representation for the subsidiary model, and generating an initial probabilistic system model and control rule set using the sample state representation; and

(iii) use observations of the system and of the environment by a plurality of sensors to update the control rule set and the probabilistic system model of the subsidiary MDP based upon the observed effects of selected control actions and with reference to the measures of benefit.
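Steps (i) and (ii) of the claim can be illustrated with a toy two-state example. The dictionary layout, the one-dimensional discretisation of the belief over a "fault" state, and the myopic (single-step expected-benefit) initialisation of the rule set are assumptions made purely for this sketch; the claim itself prescribes none of them.

```python
# Toy sketch of claim steps (i)-(ii): build a two-state POMDP, then derive
# a subsidiary MDP whose sample states discretise the belief in "fault".
# Structure and names are illustrative, not the patent's own.

def build_pomdp():
    """Step (i): states, measurement model, actions, measures of benefit."""
    return {
        "states": ["ok", "fault"],
        "actions": ["run", "repair"],
        "measurement": {"ok": {"green": 0.9, "red": 0.1},      # p(obs | state)
                        "fault": {"green": 0.2, "red": 0.8}},
        "benefit": {("ok", "run"): 1.0, ("ok", "repair"): -0.5,
                    ("fault", "run"): -1.0, ("fault", "repair"): 0.5},
    }

def to_subsidiary_mdp(pomdp, n_bins=5):
    """Step (ii): sample state space plus an initial control rule set."""
    # Each sample state is a discretised belief: the probability of "fault".
    sample_states = [i / (n_bins - 1) for i in range(n_bins)]
    rule_set = {}
    for b in sample_states:
        belief = {"ok": 1.0 - b, "fault": b}
        # Initial rule: the action with the highest single-step expected benefit.
        rule_set[b] = max(pomdp["actions"],
                          key=lambda a: sum(belief[s] * pomdp["benefit"][(s, a)]
                                            for s in pomdp["states"]))
    return {"sample_states": sample_states, "rule_set": rule_set}
```

Under these assumed numbers, a belief fully on "ok" yields the rule "run" and a belief fully on "fault" yields "repair"; intermediate sample states interpolate between the two.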
2. A method for controlling a system that enables autonomous operation of the system in an environment in which selected control actions have uncertain consequences, the method comprising:
(i) constructing an initial partially observable Markov decision process (POMDP) model representing the dynamics of the system to be controlled, wherein the POMDP model includes a representation of the states of the system, a measurement model, one or more control actions, and measures of benefit likely to arise from the selection of particular control actions;

(ii) transforming the initial POMDP model into a subsidiary Markov decision process (MDP) model, including generating a sample state space representation for the subsidiary model, and generating an initial probabilistic system model and control rule set using the sample state representation; and

(iii) using observations of the system and of the environment by a plurality of sensors to update the control rule set and the probabilistic system model of the subsidiary MDP based upon the observed effects of selected control actions and with reference to the measures of benefit.

Dependent claims: 3, 4, 5.
6. A computer-readable medium having a computer program, executable by a computer, comprising:

a program code arrangement having program code for controlling a system that enables autonomous operation of the system in an environment in which selected control actions have uncertain consequences, by performing the following:

(i) constructing an initial partially observable Markov decision process (POMDP) model representing the dynamics of the system to be controlled, wherein the POMDP model includes a representation of the states of the system, a measurement model, one or more control actions, and measures of benefit likely to arise from the selection of particular control actions;

(ii) transforming the initial POMDP model into a subsidiary Markov decision process (MDP) model, including generating a sample state space representation for the subsidiary model, and generating an initial probabilistic system model and control rule set using the sample state representation; and

(iii) using observations of the system and of the environment by a plurality of sensors to update the control rule set and the probabilistic system model of the subsidiary MDP based upon the observed effects of selected control actions and with reference to the measures of benefit.

Dependent claims: 7, 8, 9.
Specification