APPROXIMATE VALUE ITERATION WITH COMPLEX RETURNS BY BOUNDING

  • US 20180012137A1
  • Filed: 11/22/2016
  • Published: 01/11/2018
  • Est. Priority Date: 11/24/2015
  • Status: Active Grant
First Claim

1. A method for controlling a system, comprising:

  • providing a set of data representing a plurality of states and associated trajectories of an environment of the system;

    iteratively determining an estimate of an optimal control policy for the system, comprising performing the substeps until convergence;

    estimating a long term value for operation at a respective state of the environment over a series of predicted future environmental states;

    using a complex return of the data set to determine a bound to improve the estimated long term value; and

    producing an updated estimate of an optimal control policy dependent on the improved estimate of the long term value; and

    at least one of:

    updating an automated controller for controlling the system with the updated estimate of the optimal control policy, wherein the automated controller operates according to the updated estimate of the optimal control policy to automatically alter at least one of a state of the system and the environment of the system; and

    controlling the system with the updated estimate of the optimal control policy, according to the updated estimate of the optimal control policy to automatically alter at least one of a state of the system and the environment of the system.
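
The iterative substeps of the claim can be illustrated with a minimal tabular sketch in Python. It is not the patented implementation, and it rests on several assumptions that the claim text does not state: the "complex return" is taken to be the family of n-step returns along a recorded trajectory, the "bound" is taken to be a lower bound formed by the maximum over those returns, the value estimate is a lookup table averaged per state-action pair, and the environment is small and discrete. All names (`bounded_targets`, `bounded_value_iteration`, `greedy_policy`) and the sample layout `(state, action, reward, next_state, done)` are hypothetical.

```python
from collections import defaultdict


def bounded_targets(trajectory, q, n_actions, gamma):
    """Return one regression target per step of a trajectory.

    Each target is the maximum over n of the n-step return; n = 1 is the
    usual one-step value-iteration target. Treating this maximum as a
    lower bound on the long-term value is an assumption of this sketch.
    """
    targets = []
    horizon = len(trajectory)
    for t in range(horizon):
        best = float("-inf")
        discounted_rewards = 0.0
        for n in range(1, horizon - t + 1):
            _, _, reward, next_state, done = trajectory[t + n - 1]
            discounted_rewards += gamma ** (n - 1) * reward
            # Bootstrap from the current estimate unless the episode ended.
            bootstrap = 0.0 if done else max(
                q[(next_state, b)] for b in range(n_actions)
            )
            best = max(best, discounted_rewards + gamma ** n * bootstrap)
            if done:
                break
        targets.append(best)
    return targets


def bounded_value_iteration(trajectories, n_actions, gamma=0.9,
                            tol=1e-8, max_iters=1000):
    """Repeat the bounded update until the table stops changing."""
    q = defaultdict(float)  # unseen state-action pairs default to 0.0
    for _ in range(max_iters):
        sums, counts = defaultdict(float), defaultdict(int)
        for trajectory in trajectories:
            targets = bounded_targets(trajectory, q, n_actions, gamma)
            for (state, action, *_), target in zip(trajectory, targets):
                sums[(state, action)] += target
                counts[(state, action)] += 1
        # Tabular "fit": average the targets seen for each state-action pair.
        new_q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
        delta = max(abs(new_q[k] - q[k]) for k in sums)
        q = new_q
        if delta < tol:
            break
    return q


def greedy_policy(q, states, n_actions):
    """Updated policy estimate: pick the highest-valued action per state."""
    return {s: max(range(n_actions), key=lambda a: q[(s, a)]) for s in states}
```

In this sketch, a controller would be updated by passing the result of `bounded_value_iteration` to `greedy_policy` and acting on the returned state-to-action mapping. A function approximator could replace the averaging step, but that choice is likewise outside what the claim specifies.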
