
Control policy learning and vehicle control method based on reinforcement learning without active exploration

  • US 10,061,316 B2
  • Filed: 05/12/2017
  • Issued: 08/28/2018
  • Est. Priority Date: 07/08/2016
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for autonomously controlling a vehicle to perform a vehicle operation, the method comprising steps of:

  • applying a passive actor-critic reinforcement learning method to passively-collected data relating to the vehicle operation, to adapt an existing control policy so as to enable control of the vehicle by the control policy so as to perform the vehicle operation with a minimum expected cumulative cost, the step of applying a passive actor-critic reinforcement learning method to passively-collected data including steps of:

    a) in a critic network, estimating a Z-value and an average cost under an optimal control policy using samples of the passively collected data;

    b) in an actor network operatively coupled to the critic network, revising the control policy using samples of the passively collected data, the estimated Z-value, and the estimated average cost under an optimal control policy from the critic network; and

    c) iteratively repeating steps (a)-(b) until the estimated average cost converges; and

  • controlling the vehicle in accordance with the adapted control policy to perform the vehicle operation.
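
The loop in steps (a)-(c) can be illustrated with a minimal tabular sketch. It is not the patented implementation: it assumes a small discrete state space and a linearly solvable MDP formulation, in which the Z-value is a desirability function and the average cost is the negative log of a scalar estimated alongside it. The claim states none of this. The function name `passive_actor_critic`, its parameters, and the use of a next-state distribution as the "policy" are all hypothetical choices made for illustration.

```python
import numpy as np

def passive_actor_critic(transitions, state_cost, n_states, tol=1e-12, max_iters=10_000):
    """Tabular sketch: adapt a policy using only passive (x, x_next) samples.

    Assumption (not from the claim): linearly solvable MDP, where
    z_avg * Z(x) = exp(-q(x)) * E_passive[Z(x_next)].
    """
    transitions = np.asarray(transitions, dtype=int)
    state_cost = np.asarray(state_cost, dtype=float)
    x, x_next = transitions[:, 0], transitions[:, 1]

    visits = np.bincount(x, minlength=n_states)
    if np.any(visits == 0):
        raise ValueError("every state must appear as a source state in the data")

    z = np.ones(n_states)  # Z-value estimate, one entry per state
    policy = np.full((n_states, n_states), 1.0 / n_states)  # existing policy
    avg_cost = np.inf

    for _ in range(max_iters):
        # (a) Critic: per-state sample mean of exp(-q(x)) * Z(x_next).
        backed_up = np.bincount(
            x, weights=np.exp(-state_cost[x]) * z[x_next], minlength=n_states
        ) / visits
        z_avg = backed_up.sum() / z.sum()   # estimate of exp(-average cost)
        z = backed_up / backed_up.sum()     # renormalise so Z stays bounded
        new_avg_cost = -np.log(z_avg)

        # (b) Actor: reweight each sampled transition by Z of its successor.
        weights = np.zeros((n_states, n_states))
        np.add.at(weights, (x, x_next), z[x_next])
        policy = weights / weights.sum(axis=1, keepdims=True)

        # (c) Repeat until the estimated average cost stops changing.
        converged = abs(new_avg_cost - avg_cost) < tol
        avg_cost = new_avg_cost
        if converged:
            break

    return policy, z, avg_cost
```

Here `transitions` is a list of `(state, next_state)` index pairs logged without any exploratory control, and `state_cost` holds one cost per state. Each row of the returned `policy` is a distribution over next states that shifts probability toward successors with higher estimated Z-value. Turning that into actual vehicle commands, as in the final step of the claim, is outside the scope of this sketch.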
