Smoothed Sarsa: reinforcement learning for robot delivery tasks

  • US 8,326,780 B2
  • Filed: 10/13/2009
  • Issued: 12/04/2012
  • Est. Priority Date: 10/14/2008
  • Status: Active Grant
First Claim

1. A computer-implemented method for learning a policy for performing a task by a computing system, the method comprising the steps of:

    determining, by the computing system, a first state associated with a first time interval;

    determining, by the computing system, a subsequent state associated with a subsequent time interval;

    determining, by the computing system, a first action from the first state using the policy, which comprises a plurality of weights, properties of one or more actions and properties of one or more states;

    determining, by the computing system, a subsequent action from the subsequent state using the policy;

    determining, by the computing system, a reward value associated with a combination of the first state and the first action;

    storing, by the computing system, a state description including the first state, the first action, the subsequent state, the subsequent action and the reward value in a non-transitory computer-readable storage medium;

    responsive to a time delay between the first time interval and a current time interval associated with a current state exceeding a delay threshold value or a variance associated with the subsequent state stored in the state description not exceeding a variance threshold value at the current state, calculating, by the computing system, a backup target from the state description;

    modifying, by the computing system, one or more weights of the plurality of weights responsive to the backup target; and

    deleting, by the computing system, the state description from the non-transitory computer-readable storage medium.
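
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a linear action-value function over feature vectors and the conventional Sarsa backup target, reward + gamma * Q(subsequent state, subsequent action); the claim itself does not give a formula for the target. The names `StateDescription`, `q_value`, `process_pending`, `alpha`, and `gamma` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StateDescription:
    """One stored transition, as listed in the claim (field names are hypothetical)."""
    t: int                   # time interval of the first state
    phi: List[float]         # features of (first state, first action)
    phi_next: List[float]    # features of (subsequent state, subsequent action)
    reward: float            # reward for (first state, first action)
    next_variance: float     # current variance of the subsequent-state estimate


def q_value(weights: List[float], phi: List[float]) -> float:
    """Assumed linear action-value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, phi))


def process_pending(weights, pending, now, delay_threshold, variance_threshold,
                    alpha=0.1, gamma=0.9):
    """Back up, then delete, every stored description whose trigger holds.

    Trigger (from the claim): the delay since the first time interval exceeds
    the delay threshold, OR the stored subsequent state's variance does not
    exceed the variance threshold.
    """
    remaining = []
    for d in pending:
        ready = (now - d.t > delay_threshold) or (d.next_variance <= variance_threshold)
        if not ready:
            remaining.append(d)  # keep waiting for a better state estimate
            continue
        # Assumed Sarsa backup target; the claim does not specify the formula.
        target = d.reward + gamma * q_value(weights, d.phi_next)
        error = target - q_value(weights, d.phi)
        for i, f in enumerate(d.phi):
            weights[i] += alpha * error * f  # modify the policy weights
    pending[:] = remaining  # processed descriptions are deleted from storage
    return weights
```

Calling `process_pending` once per time interval, after appending the newest `StateDescription` to `pending`, defers each update until the subsequent state is known with low variance or the delay threshold forces the backup.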
