
METHOD AND APPARATUS FOR IMPROVED REWARD-BASED LEARNING USING ADAPTIVE DISTANCE METRICS

  • US 20090099985A1
  • Filed: 10/11/2007
  • Published: 04/16/2009
  • Est. Priority Date: 10/11/2007
  • Status: Active Grant
First Claim

1. A method for learning a management policy, comprising:

    receiving a set of one or more exemplars, where each of the exemplars comprises at least a (state, action) pair for a system;

    initializing a distance metric, where the distance metric computes a distance between pairs of exemplars;

    initializing a function approximator;

    adjusting the distance metric such that a Bellman error measure of the function approximator on the set of exemplars is minimized; and

    deriving the management policy from the function approximator.
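
The steps of claim 1 can be illustrated with a small sketch. This is not the patented implementation. It assumes a toy one-dimensional system, exemplars extended to hypothetical (state, action, reward, next state) tuples, a diagonally weighted squared Euclidean distance, a kernel-regression function approximator whose stored values are fixed to the observed rewards, and a simple coordinate search in place of whatever optimizer the patent contemplates. All names (`distance`, `q_value`, `bellman_error`, `adjust_metric`, `policy`) and constants are illustrative.

```python
import math

GAMMA = 0.9           # assumed discount factor
ACTIONS = (0.0, 1.0)  # assumed action set: 0 = move left, 1 = move right

# Hypothetical exemplars: (state, action, reward, next_state).
exemplars = [
    (0.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.5),
    (0.5, 0.0, 0.0, 0.0), (0.5, 1.0, 1.0, 1.0),
    (1.0, 0.0, 0.0, 0.5), (1.0, 1.0, 1.0, 1.0),
]
# Assumption: the approximator's stored values start as the observed rewards.
q_targets = [r for _, _, r, _ in exemplars]

def distance(w, x, y):
    """Weighted squared Euclidean distance between two (state, action) points."""
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))

def q_value(w, point):
    """Kernel-regression estimate of Q at `point` under metric weights `w`."""
    kernels = [math.exp(-distance(w, (s, a), point)) for s, a, _, _ in exemplars]
    total = sum(kernels) + 1e-12  # guard against division by zero
    return sum(k * q for k, q in zip(kernels, q_targets)) / total

def bellman_error(w):
    """Mean squared Bellman residual of the approximator over the exemplars."""
    err = 0.0
    for s, a, r, s_next in exemplars:
        best_next = max(q_value(w, (s_next, b)) for b in ACTIONS)
        err += (q_value(w, (s, a)) - (r + GAMMA * best_next)) ** 2
    return err / len(exemplars)

def adjust_metric(w, sweeps=20, factor=1.5):
    """Coordinate search: rescale one weight at a time, keep improvements only."""
    w = list(w)
    best = bellman_error(w)
    for _ in range(sweeps):
        for i in range(len(w)):
            for candidate in (w[i] * factor, w[i] / factor):
                trial = w[:i] + [candidate] + w[i + 1:]
                err = bellman_error(trial)
                if err < best:
                    w, best = trial, err
    return w, best

def policy(w, state):
    """Greedy policy: pick the action with the highest estimated Q."""
    return max(ACTIONS, key=lambda a: q_value(w, (state, a)))

initial_w = [1.0, 1.0]                    # initialize the distance metric
initial_err = bellman_error(initial_w)
w, final_err = adjust_metric(initial_w)   # adapt the metric
print("weights:", w)
print("Bellman error:", initial_err, "->", final_err)
print("action at state 0.5:", policy(w, 0.5))
```

Because the search accepts a rescaled weight only when the Bellman error strictly decreases, the error after adjustment cannot exceed the initial error. A fuller treatment would also update the stored values, not only the metric.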
