
Online temporal difference learning from incomplete customer interaction histories

  • US 8,914,314 B2
  • Filed: 10/18/2012
  • Issued: 12/16/2014
  • Est. Priority Date: 09/28/2011
  • Status: Active Grant
First Claim

1. A computer implemented method, comprising:

  • obtaining an indication that a decision has been requested, selected, or applied with respect to one or more users; and

    after obtaining the indication that a decision that has been requested, selected, or applied, updating a time dependent value function, including performing or providing one or more updates to the time dependent value function, wherein a time at which each of the one or more updates is performed or provided is independent of activity of the one or more users;

    wherein the time dependent value function approximates an expected reward as a function of one or more time based variables corresponding to one or more time based values, the expected reward being associated with the one or more users, at least one of the time based variables indicates an elapsed time since a prior or last user event pertaining to at least one of the one or more users, the one or more updates to the value function indicate update(s) to one or more weights associated with one or more parameters of the time dependent value function, and the update(s) to the one or more weights include a modification or replacement value for each of the one or more weights.
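The mechanism recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a linear value function over two hypothetical features of the elapsed time since the last user event, a standard TD(0) update rule, and a fixed tick schedule that stands in for update times that are "independent of activity of the one or more users". The names `features`, `td_update`, and `run_scheduled_updates`, the decay constant, the learning rate `alpha`, and the discount `gamma` are all illustrative assumptions and do not appear in the patent text.

```python
import math


def features(elapsed):
    """Hypothetical feature map: a bias term plus an exponentially
    decayed function of elapsed time since the last user event."""
    return [1.0, math.exp(-elapsed / 10.0)]


def value(weights, elapsed):
    """Linear approximation of expected reward as a function of elapsed time."""
    return sum(w * f for w, f in zip(weights, features(elapsed)))


def td_update(weights, elapsed_prev, elapsed_next, reward, alpha=0.1, gamma=0.9):
    """One TD(0) step (an assumed update rule).

    Returns a replacement value for every weight."""
    phi = features(elapsed_prev)
    td_error = (reward
                + gamma * value(weights, elapsed_next)
                - value(weights, elapsed_prev))
    return [w + alpha * td_error * f for w, f in zip(weights, phi)]


def run_scheduled_updates(weights, last_event_time, tick_times, rewards_by_tick):
    """Apply one update per tick. The tick schedule is fixed in advance,
    so update times do not depend on whether the user did anything.
    A tick with no observed reward is treated as reward 0.0 (assumption)."""
    prev = last_event_time
    for t in tick_times:
        reward = rewards_by_tick.get(t, 0.0)
        weights = td_update(weights,
                            prev - last_event_time,
                            t - last_event_time,
                            reward)
        prev = t
    return weights


if __name__ == "__main__":
    # A decision is applied at time 0; a reward of 1.0 is observed at tick 2.
    w = run_scheduled_updates([0.0, 0.0], 0.0, [1.0, 2.0, 3.0], {2.0: 1.0})
    print(w)
```

Driving the updates from a clock rather than from user events means the weights can still change during periods of user inactivity, which is how this sketch copes with an incomplete interaction history.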
