System and method for sequential decision making for customer relationship management
Abstract
A system and method for sequential decision-making for customer relationship management includes providing customer data including stimulus-response history data, and automatically generating actionable rules based on the customer data. Further, automatically generating actionable rules may include estimating a value function using reinforcement learning.
17 Claims
1. A computer-implemented method for sequential decision making for customer relationship management, comprising:
- providing customer data comprising stimulus-response history data for a plurality of customers, said stimulus-response history data being derived from event data for said customers;
- in a processor, automatically generating actionable rules for optimizing a sequence of decisions over a period of time based on said stimulus-response history data;
- estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions; and
- transforming an output of the value function estimation into said actionable rules, wherein the estimating of the value function comprises:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for the plurality of customers used as training data; and
- iteratively applying a regression model to the training data, which comprises sequences of states, actions, and rewards resulting for said plurality of customers, and updating in each iteration a target reward value for each state-action pair.
- Dependent claims 2-15 (not shown).
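The value-function estimation recited above — iteratively applying a regression model to sequences of states, actions, and rewards, updating a target reward value for each state-action pair — can be illustrated with a minimal sketch in the style of batch Q-learning with linear function approximation. This is an illustrative reading of the claim, not the patent's own implementation; all names (`features`, `estimate_value_function`) and the choice of a least-squares linear regressor are assumptions.

```python
import numpy as np

def features(s, a, n_actions):
    """Represent a state-action pair as state features plus a one-hot action (illustrative)."""
    one_hot = np.zeros(n_actions)
    one_hot[a] = 1.0
    return np.concatenate([np.asarray(s, dtype=float), one_hot, [1.0]])  # trailing 1.0 = bias

def estimate_value_function(episodes, n_actions, gamma=0.9, n_iterations=10):
    """episodes: one trajectory per customer, each a list of (state, action, reward) tuples.

    Returns a weight vector w such that Q(s, a) ~= features(s, a) @ w.
    """
    # Flatten customer trajectories into (s, a, r, s') transitions; terminal s' is None.
    transitions = []
    for ep in episodes:
        for t, (s, a, r) in enumerate(ep):
            s_next = np.asarray(ep[t + 1][0], dtype=float) if t + 1 < len(ep) else None
            transitions.append((np.asarray(s, dtype=float), a, float(r), s_next))

    X = np.array([features(s, a, n_actions) for s, a, _, _ in transitions])
    w = np.zeros(X.shape[1])
    for _ in range(n_iterations):
        # Update the target reward value for each state-action pair:
        # immediate reward plus discounted best estimated future value.
        y = []
        for s, a, r, s_next in transitions:
            if s_next is None:
                y.append(r)
            else:
                q_next = max(features(s_next, b, n_actions) @ w for b in range(n_actions))
                y.append(r + gamma * q_next)
        # Re-fit the regression model to the updated targets (least squares here).
        w, *_ = np.linalg.lstsq(X, np.array(y), rcond=None)
    return w
```

The iteration structure, not the particular regressor, is the point: any regression model could be re-fit to the updated targets in each pass, consistent with the claim's "iteratively applying a regression model."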
16. A system for generating targeted marketing rules for a customer, said system comprising:
- a transforming unit for transforming customer transaction data to create derived features, said transaction data comprising data for a plurality of customers;
- a data development unit for using said derived features to develop current customer profile data and combined historical customer profile and stimulus-response data;
- a processor including a data mining unit for performing data mining on the combined data to develop a stimulus-response model;
- a stimulus optimization unit for performing stimulus optimization using said combined historical customer profile and stimulus-response data and said stimulus-response model with business rules; and
- a rule generator for generating customer relationship management (CRM) rules by performing data mining on said combined data and said stimulus optimization, wherein said stimulus optimization comprises estimating a value function using batch reinforcement learning with function approximation, the estimating of the value function comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response data for the plurality of customers used as training data; and
- iteratively applying a regression model to the training data, which comprises sequences of states, actions, and rewards resulting for said plurality of customers, and updating in each iteration a target reward value for each state-action pair.
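The rule generator recited above turns value-function estimates into actionable CRM rules constrained by business rules. One hypothetical reading (all names and the if-then rule format here are illustrative, not from the patent) is greedy policy extraction: for each customer segment, emit the business-rule-permitted action with the highest estimated long-term value.

```python
def generate_crm_rules(q_values, segments, allowed_actions):
    """Sketch of rule generation from a value function (illustrative names).

    q_values: dict mapping (segment, action) -> estimated long-term value.
    segments: iterable of customer-segment labels.
    allowed_actions: dict mapping segment -> actions the business rules permit.
    """
    rules = []
    for seg in segments:
        # Greedy policy extraction: among the actions the business rules allow,
        # pick the one with the highest estimated value for this segment.
        best = max(allowed_actions[seg], key=lambda a: q_values[(seg, a)])
        rules.append(f"IF customer_segment == '{seg}' THEN action = '{best}'")
    return rules
```

Restricting the argmax to `allowed_actions` is one way the claim's "stimulus optimization ... with business rules" could be realized: the business rules act as hard constraints on the optimized policy.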
17. A programmable storage medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a method of sequential decision-making for customer relationship management, said method comprising:
- providing customer data comprising stimulus-response history data for a plurality of customers, said stimulus-response history data being derived from event data for said customers;
- automatically generating actionable rules for optimizing a sequence of decisions over a period of time based on said stimulus-response history data;
- estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions; and
- transforming an output of the value function estimation into said actionable rules, wherein the estimating of the value function comprises:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for the plurality of customers used as training data; and
- iteratively applying a regression model to the training data, which comprises sequences of states, actions, and rewards resulting for said plurality of customers, and updating in each iteration a target reward value for each state-action pair.
Specification