System and method for sequential decision making for customer relationship management
First Claim
1. A method for sequential decision making for customer relationship management, comprising:
- providing customer data comprising stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers;
- automatically generating actionable rules for optimizing a sequence of decisions over a period of time based on said stimulus-response history data;
- estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating a value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.
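The estimation step recited above, iteratively applying a regression model and updating a target reward value for each state-action pair, is the general shape of batch reinforcement learning in the style of fitted Q-iteration. The following is a minimal sketch under assumed conventions (linear regression as the regression model, integer action ids, and hypothetical function and variable names not taken from the patent):

```python
import numpy as np

def fit_value_function(episodes, actions_available, n_iterations=10, gamma=0.9):
    """Sketch of batch RL with function approximation (fitted Q-iteration style).

    episodes: list of (state, action, reward, next_state) tuples derived from
    stimulus-response history data; state/next_state are feature vectors.
    Returns a weight vector w such that Q(s, a) is approximated by phi(s, a) @ w.
    """
    def phi(state, action):
        # Value function as a function of state features and the action:
        # state features plus the action id and a bias term.
        return np.concatenate([state, [action, 1.0]])

    X = np.array([phi(s, a) for s, a, r, ns in episodes])
    rewards = np.array([r for s, a, r, ns in episodes])
    next_states = [ns for s, a, r, ns in episodes]

    # Iteration 0: regress the immediate reward on the state-action features.
    w, *_ = np.linalg.lstsq(X, rewards, rcond=None)
    for _ in range(n_iterations):
        # Update the target reward value for each state-action pair:
        # immediate reward plus the discounted best achievable next-state value.
        next_best = np.array([
            max(phi(ns, a) @ w for a in actions_available) for ns in next_states
        ])
        targets = rewards + gamma * next_best
        w, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return w
```

A richer regression model (e.g. a tree ensemble) could be swapped in without changing the iteration structure; the claims do not fix a particular regression model.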
Abstract
A system and method for sequential decision-making for customer relationship management includes providing customer data including stimulus-response history data, and automatically generating actionable rules based on the customer data. Further, automatically generating actionable rules may include estimating a value function using reinforcement learning.
61 Citations
31 Claims
1. (Set forth above as the First Claim.) Dependent claims: 2-21, 30, 31.
22. A method of sequential targeted marketing for customer relationship management, comprising:
- preparing customer data comprising stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers; and
- automatically generating actionable rules using said stimulus-response history data to output instance-in-time targeting rules for optimizing a sequence of decisions over a period of time, so as to approximately maximize expected cumulative profits over time;
- estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating said value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.
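The final limitation shared by these claims, transforming the value-function output into actionable rules, amounts to selecting, for a given set of customer feature values, the action with the approximately maximum estimated value. A minimal sketch, assuming a fitted state-action value function `q` (all names here are hypothetical illustrations):

```python
def action_rule(q, actions_available):
    """Turn an estimated value function into an instance-in-time targeting rule.

    q: callable mapping (state_features, action) to an estimated value.
    Returns a rule mapping a customer's feature values to the action whose
    estimated value is (approximately) maximal for those features.
    """
    def rule(state_features):
        return max(actions_available, key=lambda a: q(state_features, a))
    return rule
```

Given a customer's current feature vector, the rule simply evaluates the estimated value of every available action and takes the maximizer.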
23. A method for sequential decision making for customer relationship management, comprising:
- providing a database of customer data comprising stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers, from a plurality of channels;
- integrating said customer data; and
- automatically generating actionable channel-specific targeting rules for optimizing a sequence of decisions over a period of time based on said stimulus-response history data by estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating said value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.

Dependent claim: 24.
25. A system for sequential decision making for customer relationship management, comprising:
- a database for storing customer data comprising stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers; and
- a processor for automatically generating actionable rules for optimizing a sequence of decisions over a period of time based on said stimulus-response history data by estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating said value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.

Dependent claim: 26.
27. A system for sequential decision making for customer relationship management, comprising:
- a data preparation device for preparing customer data comprising stimulus-response history data;
- a value estimator for estimating a value function based on said stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers; and
- a rule transformer for generating actionable rules for optimizing a sequence of decisions over a period of time based on said value function by estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating said value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.
28. A system for sequential cost-sensitive decision making for customer relationship management, comprising:
- a customer transaction cache for storing customer transaction data comprising stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers;
- a customer profile cache for receiving an output of said customer transaction cache and storing current customer profile data; and
- a customer relationship management system, for receiving an output of said customer profile cache and customer relationship management rules for optimizing a sequence of decisions over a period of time, wherein said customer relationship management rules are automatically generated based on said stimulus-response history data by estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating said value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.
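The component arrangement recited in claim 28, a transaction cache feeding a profile cache whose output is scored against the generated rules, can be sketched as a simple data-flow pipeline. All class, method, and field names below are hypothetical illustrations, not taken from the patent:

```python
from collections import defaultdict

class CustomerTransactionCache:
    """Stores stimulus-response event data per customer."""
    def __init__(self):
        self.events = defaultdict(list)

    def record(self, customer_id, event):
        self.events[customer_id].append(event)

class CustomerProfileCache:
    """Derives current profile feature values from the transaction cache output."""
    def __init__(self, transactions):
        self.transactions = transactions

    def profile(self, customer_id):
        events = self.transactions.events[customer_id]
        # Hypothetical features: event count and total observed response value.
        return [len(events), sum(e.get("response", 0.0) for e in events)]

class CRMSystem:
    """Applies the automatically generated rules to the current profile."""
    def __init__(self, profiles, rule):
        self.profiles = profiles
        self.rule = rule  # maps a customer's feature values to the action to take

    def next_action(self, customer_id):
        return self.rule(self.profiles.profile(customer_id))
```

The rule passed to the CRM system here would be the output of the value-function estimation and rule-transformation steps recited in the claims.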
29. A programmable storage medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a method for sequential decision making for customer relationship management, said method comprising:
- providing customer data comprising stimulus-response history data for a population of customers, said stimulus-response history data being derived from event data for said customers; and
- automatically generating actionable rules for optimizing a sequence of decisions over a period of time based on said stimulus-response history data by estimating a value function using batch reinforcement learning with function approximation, said function approximation representing the value function as a function of state features and actions, and said estimating a value function using batch reinforcement learning with function approximation comprising:
- estimating a function approximation of the value function of a Markov Decision Process underlying said stimulus-response history data for said population of customers; and
- iteratively applying a regression model to training data comprising sequences of states, actions and rewards resulting for said population of customers, and updating in each iteration a target reward value for each state-action pair; and
- transforming an output of a value function estimation into said actionable rules, the rules specifying what actions to take given a set of feature values corresponding to a customer, and the action taken corresponding to an action having an approximate maximum value according to said value function for the given set of feature values.
Specification