Risk Quantification for Policy Deployment

US 20160148251A1
Filed: 11/24/2014
Published: 05/26/2016
Est. Priority Date: 11/24/2014
Status: Abandoned Application

Abstract

Risk quantification, policy search, and automated safe policy deployment techniques are described. In one or more implementations, the techniques determine the safety of a policy, for example by expressing a level of confidence that a new policy will exhibit an increased measure of performance (e.g., interactions or conversions) over a currently deployed policy. To make this determination, reinforcement learning and concentration inequalities are used to generate and bound confidence values regarding the measure of performance of the policy, and thus provide a statistical guarantee of this performance. These techniques are usable to quantify the risk in deploying a policy; to select a policy for deployment based on its estimated performance and a confidence level in that estimate (which may include use of a policy space to reduce the amount of data processed); to create a new policy through iteration, in which parameters of a policy are adjusted and the effect of those adjustments is evaluated; and so forth.
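The safety test described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's implementation: it assumes per-episode importance sampling as the off-policy estimator and Hoeffding's inequality as the concentration inequality, and it assumes the importance-weighted returns are known to lie in [0, upper]. The function names and the `upper` parameter are hypothetical.

```python
import math
from typing import Sequence, List


def importance_weighted_returns(
    returns: Sequence[float],
    new_probs: Sequence[float],
    old_probs: Sequence[float],
) -> List[float]:
    """Reweight each logged return by the probability ratio new/deployed.

    old_probs[i] is the probability the deployed policy assigned to the
    logged trajectory i; new_probs[i] is the candidate policy's probability.
    """
    return [r * (pn / po) for r, pn, po in zip(returns, new_probs, old_probs)]


def hoeffding_lower_bound(samples: Sequence[float], upper: float, delta: float) -> float:
    """Lower bound on the true mean that holds with probability >= 1 - delta.

    Assumes (hypothetically) that every sample lies in [0, upper].
    """
    n = len(samples)
    mean = sum(samples) / n
    return mean - upper * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def is_safe_to_deploy(
    samples: Sequence[float], upper: float, delta: float, threshold: float
) -> bool:
    """Deploy only if the high-confidence lower bound clears the threshold."""
    return hoeffding_lower_bound(samples, upper, delta) >= threshold
```

Here `delta` plays the role of one minus the confidence level, and `threshold` would be derived from the measured performance of the deployed policy.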

20 Claims

1. In a digital medium environment for identifying and deploying potential digital advertising campaigns, where campaigns can be altered, removed, or replaced on demand, a method for optimizing campaign selection in the digital medium environment, the method comprising:
- receiving a policy at one or more computing devices, the policy configured for deployment by a content provider to select advertisements; and
  
  controlling deployment, by the one or more computing devices, of the received policy by the content provider based at least in part on a quantification of risk that is likely involved in the deployment of the received policy as opposed to a deployed policy of the content provider, the controlling including:
  
  applying reinforcement learning and a concentration inequality on deployment data that describes the deployment of the deployed policy by the content provider to estimate values of a measure of performance of the received policy and to quantify the risk by calculating one or more statistical guarantees of the estimated values; and
  
  causing deployment of the received policy responsive to a determination that the one or more statistical guarantees express at least a confidence level that the estimated values of the measure of performance at least correspond to a threshold that is based at least in part on a measure of performance of the deployed policy by the content provider.
  - 2. A method as described in claim 1, wherein the threshold is based at least in part on the measured performance of the deployed policy and a set margin.
  - 3. A method as described in claim 2, wherein the threshold is set such that the estimated values of the received policy exhibit an improvement in the measurement of performance over the deployed policy.
  - 4. A method as described in claim 1, wherein the confidence level and the threshold are user definable via interaction with a user interface of the one or more computing devices.
  - 5. A method as described in claim 1, wherein the concentration inequality is configured to move estimated values above a defined threshold to lie on the defined threshold.
  - 6. A method as described in claim 1, wherein the concentration inequality is configured to be independent of a range of random variables of the estimated values.
  - 7. A method as described in claim 1, wherein the concentration inequality is configured to collapse tails of random variable distributions of the estimated values, normalize the random variable distributions, and then generate a lower-bound from which a lower-bound on a uniform mean of original random variables of the estimated values is extracted.
  - 8. A method as described in claim 1, wherein the policy is configured for use by the content provider to select advertisements for inclusion with content based at least in part on characteristics associated with a request to access the content.
  - 9. A method as described in claim 8, wherein the characteristics associated with the request include characteristics of a user or device that initiated the request or characteristics of the request itself.
  - 10. A method as described in claim 8, wherein the characteristics are expressed using a feature vector.
  - 11. A method as described in claim 1, wherein received deployment data does not describe deployment of the received policy by the one or more entities.
  - 12. A method as described in claim 1, wherein received deployment data also describes deployment of the received policy.
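One way to picture claims 2 and 5 together is the sketch below. It is a simplified stand-in, not the inequality recited in claims 6 and 7: it clips samples at a cutoff and then applies Hoeffding's inequality, which still depends on the cutoff as a range. It assumes the samples are non-negative (so clipping can only lower the mean, and the bound remains valid for the original mean) and that the cutoff is fixed in advance rather than tuned on the same samples. The names `truncated_lower_bound`, `deployment_threshold`, and `cutoff` are hypothetical.

```python
import math
from typing import Sequence


def truncated_lower_bound(samples: Sequence[float], cutoff: float, delta: float) -> float:
    """High-confidence lower bound on the mean of non-negative samples.

    Values above `cutoff` are moved down to lie on `cutoff`, which tames
    heavy upper tails. Because clipping never raises a non-negative value,
    a lower bound on the clipped mean is also a lower bound on the
    original mean. Hoeffding's inequality is an assumption swapped in here.
    """
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    if any(x < 0 for x in samples):
        raise ValueError("samples must be non-negative")
    clipped = [min(x, cutoff) for x in samples]
    n = len(clipped)
    mean = sum(clipped) / n
    return mean - cutoff * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def deployment_threshold(deployed_performance: float, margin: float) -> float:
    """Threshold built from the deployed policy's performance plus a set margin."""
    return deployed_performance + margin


# Usage: deploy only if the bound on the received policy clears the threshold.
samples = [0.0, 1.0, 2.0, 50.0] * 100
bound = truncated_lower_bound(samples, cutoff=2.0, delta=0.05)
deploy = bound >= deployment_threshold(deployed_performance=1.0, margin=0.1)
```

A positive margin corresponds to requiring an improvement over the deployed policy, as in claim 3.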

13. A system comprising:
- one or more computing devices configured to perform operations including controlling deployment of a received policy based at least in part on a quantification of risk that is likely involved in the deployment of the received policy as opposed to a deployed policy, the controlling including:
  
  using reinforcement learning and a concentration inequality on deployment data that describes the deployment of the deployed policy to estimate values of a measure of performance of the received policy and to quantify the risk by calculating one or more statistical guarantees regarding the estimated values; and
  
  causing replacement of the deployment of the deployed policy with the received policy responsive to a determination that the one or more statistical guarantees express at least a confidence level that the estimated values of the measure of performance at least correspond to a threshold that is based at least in part on a measure of performance of the deployed policy.
  - 14. A system as described in claim 13, wherein the threshold is based at least in part on the measured performance of the deployed policy and a set margin.
  - 15. A system as described in claim 14, wherein the threshold is set such that the estimated values of the received policy exhibit an improvement in the measurement of performance over the deployed policy.
  - 16. A system as described in claim 13, wherein the confidence level and the threshold are user definable via interaction with a user interface of the one or more computing devices.

17. A content provider comprising one or more computing devices configured to perform operations including:
- deploying a policy to select advertisements to be included with content based on one or more characteristics associated with a request for the content; and
  
  replacing the deployed policy with another policy that is selected through use of reinforcement learning and a concentration inequality to process deployment data and determine that the one or more statistical guarantees express at least a confidence level that estimated values of a measure of performance of the received policy at least correspond to a threshold that is based at least in part on a measure of performance of the deployed policy.
  - 18. A content provider as described in claim 17, wherein the threshold is based at least in part on the measure of performance of the deployed policy and a set margin.
  - 19. A content provider as described in claim 17, wherein the threshold is set such that the estimated values of the received policy exhibit an improvement in the measurement of performance over the deployed policy.
  - 20. A content provider as described in claim 17, wherein the confidence level and the threshold are user definable via interaction with a user interface of the one or more computing devices.
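As a rough illustration of a policy that selects advertisements from characteristics of a request, the sketch below maps a feature vector to a probability for each advertisement and samples one. The softmax-over-linear-scores form is an assumption chosen for illustration; the claims do not specify how a policy is parameterized. The names `ad_probabilities`, `select_ad`, and `weights` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def ad_probabilities(weights: Sequence[Sequence[float]], features: Sequence[float]) -> List[float]:
    """Softmax over one linear score per advertisement (assumed policy form).

    weights[a] is the hypothetical parameter vector for advertisement a;
    features is the feature vector describing the request.
    """
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_ad(
    weights: Sequence[Sequence[float]],
    features: Sequence[float],
    rng: random.Random,
) -> Tuple[int, float]:
    """Sample an advertisement index and return it with its probability.

    Logging the probability alongside the choice is what would let a later
    off-policy evaluation reweight this decision for a different policy.
    """
    probs = ad_probabilities(weights, features)
    ad = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return ad, probs[ad]


# Usage with two advertisements and a two-element feature vector.
uniform = ad_probabilities([[0.0, 0.0], [0.0, 0.0]], [1.0, 2.0])
chosen, prob = select_ad([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], random.Random(0))
```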

Current Assignee: Adobe Inc.
Original Assignee: Adobe Systems Incorporated (Adobe Inc.)
Inventors: Thomas, Philip S.; Theocharous, Georgios; Ghavamzadeh, Mohammad

Application Number: US14/552,047
Publication Number: US 20160148251A1
US Class Current: 1/1
CPC Class Codes:
G06N 20/00   Machine learning
G06N 3/006   based on simulated virtual ...
G06N 7/01   Probabilistic graphical mod...
G06Q 30/0244   Optimization
