Compliance model training to classify landing page content that violates content item distribution guidelines

US 8,788,442 B1
Filed: 08/01/2011
Issued: 07/22/2014
Est. Priority Date: 12/30/2010
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for analyzing content item compliance with specified guidelines. In one aspect, a method includes receiving training data that specify manual classifications of content items and feature values for each of the content items, where each manual classification specifies whether the content item is a violating content item. Using the training data, a compliance model is trained to classify an unclassified content item as a violating content item based on the feature values of the unclassified content item. A determination is made that the compliance model has an accuracy measure that meets a threshold accuracy measure. In response to determining that the accuracy measure for the compliance model meets the accuracy threshold, unclassified content items are classified using the feature values for the unclassified content items, and data specifying the classifications are provided.
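
The flow described in the abstract (train on manually classified items, check an accuracy measure against a threshold, and only then classify unclassified items) can be sketched as follows. This is a minimal illustration, not the patented implementation: the nearest-centroid classifier, the function names, the two-element feature vectors, and the 0.9 threshold are all assumptions chosen to keep the example small.

```python
from statistics import mean

def train_centroids(examples):
    """Train a toy compliance model: one mean feature vector per class.

    examples: list of (feature_values, is_violating) pairs, standing in for
    the manually classified training data. The model type is an assumption.
    """
    by_label = {True: [], False: []}
    for features, is_violating in examples:
        by_label[is_violating].append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}

def predict(model, features):
    """Return True (violating) if the item is closer to the violating centroid."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return sq_dist(model[True]) < sq_dist(model[False])

def accuracy(model, examples):
    """Fraction of held-out manual classifications the model reproduces."""
    return sum(predict(model, f) == y for f, y in examples) / len(examples)

def classify_if_accurate(train, holdout, unclassified, threshold=0.9):
    """Classify unclassified items only if the accuracy gate is met.

    Returns None when the model misses the (assumed) threshold.
    """
    model = train_centroids(train)
    if accuracy(model, holdout) < threshold:
        return None
    return [predict(model, f) for f in unclassified]
```

Here each feature vector might hold, for example, a count of flagged terms in the content item and a count on its landing page; both features are hypothetical.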

21 Claims

1. A method performed by data processing apparatus having memory, a processor, and code stored in the memory and executed in the processor, the method comprising:
- receiving by the processor training data that specify manual classifications of content items and feature values for each of the content items, the manual classification for each of the content items specifying whether the content item is a violating content item that violates content item distribution guidelines, the feature values specifying one or more characteristics of the content items and characteristics of landing pages to which the content items link;
  
  training by the processor a compliance model using the training data, the compliance model being trained to classify an unclassified content item as a violating content item based on the feature values of the unclassified content item;
  
  determining by the processor that the compliance model has an accuracy measure that meets a threshold accuracy measure;
  
  in response to determining that the accuracy measure for the compliance model meets the accuracy threshold, classifying by the processor unclassified content items using the feature values for the unclassified content items; and
  
  providing by the processor data specifying the classifications of the unclassified content items;
  
  wherein classifying unclassified content items comprises classifying at least one unclassified content item as a suspicious content item, and further comprising:
  
  providing the suspicious content item to a rater for manual classification;
  
  receiving data specifying a manual classification of the suspicious content item; and
  
  updating the compliance model using the manual classification and the feature values of the suspicious content item.
  - 2. The method of claim 1, wherein receiving data specifying a manual classification comprises receiving data specifying that the suspicious content item is a violating content item or a complying content item.
  - 3. The method of claim 1, further comprising:
    - determining an updated accuracy measure for the updated compliance model;
      
      determining that the updated accuracy measure meets the accuracy measure for the compliance model; and
      
      in response to determining that the updated accuracy measure meets the accuracy measure, classifying content items using the updated compliance model.
  - 4. The method of claim 3, wherein:
    - determining that the compliance model has an accuracy measure that meets a threshold accuracy measure comprises:
      
      determining that the compliance model has a precision measure that meets a precision threshold; and
      
      determining that the compliance model has a recall measure that meets a recall threshold; and
      
      determining that the updated accuracy measure meets the accuracy measure for the compliance model comprises:
      
      determining that an updated precision measure for the updated compliance model meets the precision measure for the compliance model; and
      
      determining that an updated recall measure for the updated compliance model meets the recall measure for the compliance model.
  - 5. The method of claim 1, further comprising:
    - determining an updated accuracy measure for the updated compliance model;
      
      determining that the updated accuracy measure does not meet the accuracy measure for the compliance model; and
      
      in response to determining that the updated accuracy measure does not meet the accuracy measure, classifying content items using the compliance model.
  - 6. The method of claim 1, wherein determining that the compliance model has an accuracy measure that meets the threshold accuracy measure comprises determining that the compliance model has at least a minimum area under a curve measure.
  - 7. The method of claim 1, wherein determining that the compliance model has an accuracy measure that meets the threshold accuracy measure comprises determining that the compliance model has at least a minimum precision measure and at least a minimum recall measure.
  - 8. The method of claim 1, wherein training a compliance model comprises training a separate model for each of two different rules that are specified by the content distribution guidelines, each separate model being trained to classify the unclassified content item as a violating content item for one of the two different rules.
  - 9. The method of claim 1, wherein the content items are advertisements that have been submitted for distribution by an advertisement management system and the landing pages are web pages to which the advertisements redirect users that interact with the advertisements.
  - 10. The method of claim 1, further comprising preventing distribution of violating content items.
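
The precision and recall checks in claims 4 and 7, and the choice between the original and updated model in claims 3 and 5, might look like the sketch below. It assumes that "meets" means "greater than or equal to", which the claims do not define; the function names are hypothetical.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) for the 'violating' class.

    predicted, actual: equal-length sequences of booleans, where True
    means the content item is classified as violating.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def meets(measure, reference):
    """True if every component of measure is at least the reference value.

    Assumption: 'meets' is read as >=; the claims leave this open.
    """
    return all(m >= r for m, r in zip(measure, reference))

def select_model(current, updated, current_pr, updated_pr):
    """Use the updated model only if its precision and recall both meet
    the current model's measures; otherwise keep the current model."""
    return updated if meets(updated_pr, current_pr) else current
```

The same `meets` helper can gate the initial model against fixed minimums, for instance `meets(precision_recall(pred, truth), (0.8, 0.7))`, where the two minimums are illustrative values only.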

11. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising:
- receiving by the data processing apparatus training data that specify manual classifications of content items and feature values for each of the content items, the manual classification for each of the content items specifying whether the content item is a violating content item that violates content item distribution guidelines, the feature values specifying one or more characteristics of the content items and characteristics of landing pages to which the content items link;
  
  training by the data processing apparatus a compliance model using the training data, the compliance model being trained to classify an unclassified content item as a violating content item based on the feature values of the unclassified content item;
  
  determining by the data processing apparatus that the compliance model has an accuracy measure that meets a threshold accuracy measure;
  
  in response to determining that the accuracy measure for the compliance model meets the accuracy threshold, classifying by the data processing apparatus unclassified content items using the feature values for the unclassified content items;
  
  updating by the data processing apparatus the compliance model using the classifications of the unclassified content items; and
  
  preventing by the data processing apparatus violating content items from being distributed;
  
  wherein classifying unclassified content items comprises classifying at least one unclassified content item as a suspicious content item, and further comprising:
  
  providing the suspicious content item to a rater for manual classification;
  
  receiving data specifying a manual classification of the suspicious content item; and
  
  updating the compliance model using the manual classification and the feature values of the suspicious content item.
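
The rater feedback loop recited above (flag an item as suspicious, obtain a manual classification, fold it back into the training data, and block violating items) can be sketched as follows. The claims do not say how an item comes to be classified as suspicious; the score band between `low` and `high` is an assumption, as are the function names and the callable `rater`.

```python
def route(score, low=0.3, high=0.7):
    """Map a model score to a classification.

    Assumption: scores in the open band (low, high) are treated as
    suspicious and sent for manual review.
    """
    if score >= high:
        return "violating"
    if score <= low:
        return "complying"
    return "suspicious"

def review_cycle(scored_items, rater, training_data):
    """Process (item_id, feature_values, score) triples.

    Suspicious items go to the rater, and the resulting manual
    classification is appended to training_data for the next model
    update. Returns the ids of items whose distribution is prevented.
    """
    blocked = []
    for item_id, features, score in scored_items:
        label = route(score)
        if label == "suspicious":
            is_violating = rater(item_id)  # manual classification
            training_data.append((features, is_violating))
            label = "violating" if is_violating else "complying"
        if label == "violating":
            blocked.append(item_id)
    return blocked
```

Retraining on the extended `training_data` is left out here; it would reuse whatever training routine produced the original model.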

12. A system comprising:
- a data store storing training data that specify manual classifications of content items and feature values for each of the content items, the manual classification for each of the content items specifying whether the content item is a violating content item that violates content item distribution guidelines, the feature values specifying one or more characteristics of the content items and characteristics of landing pages to which the content items link; and
  
  one or more computers, each having a memory, a processor, and code stored in the memory and executable in the processor, the one or more computers being operable to interact with the data store and to cause the processor to perform operations including:
  
  receiving the training data;
  
  training a compliance model using the training data, the compliance model being trained to classify an unclassified content item as a violating content item based on the feature values of the unclassified content item;
  
  determining that the compliance model has an accuracy measure that meets a threshold accuracy measure;
  
  in response to determining that the accuracy measure for the compliance model meets the accuracy threshold, classifying unclassified content items using the feature values for the unclassified content items; and
  
  providing data specifying the classifications of the unclassified content items;
  
  wherein the one or more computers are further operable to perform operations including:
  
  classifying at least one unclassified content item as a suspicious content item, and further comprising:
  
  providing the suspicious content item to a rater for manual classification;
  
  receiving data specifying a manual classification of the suspicious content item; and
  
  updating the compliance model using the manual classification and the feature values of the suspicious content item.
  - 13. The system of claim 12, wherein the one or more computers are further operable to perform operations including receiving data specifying that the suspicious content item is a violating content item or a complying content item.
  - 14. The system of claim 12, wherein the one or more computers are further operable to perform operations including:
    - determining an updated accuracy measure for the updated compliance model;
      
      determining that the updated accuracy measure meets the accuracy measure for the compliance model; and
      
      in response to determining that the updated accuracy measure meets the accuracy measure, classifying content items using the updated compliance model.
  - 15. The system of claim 14, wherein the one or more computers are further operable to perform operations including:
    - determining that the compliance model has a precision measure that meets a precision threshold;
      
      determining that the compliance model has a recall measure that meets a recall threshold;
      
      determining that an updated precision measure for the updated compliance model meets the precision measure for the compliance model; and
      
      determining that an updated recall measure for the updated compliance model meets the recall measure for the compliance model.
  - 16. The system of claim 12, wherein the one or more computers are further operable to perform operations including:
    - determining an updated accuracy measure for the updated compliance model;
      
      determining that the updated accuracy measure does not meet the accuracy measure for the compliance model; and
      
      in response to determining that the updated accuracy measure does not meet the accuracy measure, classifying content items using the compliance model.
  - 17. The system of claim 12, wherein the one or more computers are further operable to perform operations including determining that the compliance model has at least a minimum area under a curve measure.
  - 18. The system of claim 12, wherein the one or more computers are further operable to perform operations including determining that the compliance model has at least a minimum precision measure and at least a minimum recall measure.
  - 19. The system of claim 12, wherein the one or more computers are further operable to perform operations including training a separate model for each of two different rules that are specified by the content distribution guidelines, each separate model being trained to classify the unclassified content item as a violating content item for one of the two different rules.
  - 20. The system of claim 12, wherein the content items are advertisements that have been submitted for distribution by an advertisement management system and the landing pages are web pages to which the advertisements redirect users that interact with the advertisements.
  - 21. The system of claim 12, wherein the one or more computers are further operable to perform operations including preventing distribution of violating content items.
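
Claims 6 and 17 gate the model on "at least a minimum area under a curve measure" without naming the curve. One possible reading, assumed here, is the area under the ROC curve, which can be computed as the probability that a randomly chosen violating item scores higher than a randomly chosen complying item. The function names and the 0.7 minimum are illustrative.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model scores, higher meaning more likely violating.
    labels: booleans, True for manually classified violating items.
    Ties between a violating and a complying item count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one violating and one complying item")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def meets_minimum_auc(scores, labels, minimum=0.7):
    """Accuracy gate based on an assumed minimum AUC."""
    return auc(scores, labels) >= minimum
```

The pairwise form is quadratic in the number of items, which is acceptable for a small validation set; a rank-based computation would scale better.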

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Sculley, David W. II; Spitznagel, Bridget; Paine, Zach
Primary Examiner(s): Gaffin, Jeffrey A
Assistant Examiner(s): Brown Jr, Nathan H
Application Number: US13/195,172
Time in Patent Office: 1,086 Days
Field of Search: 706/20
US Class Current: 706/20
CPC Class Codes: G06N 20/00 Machine learning