Feedback loop for spam prevention

US 7,558,832 B2
Filed: 05/02/2007
Issued: 07/07/2009
Est. Priority Date: 03/03/2003
Status: Expired due to Fees

Abstract

The subject invention provides for a feedback loop system and method that facilitate classifying items in connection with spam prevention in server and/or client-based architectures. The invention makes use of a machine-learning approach as applied to spam filters and, in particular, randomly samples incoming email messages so that examples of both legitimate and junk/spam mail are obtained to generate sets of training data. Users who are identified as spam-fighters are asked to vote on whether each of a selection of their incoming email messages is legitimate mail or junk mail. A database stores the properties for each mail and voting transaction, such as user information, message properties and content summary, and polling results for each message, to generate training data for machine learning systems. The machine learning systems facilitate creating improved spam filter(s) that are trained to recognize both legitimate mail and spam mail and to distinguish between them.
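The sampling and voting flow described in the abstract can be illustrated with a minimal sketch. Everything named here (`Message`, `PollStore`, `select_for_polling`, the `rate` parameter, and the `"spam"`/`"good"` labels) is hypothetical and chosen for illustration only; the patent does not prescribe a data model. The point shown is that messages are picked at random, independent of content, so both legitimate and junk mail end up in the training data.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str
    recipient: str
    subject: str


@dataclass
class PollStore:
    """In-memory stand-in (an assumption) for the database of polled mail and votes."""
    polled: dict = field(default_factory=dict)  # message_id -> Message
    votes: list = field(default_factory=list)   # (message_id, user, label)

    def record_vote(self, message_id, user, label):
        # Only messages that were actually polled may be voted on.
        if message_id not in self.polled:
            raise KeyError(f"message {message_id} was not polled")
        if label not in ("spam", "good"):
            raise ValueError("label must be 'spam' or 'good'")
        self.votes.append((message_id, user, label))

    def training_data(self):
        # Pair a content summary (here just the subject) with each vote.
        return [(self.polled[m].subject, label) for m, _, label in self.votes]


def select_for_polling(messages, spam_fighters, rate, rng, store):
    """Randomly tag a fraction of the messages addressed to spam fighters."""
    selected = []
    for msg in messages:
        # Selection ignores message content, so spam and legitimate mail
        # are sampled in whatever proportion they arrive.
        if msg.recipient in spam_fighters and rng.random() < rate:
            store.polled[msg.message_id] = msg
            selected.append(msg)
    return selected


rng = random.Random(0)
store = PollStore()
inbox = [Message(f"m{i}", "alice" if i % 2 == 0 else "bob", f"subject {i}")
         for i in range(10)]
polled = select_for_polling(inbox, {"alice"}, rate=1.0, rng=rng, store=store)
store.record_vote(polled[0].message_id, "alice", "spam")
```

With `rate=1.0` every message addressed to a spam fighter is polled; a real deployment would presumably use a much smaller rate.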

14 Claims

1. A cross-validation method that facilitates verifying reliability and trustworthiness of user classifications comprising:
- excluding one or more suspected user's classifications from data employed to train a spam filter, performed at a server side, based on the user incorrectly classifying email as spam or as not spam;
  
  training the spam filter using all other available user classifications;
  
  running the suspected user's polling messages through the trained spam filter to determine how it would have classified the messages compared to the suspected user's classifications;
  
  validating users trustworthiness to classify email based on their classification of incoming email as spam that is known or subsequently determined to be spam or their classification of incoming email as not spam that is known or subsequently determined to not be spam; and
  
  discounting existing and future classifications provided by users who are determined to be untrustworthy until the users are determined to be trustworthy.
  - 2. The method of claim 1, further comprising:
    - discarding existing classifications provided by users determined to be permanently untrustworthy; and
      
      removing the permanently untrustworthy users from future polling.
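The cross-validation steps of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the word-count `train_filter` is a toy stand-in for whatever machine-learning filter is actually used, and the function names, the `"spam"`/`"good"` labels, and the 0.5 agreement `threshold` are hypothetical, not taken from the patent.

```python
from collections import Counter


def train_filter(examples):
    """Train a toy word-count filter from (text, label) pairs."""
    counts = {"spam": Counter(), "good": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())

    def classify(text):
        # Score each word by how much more often it appeared in spam.
        score = sum(counts["spam"][w] - counts["good"][w]
                    for w in text.lower().split())
        return "spam" if score > 0 else "good"

    return classify


def cross_validate_user(votes, suspect):
    """Return the suspect's agreement rate with a filter trained on everyone else.

    votes is a list of (user, text, label) tuples. Returns None when the
    suspect has cast no votes.
    """
    others = [(t, l) for u, t, l in votes if u != suspect]   # exclude suspect
    held_out = [(t, l) for u, t, l in votes if u == suspect]
    if not held_out:
        return None
    classify = train_filter(others)                          # train on the rest
    agree = sum(1 for t, l in held_out if classify(t) == l)  # compare
    return agree / len(held_out)


def vote_weights(votes, suspects, threshold=0.5):
    """Give weight 0.0 to suspects whose agreement rate is below threshold."""
    weights = {u: 1.0 for u, _, _ in votes}
    for s in suspects:
        rate = cross_validate_user(votes, s)
        if rate is not None and rate < threshold:
            weights[s] = 0.0  # discounted until a later check clears the user
    return weights


votes = [
    ("alice", "cheap pills online", "spam"),
    ("alice", "meeting agenda monday", "good"),
    ("bob", "cheap watches online", "spam"),
    ("bob", "project meeting notes", "good"),
    ("mallory", "cheap pills discount", "good"),
    ("mallory", "monday meeting agenda", "spam"),
]
weights = vote_weights(votes, suspects=["mallory"])
```

Because the weights are recomputed on each call, a user who later votes consistently with the held-out filter would regain full weight, which loosely mirrors the "until the users are determined to be trustworthy" language.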

3. A computer readable storage media encoded with a computer program for a system that facilitates classifying items in connection with spam prevention, the computer program comprising:
- a component that receives a set of the items;
  
  a component that identifies intended recipients of the items, and tags a subset of the items to be polled, the subset of items corresponding to a subset of recipients that are known spam fighting users; and
  
  a feedback component that receives information relating to the spam fighting users' classification of the polled items, and employs the information in connection with training a spam filter and populating a spam list unless one or more of the spam fighting users have been determined untrustworthy.
  - 4. The system of claim 3, further comprising a component that modifies an item tagged for polling to identify it as a polling item.
  - 5. The system of claim 4, wherein the modified item comprises voting instructions and any one of at least two voting buttons and links which correspond to at least two respective classes of items facilitate classification of the item by the user.
  - 6. The system of claim 5, wherein the voting buttons correspond to respective links such that when any one of the voting buttons is selected by the user, information relating to the selected voting button, the respective user, and the item's unique ID assigned thereto is sent to a database for storage.
  - 7. The system of claim 3, wherein the items comprise at least one of:
    - electronic mail (email) and messages.
  - 8. The system of claim 3, wherein the component that receives a set of the items is any one of an email server, a message server, and client email software.
  - 9. The system of claim 3, wherein the subset of items to be polled comprises all of the items received.
  - 10. The system of claim 3, wherein the subset of recipients comprises all recipients.
  - 11. The system of claim 3, wherein the subset of recipients are randomly selected.
  - 12. The system of claim 3, wherein the subset of recipients comprises paying users of the system.
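Claims 4 through 6 describe modifying a polled item so that it carries voting instructions and links that report the vote, the user, and the item's unique ID. One way this could look is sketched below. The endpoint URL, the query parameter names, the wording of the instructions, and the function names are all assumptions made for illustration; the claims do not specify any of them.

```python
import uuid
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical vote-collection endpoint; not taken from the patent.
VOTE_ENDPOINT = "https://polling.example.com/vote"


def make_polling_item(body, user, item_id=None):
    """Wrap a message body with voting instructions and one link per class."""
    item_id = item_id or uuid.uuid4().hex  # unique ID assigned to the item
    links = {
        label: VOTE_ENDPOINT + "?" + urlencode(
            {"item": item_id, "user": user, "vote": label})
        for label in ("spam", "good")
    }
    text = (
        "Please classify this message.\n"
        f"[Junk] {links['spam']}\n"
        f"[Not junk] {links['good']}\n\n"
        + body
    )
    return item_id, links, text


def parse_vote(url):
    """Recover the record that would be stored when a vote link is followed."""
    query = parse_qs(urlparse(url).query)
    return {key: query[key][0] for key in ("item", "user", "vote")}


item_id, links, text = make_polling_item(
    "Hello, see you Monday.", "alice@example.com", item_id="abc123")
record = parse_vote(links["spam"])
```

Encoding the user and item ID in the link itself means the server can store a complete voting record from a single request, at the cost of exposing those values in the URL.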

13. A server implementing a system that facilitates classifying items in connection with spam prevention, comprising:
- a processor for executing a computer program encoded in a memory;
  
  the memory encoded with the computer program, the computer program comprising;
  
  a component that receives a set of the items;
  
  a component that identifies intended recipients of the items, and tags a subset of the items to be polled, the subset of items corresponding to a subset of recipients that are known spam fighting users; and
  
  a feedback component that receives information relating to the spam fighting users' classification of the polled items, and employs the information in connection with training a spam filter and populating a spam list unless the spam fighting users have been determined untrustworthy.

14. An email architecture employing a system that facilitates classifying items in connection with spam prevention, comprising:
- a processor for executing a computer program encoded in a memory;
  
  the memory encoded with the computer program, the computer program comprising;
  
  a component that receives a set of the items;
  
  a component that identifies intended recipients of the items, and tags a subset of the items to be polled, the subset of items corresponding to a subset of recipients that are known spam fighting users; and
  
  a feedback component that receives information relating to the spam fighting users' classification of the polled items, and employs the information in connection with training a spam filter and populating a spam list unless the spam fighting users have been determined untrustworthy.
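The feedback component recited in claims 3, 13, and 14 can be pictured with a short sketch. It assumes, purely for illustration, that a "spam list" is a set of sender addresses and that untrustworthy users are supplied as a set of names; the claims leave both open, and `apply_feedback` is a hypothetical name.

```python
def apply_feedback(votes, untrusted):
    """Turn votes into training examples and a spam list, skipping untrusted users.

    votes is an iterable of (user, sender, text, label) tuples, where label
    is 'spam' or 'good'. Returns (training_examples, spam_list).
    """
    training = []
    spam_list = set()
    for user, sender, text, label in votes:
        if user in untrusted:
            continue  # classifications from untrustworthy users are not used
        training.append((text, label))
        if label == "spam":
            spam_list.add(sender)  # assumption: the list holds sender addresses
    return training, spam_list


votes = [
    ("alice", "deals@bulk.example", "cheap pills online", "spam"),
    ("bob", "carol@corp.example", "project meeting notes", "good"),
    ("mallory", "carol@corp.example", "quarterly report", "spam"),
]
training, spam_list = apply_feedback(votes, untrusted={"mallory"})
```

In this example the vote from the untrusted user is dropped, so the legitimate sender does not end up on the spam list.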

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Slawson, Dean A.; Rounthwaite, Robert L.; Goodman, Joshua T.; Mehr, John D.; Heckerman, David E.; Rupersburg, Micah C.; Howell, Nathan D.
Primary Examiner(s): Tran; Philip B
Application Number: US11/743,466
Publication Number: US 20070208856A1
Time in Patent Office: 797 Days
Field of Search: 709/206-207, 709/223-224, 709/217, 713/154
US Class Current: 709/206
CPC Class Codes:
G06Q 10/107 Computer-aided management o...
H04L 51/212 using filtering or selectiv...
