Online adaptive filtering of messages

US 9,270,625 B2
Filed: 08/05/2014
Issued: 02/23/2016
Est. Priority Date: 07/21/2003
Status: Expired due to Term

Abstract

In general, a spam filtering system with two or more stages is used to filter spam in an e-mail system. One stage includes a global e-mail classifier that classifies e-mail as it enters the e-mail system. The parameters of the global e-mail classifier generally may be determined by the policies of the e-mail system owner and generally are set to classify as spam only those e-mails that are likely to be considered spam by a significant number of users of the e-mail system. Another stage includes personal e-mail classifiers at the individual mailboxes of the e-mail system users. The parameters of the personal e-mail classifiers generally are set by the users through retraining, such that the personal e-mail classifiers are refined to track the subjective perceptions of their respective users as to what e-mails are spam e-mails. Retraining data for the personal e-mail classifiers may be aggregated, and a subset of the aggregate may be chosen for use in retraining the global e-mail classifier.
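
The two-stage arrangement described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the keyword-ratio scorer, the threshold values, and the routing labels `spam_at_gateway`, `spam_folder`, and `inbox` do not come from the patent. The only properties taken from the source are that each stage compares a score to a threshold and that, per claim 1, the global threshold is higher than the personal one.

```python
from dataclasses import dataclass
from typing import Callable, Set

# A scorer maps message text to a spam score in [0, 1] (hypothetical type).
Scorer = Callable[[str], float]


@dataclass
class ScoringClassifier:
    score: Scorer
    threshold: float

    def is_spam(self, message: str) -> bool:
        # A message is spam when its score exceeds the threshold.
        return self.score(message) > self.threshold


def keyword_scorer(keywords: Set[str]) -> Scorer:
    """Toy stand-in for a trained model: fraction of words that are keywords."""
    def score(message: str) -> float:
        words = message.lower().split()
        if not words:
            return 0.0
        return sum(1 for w in words if w in keywords) / len(words)
    return score


def route(message: str, global_clf: ScoringClassifier,
          personal_clf: ScoringClassifier) -> str:
    """Apply the gateway stage first, then the mailbox stage."""
    if global_clf.is_spam(message):
        return "spam_at_gateway"   # assumed label for the global stage
    if personal_clf.is_spam(message):
        return "spam_folder"       # assumed label for the personal stage
    return "inbox"


# Assumed thresholds: the global one is higher, so it flags fewer messages.
global_clf = ScoringClassifier(keyword_scorer({"viagra", "lottery"}), 0.5)
personal_clf = ScoringClassifier(
    keyword_scorer({"viagra", "lottery", "newsletter"}), 0.2)

for text in ("viagra lottery viagra",
             "weekly newsletter from shop",
             "lunch tomorrow at noon"):
    print(text, "->", route(text, global_clf, personal_clf))
```

In this sketch the stricter global stage only catches messages that score very high, while the personal stage, with its lower threshold and user-specific keywords, catches messages that only this user treats as spam.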

26 Claims

1. A method of operating a spam filtering system in a messaging system that includes a message gateway and individual message boxes for users of the system, the method comprising:
- aggregating personal retraining data used to retrain a personal, scoring e-mail classifier that classifies messages delivered to an individual message box as spam when a personal classifying score for the messages exceeds a personal classifier threshold for classifying the messages as spam, wherein the personal retraining data for the individual message box is based on a user's feedback about the messages delivered to the individual message box;
- selecting a subset of the aggregated personal retraining data as global retraining data for retraining a global, scoring e-mail classifier that classifies messages received at a message gateway as spam when a global classifying score for the messages exceeds a global classifier threshold for classifying the messages as spam, the global classifier threshold being higher than the personal classifier threshold; and
- retraining the global, scoring e-mail classifier based on the global retraining data to adjust which of the messages received at the message gateway are classified as spam.
  - 2. The method of claim 1 wherein the user's feedback is explicit.
  - 3. The method of claim 2 wherein the explicit user's feedback comprises one or more of the following:
    - the user reporting a message as spam;
    - moving a message from an inbox folder in the individual message box to a spam folder in the individual message box; and
    - moving a message from the spam folder in the individual message box to the inbox folder in the individual message box.
  - 4. The method of claim 1 wherein the user's feedback is implicit.
  - 5. The method of claim 4 wherein the implicit user's feedback comprises one or more of the following:
    - keeping a message as new after the message has been read;
    - forwarding a message;
    - replying to a message;
    - printing a message;
    - adding a sender of a message to an address book; and
    - not explicitly changing a classification of a message.
  - 6. The method of claim 1 wherein the aggregated personal retraining data comprises messages delivered to individual message boxes.
  - 7. The method of claim 1 wherein the user's feedback comprises changing a classification of a message.
  - 8. The method of claim 7 wherein selecting the subset of the aggregated personal retraining data comprises selecting a message as global retraining data when a particular number of users change the classification of the message.
  - 9. The method of claim 1 wherein the messaging system is an email messaging system.
  - 10. The method of claim 1 wherein the messaging system is an instant messaging system.
  - 11. The method of claim 1 wherein the messaging system is an SMS messaging system.
  - 12. The method of claim 1 wherein, to classify a message, the global, scoring e-mail classifier uses a global internal model to determine a global probability measure for the message and compares the global probability measure to the global classifier threshold.
  - 13. The method of claim 1 wherein, to classify a message, the personal, scoring e-mail classifier uses a personal internal model to determine a personal probability measure for the message and compares the personal probability measure to the personal classifier threshold, the method further comprising initializing the personal internal model using the global internal model.
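
Claims 12 and 13 describe classifiers that compute a probability measure from an internal model, compare it to a threshold, and initialize the personal model from the global one. The Python sketch below is one possible reading, not the patent's implementation: the claims do not name a model type, so a small token-count model with add-one smoothing (in the style of naive Bayes) is swapped in as an assumption. The class `TokenModel`, the labels `spam` and `ham`, and both threshold values are hypothetical.

```python
import copy
import math
from collections import Counter


class TokenModel:
    """Hypothetical internal model: per-class token counts, add-one smoothing."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def retrain(self, examples):
        """Fold labeled (text, label) pairs into the counts."""
        for text, label in examples:
            self.tokens[label].update(text.lower().split())
            self.docs[label] += 1

    def probability(self, text):
        """Return a spam probability measure in (0, 1) for the text."""
        vocab = len(set(self.tokens["spam"]) | set(self.tokens["ham"]))
        spam_total = sum(self.tokens["spam"].values()) + vocab + 1
        ham_total = sum(self.tokens["ham"].values()) + vocab + 1
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in text.lower().split():
            p_spam = (self.tokens["spam"][tok] + 1) / spam_total
            p_ham = (self.tokens["ham"][tok] + 1) / ham_total
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))


GLOBAL_THRESHOLD = 0.9    # assumed value; higher than the personal one
PERSONAL_THRESHOLD = 0.6  # assumed value

global_model = TokenModel()
global_model.retrain([("win free lottery", "spam"),
                      ("meeting agenda attached", "ham")])

# Claim 13: the personal model starts as a copy of the global model ...
personal_model = copy.deepcopy(global_model)
# ... and then diverges as this user's feedback is folded in.
personal_model.retrain([("weekly newsletter deals", "spam")])

msg = "newsletter deals"
print(global_model.probability(msg) > GLOBAL_THRESHOLD)
print(personal_model.probability(msg) > PERSONAL_THRESHOLD)
```

Copying the global model gives a new mailbox a usable starting point, and the deep copy keeps personal retraining from leaking back into the global counts.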

14. A non-transitory computer-usable medium storing a computer program for operating a spam filtering system in a messaging system that includes a message gateway and individual message boxes for users of the system, the computer program comprising instructions for causing at least one processor to:
- aggregate personal retraining data used to retrain a personal, scoring e-mail classifier that classifies messages delivered to an individual message box as spam when a personal classifying score for the messages exceeds a personal classifier threshold for classifying the messages as spam, wherein the personal retraining data for the individual message box is based on a user's feedback about the messages delivered to the user's individual message box;
- select a subset of the aggregated personal retraining data as global retraining data for retraining a global, scoring e-mail classifier that classifies messages received at a message gateway as spam when a global classifying score for the messages exceeds a global classifier threshold for classifying the messages as spam, the global classifier threshold being higher than the personal classifier threshold; and
- retrain the global, scoring e-mail classifier based on the global retraining data so as to adjust which of the messages received at the message gateway are classified as spam.
  - 15. The medium of claim 14 wherein the user's feedback is explicit.
  - 16. The medium of claim 15 wherein the explicit user's feedback comprises one or more of the following:
    - the user reporting a first message as spam;
    - moving the first message from an inbox folder in the individual message box to a spam folder in the individual message box; and
    - moving the first message from the spam folder in the individual message box to the inbox folder in the individual message box.
  - 17. The medium of claim 14 wherein the user's feedback is implicit.
  - 18. The medium of claim 17 wherein the implicit user's feedback comprises one or more of the following:
    - keeping a first message as new after the message has been read;
    - forwarding the first message;
    - replying to the first message;
    - printing the first message;
    - adding a sender of the first message to an address book; and
    - not explicitly changing a classification of the first message.
  - 19. The medium of claim 14 wherein the aggregated personal retraining data comprises messages delivered to individual message boxes.
  - 20. The medium of claim 14 wherein the user's feedback comprises changing a classification of a first message.
  - 21. The medium of claim 20 wherein to select the subset of the aggregated personal retraining data, the computer program further comprises instructions for causing a processor to select the first message as global retraining data when a particular number of users change the classification of the first message.
  - 22. The medium of claim 14 wherein the messaging system is an email messaging system.
  - 23. The medium of claim 14 wherein the messaging system is an instant messaging system.
  - 24. The medium of claim 14 wherein the messaging system is an SMS messaging system.
  - 25. The medium of claim 14 wherein, to classify a first message, the global, scoring e-mail classifier uses a global internal model to determine a global probability measure for the first message and compares the global probability measure to the global classifier threshold.
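
The aggregate-then-select steps of claim 14, with the selection rule of claim 21 (a message becomes global retraining data when a particular number of users change its classification), might be sketched as follows in Python. The `Feedback` record, its field names, the per-mailbox dictionary layout, and the `min_users` parameter are all hypothetical; the claims do not specify a data format or a particular number.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Feedback(NamedTuple):
    """Hypothetical record of one user's feedback on one message."""
    user_id: str
    message_id: str
    label: str                    # "spam" or "not_spam" after the feedback
    changed_classification: bool  # True if the user overrode the filter


def aggregate(per_mailbox: Dict[str, List[Feedback]]) -> List[Feedback]:
    """Pool the personal retraining data from every mailbox."""
    return [fb for events in per_mailbox.values() for fb in events]


def select_global_retraining_data(
        aggregated: List[Feedback], min_users: int) -> List[Tuple[str, str]]:
    """Keep (message_id, label) pairs that at least min_users distinct
    users produced by changing the message's classification."""
    users_by_key = defaultdict(set)
    for fb in aggregated:
        if fb.changed_classification:
            users_by_key[(fb.message_id, fb.label)].add(fb.user_id)
    return sorted(key for key, users in users_by_key.items()
                  if len(users) >= min_users)


mailboxes = {
    "alice": [Feedback("alice", "m1", "spam", True)],
    "bob": [Feedback("bob", "m1", "spam", True),
            Feedback("bob", "m1", "spam", True)],   # repeat counts once
    "carol": [Feedback("carol", "m2", "spam", True),
              Feedback("carol", "m1", "not_spam", False)],
}

selected = select_global_retraining_data(aggregate(mailboxes), min_users=2)
print(selected)
```

Counting distinct users rather than raw events keeps a single user from pushing a message into the global retraining set by reporting it repeatedly, which fits the abstract's aim of having the global classifier reflect what a significant number of users consider spam.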

26. An apparatus for operating a spam filtering system in a messaging system that includes a message gateway and individual message boxes for users of the system, the apparatus comprising:
- at least one memory that stores personal retraining data for an individual message box used to retrain a personal, scoring e-mail classifier that classifies messages delivered to an individual message box as spam when a personal classifying score for the messages exceeds a personal classifier threshold for classifying the messages as spam, wherein the personal retraining data is based on a user's feedback about messages delivered to the individual message box over one or more network connections;
- at least one memory that stores a set of instructions; and
- at least one processor that executes the set of instructions to (i) aggregate the received personal retraining data, (ii) select a subset of the aggregated personal retraining data as global retraining data for retraining a global, scoring e-mail classifier that classifies messages received at a message gateway as spam when a score for the messages exceeds a global classifier threshold for classifying the messages as spam, the global classifier threshold being higher than the personal classifier threshold, and (iii) retrain the global, scoring e-mail classifier based on the global retraining data so as to adjust which of the messages received at the message gateway are classified as spam.

Current Assignee: Yahoo Assets LLC
Original Assignee: AOL Inc. (Apollo Global Management, Inc.)
Inventors: Alspector, Joshua; Kolcz, Aleksander
Primary Examiner(s): Donabed, Ninos
Application Number: US14/452,224
Publication Number: US 20140344387A1
Time in Patent Office: 567 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
- G06F 16/353: into predefined classes
- G06Q 10/107: Computer-aided management o...
- H04L 51/212: using filtering or selectiv...
- H04L 51/48: Message addressing, e.g. ad...
