ONLINE ADAPTIVE FILTERING OF MESSAGES
Abstract
In general, a spam filtering system with two or more stages is used to filter spam in an e-mail system. One stage includes a global e-mail classifier that classifies e-mail as it enters the e-mail system. The parameters of the global e-mail classifier generally may be determined by the policies of the e-mail system owner and generally are set to classify as spam only those e-mails that are likely to be considered spam by a significant number of users of the e-mail system. Another stage includes personal e-mail classifiers at the individual mailboxes of the e-mail system users. The parameters of the personal e-mail classifiers generally are set by the users through retraining, such that the personal e-mail classifiers are refined to track the subjective perceptions of their respective users as to what e-mails are spam e-mails. Retraining data for the personal e-mail classifiers may be aggregated and a subset of the aggregate may be chosen for use in retraining the global e-mail classifier.
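The two-stage arrangement can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `TwoStageFilter`, the helper `keyword_scorer`, the score range, and the threshold values 0.95 and 0.60 are all assumptions chosen for illustration. The only property borrowed from the document is that the gateway (global) stage uses a higher spam threshold than the per-mailbox (personal) stage, as the claims recite.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A scorer maps message text to a spam score; the [0, 1] range is an assumption.
Scorer = Callable[[str], float]


@dataclass
class TwoStageFilter:
    """Hypothetical two-stage filter: a strict gateway stage, then a per-user stage."""
    global_scorer: Scorer
    personal_scorers: Dict[str, Scorer]
    global_threshold: float = 0.95    # assumed value; higher, so only near-certain spam is stopped
    personal_threshold: float = 0.60  # assumed value; lower, reflecting one user's own preferences

    def route(self, user: str, message: str) -> str:
        # Stage 1: classify the message as it enters the system.
        if self.global_scorer(message) > self.global_threshold:
            return "blocked_at_gateway"
        # Stage 2: classify at the individual mailbox, if the user has a classifier.
        scorer = self.personal_scorers.get(user)
        if scorer is not None and scorer(message) > self.personal_threshold:
            return "spam_folder"
        return "inbox"


def keyword_scorer(weights: Dict[str, float]) -> Scorer:
    """Toy scorer (illustrative only): sum of per-word weights, capped at 1.0."""
    def score(message: str) -> float:
        words = message.lower().split()
        return min(1.0, sum(weights.get(word, 0.0) for word in words))
    return score
```

With a global scorer that weights only widely reported terms and a personal scorer trained on one user's feedback, the same message can reach one user's inbox and another user's spam folder, which is the behavior the abstract describes.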
58 Claims
1-15. (canceled)
16. A method of operating a spam filtering system in a messaging system that includes a message gateway and individual message boxes for users of the system, the method comprising:
aggregating personal retraining data used to retrain personal, scoring e-mail classifiers that classify messages delivered to the individual message boxes as spam when a score for the messages exceeds a first threshold for classifying the messages as spam, wherein personal retraining data for an individual message box is based on a user's feedback about the classes of messages in the user's individual message box;
selecting a subset of the aggregated personal retraining data as global retraining data for retraining a global, scoring e-mail classifier that classifies messages received at a message gateway as spam when a score for the messages exceeds a second threshold for classifying the messages as spam, the second threshold being higher than the first threshold; and
retraining the global, scoring e-mail classifier based on the global retraining data so as to adjust which messages are classified as spam.
(Dependent claims: 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29)
22. (canceled)
30-43. (canceled)
44. A non-transitory computer-usable medium storing a computer program for operating a spam filtering system in a messaging system that includes a message gateway and individual message boxes for users of the system, the computer program comprising instructions for causing at least one processor to:
aggregate personal retraining data used to retrain personal, scoring e-mail classifiers that classify messages delivered to the individual message boxes as spam when a score for the messages exceeds a first threshold for classifying the messages as spam, wherein personal retraining data for an individual message box is based on a user's feedback about the classes of messages in the user's individual message box;
select a subset of the aggregated personal retraining data as global retraining data for retraining a global, scoring e-mail classifier that classifies messages received at a message gateway as spam when a score for the messages exceeds a second threshold for classifying the messages as spam, the second threshold being higher than the first threshold; and
retrain the global, scoring e-mail classifier based on the global retraining data so as to adjust which messages are classified as spam.
(Dependent claims: 45, 46, 47, 48, 49, 51, 52, 53, 54, 55, 56)
50. (canceled)
57. (canceled)
58. An apparatus for operating a spam filtering system in a messaging system that includes a message gateway and individual message boxes for users of the system, the apparatus comprising:
a network interface configured to receive personal retraining data for an individual message box used to retrain personal, scoring e-mail classifiers that classify messages delivered to the individual message boxes as spam when a score for the messages exceeds a first threshold for classifying the messages as spam, wherein the personal retraining data is based on a user's feedback about the classes of messages in the user's individual message box over one or more network connections; and
at least one processor configured by a set of instructions to (i) aggregate the received personal retraining data, (ii) select a subset of the aggregated personal retraining data as global retraining data for retraining a global, scoring e-mail classifier that classifies messages received at a message gateway as spam when a score for the messages exceeds a second threshold for classifying the messages as spam, the second threshold being higher than the first threshold, and (iii) retrain the global, scoring e-mail classifier based on the global retraining data so as to adjust which messages are classified as spam.
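The aggregate, select, and retrain steps recited in the independent claims can be sketched in Python as follows. This is an illustrative reading only, not the claimed implementation. The claims do not say how the subset is chosen, so the rule used here (keep a message only when at least `min_users` distinct users gave it the same label and no user disagreed) is an assumption loosely motivated by the abstract's "significant number of users". The names `aggregate`, `select_global_subset`, and `TokenCountClassifier`, and the smoothed token-ratio score, are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Feedback = Tuple[str, str, str]  # (user_id, message_text, label), label is "spam" or "ham"


def aggregate(feedback_by_mailbox: Dict[str, List[Tuple[str, str]]]) -> List[Feedback]:
    """Pool the personal retraining data from every mailbox into one list."""
    return [(user, text, label)
            for user, items in feedback_by_mailbox.items()
            for text, label in items]


def select_global_subset(aggregated: List[Feedback], min_users: int = 2) -> List[Tuple[str, str]]:
    """Assumed selection rule: keep messages labeled consistently by >= min_users users."""
    spam_voters, ham_voters = defaultdict(set), defaultdict(set)
    for user, text, label in aggregated:
        (spam_voters if label == "spam" else ham_voters)[text].add(user)
    selected = []
    for text in sorted(set(spam_voters) | set(ham_voters)):
        n_spam = len(spam_voters.get(text, ()))
        n_ham = len(ham_voters.get(text, ()))
        if n_spam >= min_users and n_ham == 0:
            selected.append((text, "spam"))
        elif n_ham >= min_users and n_spam == 0:
            selected.append((text, "ham"))
    return selected


class TokenCountClassifier:
    """Toy scoring classifier (illustrative): mean smoothed spam ratio over tokens."""

    def __init__(self, threshold: float):
        self.threshold = threshold
        self.spam_counts: Dict[str, int] = defaultdict(int)
        self.ham_counts: Dict[str, int] = defaultdict(int)

    def retrain(self, examples: List[Tuple[str, str]]) -> None:
        # Update per-token counts from the selected global retraining data.
        for text, label in examples:
            counts = self.spam_counts if label == "spam" else self.ham_counts
            for token in set(text.lower().split()):
                counts[token] += 1

    def score(self, text: str) -> float:
        tokens = set(text.lower().split())
        if not tokens:
            return 0.0
        # Add-one smoothing: an unseen token contributes a neutral 0.5.
        ratios = [(self.spam_counts.get(t, 0) + 1)
                  / (self.spam_counts.get(t, 0) + self.ham_counts.get(t, 0) + 2)
                  for t in tokens]
        return sum(ratios) / len(ratios)

    def is_spam(self, text: str) -> bool:
        return self.score(text) > self.threshold
```

Under this assumed rule, a message that users disagree about stays out of the global retraining data and is left to the personal classifiers, while retraining on the agreed-upon subset changes which messages the global classifier scores above its threshold.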
Specification