Online adaptive filtering of messages
First Claim
1. A method of handling messages in a messaging system that includes a message gateway and individual message boxes for users of the system, wherein a message addressed to a user is delivered to the user's message box after passing through the message gateway, the method comprising:
knowingly biasing a global, scoring e-mail classifier relative to a personal, scoring e-mail classifier such that the global, scoring e-mail classifier is less stringent than the personal, scoring e-mail classifier as to what is classified as spam, wherein the global, scoring e-mail classifier and the personal, scoring e-mail classifier are probabilistic e-mail classifiers such that, to classify a message, the global, scoring e-mail classifier and the personal, scoring e-mail classifier use respective internal models to determine a probability measure for the message and compare the probability measure to a classification threshold;
receiving messages at the message gateway;
inputting the received messages into the global, scoring e-mail classifier to classify the input messages as spam or non-spam;
handling at least one of the messages input into the global, scoring e-mail classifier based on whether the global, scoring e-mail classifier classified the at least one message as spam or non-spam;
outputting at least one message from the global, scoring e-mail classifier, wherein the outputted message has been classified as non-spam by the global, scoring e-mail classifier;
inputting the outputted message from the global, scoring e-mail classifier into the personal, scoring e-mail classifier to classify the at least one outputted message as spam or non-spam;
handling the at least one outputted message input into the personal, scoring e-mail classifier based on whether the personal, scoring e-mail classifier classified the at least one outputted message as spam or non-spam;
receiving an indication from a user to change the classification of the at least one outputted message;
in response to the indication, changing the classification of the at least one outputted message;
generating retraining data based on the change to the classification of the at least one outputted message; and
retraining the personal, scoring e-mail classifier based on the generated retraining data such that the personal, scoring e-mail classifier's internal model is refined to track the user's subjective perceptions as to what messages constitute spam messages.
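The steps of the claim can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the claim requires only an internal model, a probability measure, and a classification threshold, so the toy naive Bayes word model, the names `ScoringClassifier`, `deliver`, and `user_reclassifies`, the threshold values 0.9 and 0.6, and the routing labels are all hypothetical.

```python
import math
from collections import Counter


class ScoringClassifier:
    """Toy probabilistic classifier (naive Bayes over word counts, an assumption)."""

    def __init__(self, threshold):
        self.threshold = threshold  # classification threshold on P(spam)
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Update the internal model with one labeled message."""
        self.counts[label].update(text.lower().split())
        self.docs[label] += 1

    def spam_probability(self, text):
        """Return the probability measure for the message."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) or 1
        n_spam = sum(self.counts["spam"].values())
        n_ham = sum(self.counts["ham"].values())
        # Laplace-smoothed log odds of spam versus non-spam.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for word in text.lower().split():
            p_spam = (self.counts["spam"][word] + 1) / (n_spam + vocab)
            p_ham = (self.counts["ham"][word] + 1) / (n_ham + vocab)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))

    def is_spam(self, text):
        return self.spam_probability(text) >= self.threshold


def deliver(message, global_clf, personal_clf):
    """Route a message through the gateway stage, then the mailbox stage."""
    if global_clf.is_spam(message):
        return "dropped_at_gateway"
    if personal_clf.is_spam(message):
        return "spam_folder"
    return "inbox"


def user_reclassifies(message, new_label, personal_clf):
    """Turn a user's correction into retraining data for the personal model only."""
    personal_clf.train(message, new_label)


# Bias the global classifier to be less stringent: it needs a higher
# spam probability than the personal classifier before calling a message spam.
global_clf = ScoringClassifier(threshold=0.9)
personal_clf = ScoringClassifier(threshold=0.6)
assert global_clf.threshold > personal_clf.threshold

for clf in (global_clf, personal_clf):
    clf.train("win free money now", "spam")
    clf.train("meeting agenda for monday", "ham")

before = deliver("weekly newsletter deals", global_clf, personal_clf)
user_reclassifies("weekly newsletter deals", "spam", personal_clf)
after = deliver("weekly newsletter deals", global_clf, personal_clf)
```

In this sketch the user's correction changes only the personal model, so the same message still passes the gateway but is routed differently at the mailbox afterwards.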
Abstract
In general, a spam filtering system with two or more stages is used to filter spam in an e-mail system. One stage includes a global e-mail classifier that classifies e-mail as it enters the e-mail system. The parameters of the global e-mail classifier generally may be determined by the policies of the e-mail system owner and generally are set to classify as spam only those e-mails that are likely to be considered spam by a significant number of users of the e-mail system. Another stage includes personal e-mail classifiers at the individual mailboxes of the e-mail system users. The parameters of the personal e-mail classifiers generally are set by the users through retraining, such that each personal e-mail classifier is refined to track the subjective perceptions of its respective user as to what e-mails are spam e-mails. Retraining data for the personal e-mail classifiers may be aggregated, and a subset of the aggregate may be chosen for use in retraining the global e-mail classifier.
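The aggregation step in the last sentence of the abstract could be sketched as follows. The abstract does not say how the subset is chosen, so the selection rule used here (keep a message only when at least `min_users` distinct users marked it as spam and nobody marked it as non-spam) is purely an assumption, as are the function name `select_global_retraining_data` and the tuple layout of the feedback records.

```python
from collections import defaultdict


def select_global_retraining_data(feedback, min_users=3):
    """Pick a subset of aggregated personal retraining data for the global model.

    feedback: iterable of (user_id, message_id, label) tuples, where label is
    "spam" or "ham" (hypothetical record layout).
    """
    votes = defaultdict(lambda: {"spam": set(), "ham": set()})
    for user_id, message_id, label in feedback:
        votes[message_id][label].add(user_id)

    selected = []
    for message_id, by_label in votes.items():
        # Assumed rule: broad agreement on spam and no dissenting user.
        if len(by_label["spam"]) >= min_users and not by_label["ham"]:
            selected.append((message_id, "spam"))
    return sorted(selected)


feedback = [
    ("u1", "m1", "spam"), ("u2", "m1", "spam"), ("u3", "m1", "spam"),
    ("u1", "m2", "spam"), ("u2", "m2", "ham"), ("u3", "m2", "spam"), ("u4", "m2", "spam"),
    ("u1", "m3", "spam"),
]
subset = select_global_retraining_data(feedback)
```

A rule of this kind keeps one user's idiosyncratic judgments out of the global model, in line with the abstract's statement that the global classifier should flag only mail that a significant number of users would consider spam.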
24 Claims
1. A method of handling messages in a messaging system that includes a message gateway and individual message boxes for users of the system, wherein a message addressed to a user is delivered to the user's message box after passing through the message gateway, the method comprising:
knowingly biasing a global, scoring e-mail classifier relative to a personal, scoring e-mail classifier such that the global, scoring e-mail classifier is less stringent than the personal, scoring e-mail classifier as to what is classified as spam, wherein the global, scoring e-mail classifier and the personal, scoring e-mail classifier are probabilistic e-mail classifiers such that, to classify a message, the global, scoring e-mail classifier and the personal, scoring e-mail classifier use respective internal models to determine a probability measure for the message and compare the probability measure to a classification threshold;
receiving messages at the message gateway;
inputting the received messages into the global, scoring e-mail classifier to classify the input messages as spam or non-spam;
handling at least one of the messages input into the global, scoring e-mail classifier based on whether the global, scoring e-mail classifier classified the at least one message as spam or non-spam;
outputting at least one message from the global, scoring e-mail classifier, wherein the outputted message has been classified as non-spam by the global, scoring e-mail classifier;
inputting the outputted message from the global, scoring e-mail classifier into the personal, scoring e-mail classifier to classify the at least one outputted message as spam or non-spam;
handling the at least one outputted message input into the personal, scoring e-mail classifier based on whether the personal, scoring e-mail classifier classified the at least one outputted message as spam or non-spam;
receiving an indication from a user to change the classification of the at least one outputted message;
in response to the indication, changing the classification of the at least one outputted message;
generating retraining data based on the change to the classification of the at least one outputted message; and
retraining the personal, scoring e-mail classifier based on the generated retraining data such that the personal, scoring e-mail classifier's internal model is refined to track the user's subjective perceptions as to what messages constitute spam messages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer-readable medium including a set of instructions which, when executed by a processor, performs a method of handling messages in a messaging system that includes a message gateway and individual message boxes for users of the system, wherein a message addressed to a user is delivered to the user's message box after passing through the message gateway, the method comprising:
knowingly biasing a global, scoring e-mail classifier relative to a personal, scoring e-mail classifier such that the global, scoring e-mail classifier is less stringent than the personal, scoring e-mail classifier as to what is classified as spam, wherein the global, scoring e-mail classifier and the personal, scoring e-mail classifier are probabilistic e-mail classifiers such that, to classify a message, the global, scoring e-mail classifier and the personal, scoring e-mail classifier use respective internal models to determine a probability measure for the message and compare the probability measure to a classification threshold;
receiving messages at the message gateway;
inputting the received messages into the global, scoring e-mail classifier to classify the input messages as spam or non-spam;
handling at least one of the messages input into the global, scoring e-mail classifier based on whether the global, scoring e-mail classifier classified the at least one message as spam or non-spam;
outputting at least one message from the global, scoring e-mail classifier, wherein the outputted message has been classified as non-spam by the global, scoring e-mail classifier;
inputting the outputted message from the global, scoring e-mail classifier into the personal, scoring e-mail classifier to classify the at least one outputted message as spam or non-spam;
handling the at least one outputted message input into the personal, scoring e-mail classifier based on whether the personal, scoring e-mail classifier classified the at least one outputted message as spam or non-spam;
receiving an indication from a user to change the classification of the at least one outputted message;
in response to the indication, changing the classification of the at least one outputted message;
generating retraining data based on the change to the classification of the at least one outputted message; and
retraining the personal, scoring e-mail classifier based on the generated retraining data such that the personal, scoring e-mail classifier's internal model is refined to track the user's subjective perceptions as to what messages constitute spam messages.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A system for handling messages in a messaging system that includes a message gateway and individual message boxes for users of the system, wherein a message addressed to a user is delivered to the user's message box after passing through the message gateway, the system comprising:
a storage medium that stores a set of instructions;
at least one processor that executes the set of instructions to perform a method, the method comprising;
knowingly biasing a global, scoring e-mail classifier relative to a personal, scoring e-mail classifier such that the global, scoring e-mail classifier is less stringent than the personal, scoring e-mail classifier as to what is classified as spam, wherein the global, scoring e-mail classifier and the personal, scoring e-mail classifier are probabilistic e-mail classifiers such that, to classify a message, the global, scoring e-mail classifier and the personal, scoring e-mail classifier use respective internal models to determine a probability measure for the message and compare the probability measure to a classification threshold;
receiving messages at the message gateway;
inputting the received messages into the global, scoring e-mail classifier to classify the input messages as spam or non-spam;
handling at least one of the messages input into the global, scoring e-mail classifier based on whether the global, scoring e-mail classifier classified the at least one message as spam or non-spam;
outputting at least one message from the global, scoring e-mail classifier, wherein the outputted message has been classified as non-spam by the global, scoring e-mail classifier;
inputting the outputted message from the global, scoring e-mail classifier into the personal, scoring e-mail classifier to classify the at least one outputted message as spam or non-spam;
handling the at least one outputted message input into the personal, scoring e-mail classifier based on whether the personal, scoring e-mail classifier classified the at least one outputted message as spam or non-spam;
receiving an indication from a user to change the classification of the at least one outputted message;
in response to the indication, changing the classification of the at least one outputted message;
generating retraining data based on the change to the classification of the at least one outputted message; and
retraining the personal, scoring e-mail classifier based on the generated retraining data such that the personal, scoring e-mail classifier's internal model is refined to track the user's subjective perceptions as to what messages constitute spam messages.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
Specification