Statistical message classifier
First Claim
1. A method for filtering messages, the method comprising:
receiving a message over a network communication interface;
executing instructions stored in memory, the instructions being executed by a processor to;
process the received message using one or more reliable classifiers that are associated with a higher level of accuracy than at least one trained classifier from a plurality of available classifiers, wherein the one or more reliable classifiers are associated with a feature count,
classify the received message using the one or more reliable classifiers and the feature count,
track a feature of the classified message based on the classification, wherein the tracked feature and one or more other tracked features are stored in a table and the feature count accounts for a number of times the tracked feature appeared in the classified message, and
process the received message based on the classification, wherein processing of the received message includes blocking the received message when the received message is classified as spam or allowing the received message to be forwarded to a recipient when the message is classified as a good message;
receiving a new indication that the message is spam or good, the new indication regarding a different feature count associated with a different feature;
updating the trained classifier by updating the feature count in accordance with the different feature count in the new indication;
identifying that a subsequently received message is spam based on the updated feature count and a whitelist count, wherein the whitelist count is associated with a number of times that at least one of the feature or the different feature appears in one or more whitelisted messages; and
blocking the subsequently received message based on the subsequently received message being classified as spam in accordance with the updated feature count.
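The bookkeeping recited in claim 1 (a table of tracked features, per-feature counts, a whitelist count, and a block-or-forward decision) can be illustrated with a minimal Python sketch. The claim does not specify how features are extracted or how the counts are combined into a decision, so the names `tokenize`, `FeatureTable`, `track_whitelisted`, and `filter_message`, the word-level features, and the averaged-ratio scoring rule with a 0.5 threshold are all assumptions made for illustration, not the patented method.

```python
from collections import Counter


def tokenize(message):
    """Split a message into lowercase word features (an assumed feature definition)."""
    return message.lower().split()


class FeatureTable:
    """Hypothetical table of tracked features and their counts."""

    def __init__(self):
        self.spam = Counter()       # feature -> occurrences in messages classified as spam
        self.good = Counter()       # feature -> occurrences in messages classified as good
        self.whitelist = Counter()  # feature -> occurrences in whitelisted messages

    def track(self, features, label):
        """Add each feature's number of appearances in one classified message."""
        counts = Counter(features)
        target = self.spam if label == "spam" else self.good
        target.update(counts)

    def track_whitelisted(self, features):
        """Record feature appearances in a whitelisted message."""
        self.whitelist.update(Counter(features))

    def is_spam(self, features, threshold=0.5):
        """Assumed scoring rule: average the per-feature spam ratio, where
        whitelist appearances count as evidence against spam."""
        ratios = []
        for feature in set(features):
            spam_count = self.spam[feature]
            clean_count = self.good[feature] + self.whitelist[feature]
            if spam_count + clean_count == 0:
                continue  # feature never tracked, so it carries no evidence
            ratios.append(spam_count / (spam_count + clean_count))
        return bool(ratios) and sum(ratios) / len(ratios) > threshold


def filter_message(table, message):
    """Block a message scored as spam; otherwise forward it to the recipient."""
    return "blocked" if table.is_spam(tokenize(message)) else "forwarded"


table = FeatureTable()
table.track(tokenize("cheap pills cheap offer"), "spam")  # reliable classification
table.track(tokenize("offer letter attached"), "good")    # later indication updates counts
table.track_whitelisted(tokenize("offer accepted"))       # whitelisted message
print(filter_message(table, "cheap pills"))
print(filter_message(table, "offer letter"))
```

In this sketch the word "offer" appears once in spam but also in a good message and a whitelisted message, so the whitelist count pulls its ratio below the threshold and a later message containing it is forwarded rather than blocked.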
Abstract
A system and method are disclosed for improving a statistical message classifier. A message may be tested with a machine classifier, wherein the machine classifier is capable of making a classification on the message. In the event the message is classifiable by the machine classifier, the statistical message classifier is updated according to the reliable classification made by the machine classifier. The message may also be tested with a first classifier. In the event that the message is not classifiable by the first classifier, it is tested with a second classifier, wherein the second classifier is capable of making a second classification. In the event that the message is classifiable by the second classifier, the statistical message classifier is updated according to the second classification.
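The fallback order described in the abstract (try a first classifier, fall back to a second one only when the first cannot classify the message, and update the statistical classifier with whichever classification was made) can be sketched as follows. This is an illustrative sketch under stated assumptions: `classify_with_cascade`, `known_sender`, `known_spam_phrase`, and the convention that a classifier returns `None` when it cannot classify a message are hypothetical and are not taken from the patent.

```python
from typing import Callable, Iterable, Optional

# Assumed convention: a classifier returns "spam", "good", or None when it
# cannot classify the message.
Classifier = Callable[[str], Optional[str]]


def classify_with_cascade(
    message: str,
    classifiers: Iterable[Classifier],
    update: Callable[[str, str], None],
) -> Optional[str]:
    """Try each classifier in order; on the first classification, update the
    statistical classifier with it and stop."""
    for classifier in classifiers:
        label = classifier(message)
        if label is not None:
            update(message, label)  # train the statistical classifier
            return label
    return None  # no classifier could classify the message; nothing is updated


def known_sender(message: str) -> Optional[str]:
    """Hypothetical first classifier: trusts one known sender."""
    return "good" if message.startswith("From: alice@example.com") else None


def known_spam_phrase(message: str) -> Optional[str]:
    """Hypothetical second classifier: flags one known spam phrase."""
    return "spam" if "free money" in message.lower() else None


training_log = []  # stands in for the statistical classifier's update step
label = classify_with_cascade(
    "Claim your FREE MONEY",
    [known_sender, known_spam_phrase],
    lambda msg, lbl: training_log.append((msg, lbl)),
)
print(label, training_log)
```

Passing the update step in as a callback keeps the cascade independent of how the statistical classifier stores its counts.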
49 Citations
20 Claims
1. A method for filtering messages, the method comprising:
receiving a message over a network communication interface;
executing instructions stored in memory, the instructions being executed by a processor to;
process the received message using one or more reliable classifiers that are associated with a higher level of accuracy than at least one trained classifier from a plurality of available classifiers, wherein the one or more reliable classifiers are associated with a feature count,
classify the received message using the one or more reliable classifiers and the feature count,
track a feature of the classified message based on the classification, wherein the tracked feature and one or more other tracked features are stored in a table and the feature count accounts for a number of times the tracked feature appeared in the classified message, and
process the received message based on the classification, wherein processing of the received message includes blocking the received message when the received message is classified as spam or allowing the received message to be forwarded to a recipient when the message is classified as a good message;
receiving a new indication that the message is spam or good, the new indication regarding a different feature count associated with a different feature;
updating the trained classifier by updating the feature count in accordance with the different feature count in the new indication;
identifying that a subsequently received message is spam based on the updated feature count and a whitelist count, wherein the whitelist count is associated with a number of times that at least one of the feature or the different feature appears in one or more whitelisted messages; and
blocking the subsequently received message based on the subsequently received message being classified as spam in accordance with the updated feature count.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable storage medium having embodied thereon a program executable by a processor for performing a method for filtering messages, the method comprising:
receiving a message over a network communication interface;
processing the received message using one or more reliable classifiers that are associated with a higher level of accuracy than at least one other classifier from a plurality of available classifiers, wherein the one or more reliable classifiers are associated with a feature count;
classifying the received message using the one or more reliable classifiers and the feature count;
tracking a feature of the classified message based on the classification, wherein the tracked feature and one or more other tracked features are stored in a table and the feature count accounts for a number of times the tracked feature appeared in the classified message;
processing the received message based on the classification, wherein processing of the received message includes blocking the received message when the received message is classified as spam or allowing the received message to be forwarded to a recipient when the message is classified as a good message;
receiving a new indication that the message is spam or good, the new indication regarding a different feature count associated with a different feature;
updating the trained classifier by updating the feature count in accordance with the different feature count in the new indication;
identifying that a subsequently received message is spam based on the updated feature count and a whitelist count, wherein the whitelist count is associated with a number of times that at least one of the feature or the different feature appears in one or more whitelisted messages; and
blocking the subsequently received message based on the subsequently received message being classified as spam in accordance with the updated feature count.
Dependent claims: 16, 17, 18, 19
20. An apparatus for filtering received message, the apparatus comprising:
a processor that executes instructions out of the memory to;
process the received message using one or more reliable classifiers that are associated with a higher level of accuracy than at least one trained classifier from a plurality of available classifiers, wherein the one or more reliable classifiers are associated with a feature count,
classify the received message using the one or more reliable classifiers and the feature count,
track a feature of the classified message based on the classification, wherein the tracked feature and one or more other tracked features are stored in a table and the feature count accounts for a number of times the tracked feature appeared in the classified message, and
process the received message based on the classification, wherein processing of the received message includes blocking the received message when the received message is classified as spam or allowing the received message to be forwarded to a recipient when the message is classified as a good message;
a network interface that receives a new indication that the message is spam or good, the new indication regarding a different feature count associated with a different feature; and
memory that stores an update to the trained classifier, wherein the feature count is updated in accordance with the different feature count in the new indication, and wherein the processor identifies that a subsequently received message is spam based on the updated feature count and on a whitelist count, the whitelist count is associated with a number of times that at least one of the feature or the different feature appears in one or more whitelisted messages, and the processor blocks the subsequently received message based on the subsequently received message being classified as spam in accordance with the updated feature count.
Specification