Message classification using classifiers
Abstract
A system and method are disclosed for improving a statistical message classifier. A message may be tested with a machine classifier, wherein the machine classifier is capable of making a classification on the message. In the event the message is classifiable by the machine classifier, the statistical message classifier is updated according to the reliable classification made by the machine classifier. The message may also be tested with a first classifier. In the event that the message is not classifiable by the first classifier, it is tested with a second classifier, wherein the second classifier is capable of making a second classification. In the event that the message is classifiable by the second classifier, the statistical message classifier is updated according to the second classification.
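The update flow described in the abstract can be sketched as a cascade: each reliable classifier either returns a verdict or declines, and the first verdict is used to update the statistical classifier. The sketch below is only an illustration under stated assumptions. The names `StatisticalClassifier`, `train_from_cascade`, `whitelist_classifier`, and `blocklist_classifier`, the message dictionary layout, and the choice of sender lists as the "first" and "second" classifiers are all hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from typing import Callable, Iterable, List, Optional

# A reliable classifier returns "good", "spam", or None when it cannot decide.
Classifier = Callable[[dict], Optional[str]]


class StatisticalClassifier:
    """Keeps per-feature good/spam counts (hypothetical layout)."""

    def __init__(self) -> None:
        self.counts = defaultdict(lambda: {"good": 0, "spam": 0})

    def update(self, features: Iterable[str], label: str) -> None:
        # Count each distinct feature once per message.
        for feature in set(features):
            self.counts[feature][label] += 1


def train_from_cascade(
    message: dict, classifiers: List[Classifier], stats: StatisticalClassifier
) -> Optional[str]:
    """Try each reliable classifier in order and train on the first verdict."""
    for classify in classifiers:
        label = classify(message)
        if label is not None:
            stats.update(message["features"], label)
            return label
    return None  # no reliable verdict, so the statistics are left unchanged


# Illustrative stand-ins for the first and second classifiers (assumptions).
def whitelist_classifier(message: dict) -> Optional[str]:
    return "good" if message["sender"] in {"alice@example.com"} else None


def blocklist_classifier(message: dict) -> Optional[str]:
    return "spam" if message["sender"] in {"bulk@example.net"} else None
```

In this sketch a message that neither classifier can decide leaves the counts untouched, which mirrors the abstract's condition that the statistical classifier is updated only when a classification is actually made.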
21 Claims
1. A method for classifying a message, the method comprising:
maintaining a table of message features in memory, wherein each message feature corresponds to a good count and a spam count, and wherein the good count is based on a number of times the message feature is associated with a previously received message determined not to be unsolicited and the spam count is based on a number of times the message feature is associated with a previously received message determined to be unsolicited;
receiving a message for analysis;
determining that a sender address of the received message is not in a database of known sender addresses;
identifying one or more message features in the received message;
tracking user classification of previously received messages, wherein the user classification indicates whether the previously received messages are junk or unjunk, the user classification maintained in a table stored in memory;
computing a score for each message feature identified in the received message based on the good count and spam count associated with the identified message feature and user classification of received messages associated with the identified message feature;
determining that the received message is an unsolicited message based on the computed score; and
processing the received message according to the score derived from analysis of the identified message features in the received message.
Dependent claims: 2 through 17.
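The claim does not give a formula for the per-feature score, so the following is only a minimal sketch of one way the good count, spam count, and user junk/unjunk classifications could be combined. The add-one smoothing, the `USER_WEIGHT` and `THRESHOLD` constants, the averaging of feature scores, the action names, and all function and type names are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# feature -> (good_count, spam_count), from previously received messages
FeatureTable = Dict[str, Tuple[int, int]]
# feature -> (unjunk_count, junk_count), from tracked user classifications
UserTable = Dict[str, Tuple[int, int]]

USER_WEIGHT = 2.0  # assumption: a user action counts twice as much as a count
THRESHOLD = 0.5    # assumption: a mean score above this means unsolicited


def feature_score(feature: str, table: FeatureTable, user_table: UserTable) -> float:
    """Return a spam likelihood in (0, 1) for a single feature."""
    good, spam = table.get(feature, (0, 0))
    unjunk, junk = user_table.get(feature, (0, 0))
    spam_evidence = spam + USER_WEIGHT * junk
    good_evidence = good + USER_WEIGHT * unjunk
    # Add-one smoothing keeps a never-seen feature at a neutral 0.5.
    return (spam_evidence + 1.0) / (spam_evidence + good_evidence + 2.0)


def classify(
    sender: str,
    features: Iterable[str],
    known_senders: Set[str],
    table: FeatureTable,
    user_table: UserTable,
) -> Tuple[str, float]:
    """Return an (action, score) pair for a received message."""
    if sender in known_senders:
        # Assumption: known senders are delivered without scoring.
        return "deliver", 0.0
    scores = [feature_score(f, table, user_table) for f in set(features)]
    score = sum(scores) / len(scores) if scores else 0.5
    action = "quarantine" if score > THRESHOLD else "deliver"
    return action, score
```

With a table such as `{"viagra": (0, 9), "meeting": (8, 0)}` and two user "junk" actions recorded against `viagra`, a message from an unknown sender containing only `viagra` would be quarantined in this sketch, while one containing only `meeting` would be delivered.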
18. An apparatus for classifying a message, the apparatus comprising:
a processor configured to execute a program stored in memory, wherein execution of the program by the processor:
identifies one or more message features in a received message,
tracks user classification of previously received messages, wherein the user classification indicates whether the previously received messages are junk or unjunk and wherein the user classification is maintained in a table stored in memory,
computes a score for each identified message feature, the score based on a good count and spam count associated with the identified message feature and the user classification of previously received messages associated with the identified message feature, and
processes the received message according to the score derived from analysis of the one or more identified features in the received message;
memory configured to store information regarding the one or more identified message features of previously received messages, the information regarding the one or more identified message features used to compute the score for each identified message feature; and
a database of known sender addresses, wherein processing of the received message is further based on determining whether a sender of the received message has a known sender address.
Dependent claims: 19, 20, 21.
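As a rough illustration of how the claimed components might be wired together, the sketch below groups a user-classification table and a known-sender database in one object and routes a message based on both. It is a simplification under stated assumptions: the class `MessageFilter`, the scorer `junk_fraction`, the folder names, and the threshold are hypothetical, and the example scorer uses only the user junk/unjunk table, omitting the good and spam counts for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Set

UserTable = Dict[str, Dict[str, int]]  # feature -> {"junk": n, "unjunk": m}


@dataclass
class MessageFilter:
    """Hypothetical container for the user table and known-sender database."""

    score_fn: Callable[[Iterable[str], UserTable], float]
    known_senders: Set[str] = field(default_factory=set)
    user_table: UserTable = field(default_factory=dict)

    def record_user_action(self, features: Iterable[str], action: str) -> None:
        """Track a user marking a message as 'junk' or 'unjunk'."""
        if action not in ("junk", "unjunk"):
            raise ValueError("action must be 'junk' or 'unjunk'")
        for feature in set(features):
            entry = self.user_table.setdefault(feature, {"junk": 0, "unjunk": 0})
            entry[action] += 1

    def process(self, sender: str, features: Iterable[str], threshold: float = 0.5) -> str:
        """Route a message using the sender database first, then the score."""
        if sender in self.known_senders:
            return "inbox"
        score = self.score_fn(features, self.user_table)
        return "junk_folder" if score > threshold else "inbox"


def junk_fraction(features: Iterable[str], user_table: UserTable) -> float:
    """Illustrative scorer: share of user actions on these features that were 'junk'."""
    distinct = set(features)
    junk = sum(user_table.get(f, {}).get("junk", 0) for f in distinct)
    unjunk = sum(user_table.get(f, {}).get("unjunk", 0) for f in distinct)
    total = junk + unjunk
    return junk / total if total else 0.0
```

Passing the scorer in as a callable keeps the routing logic separate from whatever scoring rule is chosen, so a scorer that also uses good and spam counts could be substituted without changing `process`.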
Specification