Mail server probability spam filter
Abstract
Email senders may transmit emails over the internet to a mail server that handles email for a plurality of users (clients). The mail server may use a spam filter to remove spam and then transmit the filtered emails to the addressed clients. The spam filter may use a white list, a black list, a probability filter and a keyword filter. The probability filter may use a user mail corpus and a user spam corpus to create a user probability table that lists tokens and, for each token, the probability that an email containing it is spam. The probability filter may also use a general mail corpus and a general spam corpus to create a general probability table that likewise lists tokens and their corresponding spam probabilities. Tokens of an incoming email may be looked up in the user probability table and, if not found there, in the general probability table, in order to calculate the probability that the email is spam.
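As an illustration of how a probability table of this kind might be built from a mail corpus and a spam corpus, here is a minimal Python sketch. The text above does not give a formula, so the frequency-ratio estimate, the equal weighting of the two corpora, the clamping to the range 0.01 to 0.99, the simplistic tokenizer and the names `tokenize` and `build_probability_table` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word-like tokens (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


def build_probability_table(mail_corpus, spam_corpus):
    """Map each token to an estimate of P(spam | email contains the token).

    Assumption: the estimate is the share of spam emails containing the token
    divided by the sum of that share and the share of legitimate emails
    containing it, clamped so that no token is ever treated as conclusive.
    """
    # Count the number of emails (not occurrences) that contain each token.
    mail_counts = Counter(t for email in mail_corpus for t in set(tokenize(email)))
    spam_counts = Counter(t for email in spam_corpus for t in set(tokenize(email)))
    n_mail = max(len(mail_corpus), 1)
    n_spam = max(len(spam_corpus), 1)

    table = {}
    for token in set(mail_counts) | set(spam_counts):
        spam_rate = spam_counts[token] / n_spam
        mail_rate = mail_counts[token] / n_mail
        probability = spam_rate / (spam_rate + mail_rate)
        table[token] = min(0.99, max(0.01, probability))
    return table


# Hypothetical toy corpora for one user.
user_mail = ["lunch meeting tomorrow", "project meeting notes"]
user_spam = ["free money now", "free offer tomorrow"]
user_table = build_probability_table(user_mail, user_spam)
```

Under these assumptions the same function could be called once with the user corpora and once with the general corpora to obtain the two tables.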
156 Citations
18 Claims
1. A system for filtering spam emails from a stream of emails, comprising:
A) a mail server;
B) a spam filter residing on the mail server; and
C) an internet interface for receiving emails from a sender and for transmitting emails to a client.
Dependent claims: 2, 3, 4.
5. A process for filtering spam emails from a stream of emails, comprising the steps of:
A) receiving a stream of emails into a mail server from a plurality of third parties;
B) filtering out the spam from the stream of emails with a spam filter; and
C) transmitting the stream of filtered emails to a plurality of clients.
Dependent claims: 6, 7, 8.
9. A process for filtering spam emails from a stream of emails, comprising the steps of:
A) building a general mail corpus, a general spam corpus, a user mail corpus and a user spam corpus;
B) building a general probability table based on the general mail corpus and the general spam corpus, wherein the general probability table includes a list of tokens and a corresponding list of probabilities that an email is a spam for containing the corresponding token;
C) building a user probability table based on the user mail corpus and the user spam corpus, wherein the user probability table includes a list of tokens and a corresponding list of probabilities that an email is a spam for containing the corresponding token;
D) receiving an email;
E) parsing the email into a plurality of tokens;
F) finding a token score for each token in the plurality of tokens, comprising the steps of:
   a) searching the user probability table for each token;
   b) if a token is not listed in the user probability table, searching the general probability table for that token; and
   c) if the token is also not listed in the general probability table, ignoring the token or setting the token to a nominal value;
G) determining an email score, wherein the email score approximates a probability the email is a spam for containing all or some subset of the tokens in the email;
H) if the email score is above a predetermined value, diverting the email; and
I) if the email score is not above the predetermined value, transmitting the email to a user.
Dependent claims: 10, 11, 12, 13.
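Steps E) through I) of claim 9 can be sketched in Python as follows. This is a rough illustration, not the claimed implementation: the claim does not say how token scores are combined into an email score, so the naive-Bayes style combination, the choice to use only the 15 tokens farthest from 0.5, the clamping, the threshold of 0.9, the tokenizer and all function names are assumptions.

```python
import math
import re


def tokenize(text):
    """Step E: parse an email body into tokens (assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


def token_score(token, user_table, general_table, nominal=None):
    """Step F: user table first, then general table, then a nominal value.

    Returning None means the token is ignored (one of the two options in F-c).
    """
    if token in user_table:
        return user_table[token]
    if token in general_table:
        return general_table[token]
    return nominal


def email_score(text, user_table, general_table, nominal=None, max_tokens=15):
    """Step G: combine token scores into one approximate spam probability."""
    scores = []
    for token in set(tokenize(text)):
        score = token_score(token, user_table, general_table, nominal)
        if score is not None:
            # Clamp so the logarithms below are always defined.
            scores.append(min(0.99, max(0.01, score)))
    if not scores:
        return 0.5  # no evidence either way (assumed neutral default)
    # Use only a subset: the tokens whose scores are farthest from neutral.
    scores.sort(key=lambda s: abs(s - 0.5), reverse=True)
    scores = scores[:max_tokens]
    # Assumed combination rule: prod(p) / (prod(p) + prod(1 - p)), in log space.
    log_spam = sum(math.log(s) for s in scores)
    log_mail = sum(math.log(1.0 - s) for s in scores)
    return 1.0 / (1.0 + math.exp(log_mail - log_spam))


def route(text, user_table, general_table, threshold=0.9):
    """Steps H and I: divert above the threshold, otherwise transmit."""
    score = email_score(text, user_table, general_table)
    return "divert" if score > threshold else "transmit"


# Hypothetical tables: the user table takes precedence for "free".
user_table = {"free": 0.99, "meeting": 0.01}
general_table = {"free": 0.5, "offer": 0.9}
```

Looking up the user table before the general table lets a user's own mail history override the general statistics for any token that appears in both.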
14. A process for filtering spam emails from a stream of emails, comprising the steps of:
A) receiving an incoming email from a stream of emails;
B) using a white list filter, wherein if a sender is on a white list, step B) further comprises the steps of:
   a) transmitting the email to a user; and
   b) skipping to step F);
C) using a black list filter, wherein if the sender is on a black list, step C) further comprises the steps of:
   a) diverting the email; and
   b) skipping to step F);
D) using a probability filter to determine an email score, wherein if the email score is greater than a predetermined value, step D) further comprises the steps of:
   a) diverting the email; and
   b) skipping to step F);
E) transmitting the email to the user; and
F) repeating steps A) through F) as long as there are more unprocessed emails in the stream of emails.
Dependent claims: 15, 16, 17, 18.
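The control flow of claim 14 can be sketched as a simple loop in which each filter either settles the fate of an email or passes it to the next filter. The sketch below is only illustrative: representing an email as a dictionary with `sender` and `body` keys, passing the probability filter in as a `score_fn` callable, the threshold of 0.9 and the toy scoring function are all assumptions, not details taken from the claim.

```python
def filter_stream(emails, white_list, black_list, score_fn, threshold=0.9):
    """Apply the white list, black list and probability filter in that order."""
    delivered, diverted = [], []
    for email in emails:  # steps A and F: take the next unprocessed email
        sender = email["sender"]
        if sender in white_list:  # step B: trusted sender, deliver at once
            delivered.append(email)
            continue
        if sender in black_list:  # step C: blocked sender, divert at once
            diverted.append(email)
            continue
        if score_fn(email["body"]) > threshold:  # step D: probability filter
            diverted.append(email)
            continue
        delivered.append(email)  # step E: nothing objected, deliver
    return delivered, diverted


def toy_score(body):
    """Hypothetical stand-in for a real probability filter."""
    return 0.95 if "free money" in body.lower() else 0.10


stream = [
    {"sender": "boss@example.com", "body": "Free money for the team lunch"},
    {"sender": "spammer@example.com", "body": "Hello"},
    {"sender": "stranger@example.com", "body": "FREE MONEY now"},
    {"sender": "friend@example.com", "body": "See you tomorrow"},
]
delivered, diverted = filter_stream(
    stream,
    white_list={"boss@example.com"},
    black_list={"spammer@example.com"},
    score_fn=toy_score,
)
```

Because the white list is checked first, the first message is delivered even though the toy scorer would have flagged it, and the black-listed sender is diverted without the probability filter being run at all.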
Specification