Email filtering system and method
Abstract
Systems and methods of the present invention allow filtering out spam and phishing email messages based on the links embedded in the email messages. In a preferred embodiment, an Email Filter extracts links from the email message and obtains desirability values for the links. The Email Filter may route the email message based on the desirability values. Such routing includes delivering the email message to a Recipient, delivering the message to a Quarantine Mailbox, or deleting the message.
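The extract-then-route flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the regular expression, the function names `extract_links` and `route_message`, the 0-to-1 desirability scale (higher meaning more desirable), and both threshold values are assumptions made for the example.

```python
import re
from enum import Enum


class Route(Enum):
    """The three routing outcomes named in the abstract."""
    DELIVER = "deliver to Recipient"
    QUARANTINE = "deliver to Quarantine Mailbox"
    DELETE = "delete"


# Hypothetical pattern: matches http/https URLs up to whitespace, quotes, or angle brackets.
LINK_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_links(body):
    """Return every http(s) link found in the message body, in order."""
    return LINK_RE.findall(body)


def route_message(desirability_values, quarantine_below=0.5, delete_below=0.1):
    """Route on the least desirable link (assumed scale: 0 = spam, 1 = desirable).

    The thresholds are illustrative assumptions, not values from the patent.
    """
    if not desirability_values:
        return Route.DELIVER  # no links, so nothing to judge in this sketch
    worst = min(desirability_values)
    if worst < delete_below:
        return Route.DELETE
    if worst < quarantine_below:
        return Route.QUARANTINE
    return Route.DELIVER
```

Routing on the worst link is one possible policy; averaging the values or weighting them by position in the message would fit the abstract equally well.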
29 Claims
1. A method, comprising the steps of:
a) building a general mail corpus, a general spam corpus, a user mail corpus and a user spam corpus,
b) building a general probability table based on said general mail corpus and said general spam corpus, wherein said general probability table comprises a list of tokens and corresponding probabilities of a token being a part of a spam email message,
c) building a user probability table based on said user mail corpus and said user spam corpus, wherein said user probability table comprises a list of tokens and corresponding probabilities of a token being a part of a spam email message,
d) receiving an email message,
e) extracting a link from said email message,
f) downloading a content of a resource referred by said link,
g) parsing said content into a plurality of tokens,
h) finding a token score for each token in said plurality of tokens, comprising the steps of:
   h1) searching said user probability table for each token,
   h2) if said token is not listed in said user probability table, searching said general probability table for said token, and
   h3) if said token is not listed in said general probability table, ignoring said token or setting said token to a nominal value,
i) determining a desirability value for said link, and
j) routing said email message based on said desirability value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
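Steps g) through i) of claim 1 can be illustrated with a short sketch. The lookup order (user table first, then general table, then ignore or nominal value) follows the claim. Everything else is an assumption for illustration: the tokenizer, the nominal value of 0.4, the clamping bounds, and in particular the way scores are combined. The claim does not say how the desirability value is derived, so the sketch swaps in the naive Bayes combination commonly used in Bayesian spam filters and takes desirability as one minus the combined spam probability.

```python
import math
import re

NOMINAL = 0.4  # hypothetical nominal value for unknown tokens; pass None to ignore them


def tokenize(content):
    """Assumed tokenizer: lowercase runs of letters, digits, $, ' and -."""
    return re.findall(r"[a-z0-9$'-]+", content.lower())


def token_score(token, user_table, general_table, nominal=NOMINAL):
    """Steps h1-h3: user table first, then general table, then nominal or None."""
    if token in user_table:
        return user_table[token]
    if token in general_table:
        return general_table[token]
    return nominal


def link_desirability(content, user_table, general_table, nominal=NOMINAL):
    """Assumed step i): 1 minus the naive Bayes combination of the token scores."""
    scores = []
    for token in tokenize(content):
        score = token_score(token, user_table, general_table, nominal)
        if score is not None:  # None means "ignore this token"
            scores.append(min(0.99, max(0.01, score)))
    if not scores:
        return 0.5  # neutral value when nothing could be scored (assumption)
    # Work in log space so long pages do not underflow to zero.
    log_spam = sum(math.log(p) for p in scores)
    log_ham = sum(math.log(1.0 - p) for p in scores)
    diff = min(log_ham - log_spam, 700.0)  # cap to keep math.exp from overflowing
    spam_probability = 1.0 / (1.0 + math.exp(diff))
    return 1.0 - spam_probability
```

Because the user table is consulted first, a token the user has personally seen in legitimate mail overrides whatever the general corpus says about it.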
10. A method, comprising the steps of:
a) building a general mail corpus, a general spam corpus, a user mail corpus and a user spam corpus,
b) building a general probability table based on said general mail corpus and said general spam corpus, wherein said general probability table comprises a list of tokens and corresponding probabilities of a token being a part of a spam email message,
c) building a user probability table based on said user mail corpus and said user spam corpus, wherein said user probability table comprises a list of tokens and corresponding probabilities of a token being a part of a spam email message,
d) a Sender transmitting an email message addressed to a Recipient,
e) an Email Filter receiving said email message,
f) said Email Filter extracting a link from said email message,
g) downloading a content of a resource referred by said link,
h) parsing said content into a plurality of tokens,
i) finding a token score for each token in said plurality of tokens, comprising the steps of:
   i1) searching said user probability table for each token,
   i2) if said token is not listed in said user probability table, searching said general probability table for said token, and
   i3) if said token is not listed in said general probability table, ignoring said token or setting said token to a nominal value,
j) a Link Characterization Means determining a desirability value for said link, and
k) said Email Filter routing said email message based on said desirability value.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
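Steps a) through c), shared by claims 1 and 10, build two probability tables from four corpora. The claims only require that each table map tokens to a probability of appearing in spam; they do not give the estimator. The sketch below assumes a simple one: the fraction of spam messages containing the token, divided by the sum of that fraction and the corresponding fraction for legitimate mail, clamped away from 0 and 1. The function name, tokenizer, and clamping bounds are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Assumed tokenizer; returns the set of distinct tokens in one message."""
    return set(re.findall(r"[a-z0-9$'-]+", text.lower()))


def build_probability_table(mail_corpus, spam_corpus):
    """Map each token to an estimated probability of being part of a spam message.

    mail_corpus and spam_corpus are lists of message texts.
    """
    # Count how many messages in each corpus contain each token.
    mail_counts = Counter(tok for msg in mail_corpus for tok in tokenize(msg))
    spam_counts = Counter(tok for msg in spam_corpus for tok in tokenize(msg))
    n_mail = max(len(mail_corpus), 1)
    n_spam = max(len(spam_corpus), 1)

    table = {}
    for token in mail_counts.keys() | spam_counts.keys():
        spam_rate = spam_counts[token] / n_spam
        mail_rate = mail_counts[token] / n_mail
        probability = spam_rate / (spam_rate + mail_rate)
        # Clamp so that no single token is treated as absolute proof either way.
        table[token] = min(0.99, max(0.01, probability))
    return table
```

The same function would serve both tables: called once with the general mail and general spam corpora, and once with the user's own.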
Specification