Apparatus and method to identify SPAM emails
First Claim
1. A method to identify SPAM emails, comprising:
providing a pseudo user device capable of communicating with a user device;
setting a misspelling rejection ratio;
receiving an email comprising (X) words by said pseudo user device;
determining a number (Y) of misspelled words comprising said email;
calculating a misspelling ratio by dividing (Y) by (X);
determining if said misspelling ratio is greater than or equal to said misspelling rejection ratio;
operative if said misspelling ratio is greater than or equal to said misspelling rejection ratio, reporting said email as SPAM;
operative if said misspelling ratio is not greater than or equal to said misspelling rejection ratio, providing said email by said pseudo user device to said user device;
performing a fuzzy word screen of said email;
wherein said performing a fuzzy word screen step comprises:
providing a library of prohibited words, an identity count, and a non-identity count, wherein said identity count and said non-identity count are initially set to 0;
setting a rejection identity count/non-identity count ratio;
selecting a (k)th word from said email, wherein (k) is greater than or equal to 1 and less than or equal to (X);
determining that the (k)th word comprises (A) characters;
retrieving from said library (M) prohibited words comprising (A) characters;
selecting the (j)th prohibited word, wherein (j) is greater than or equal to 1 and less than or equal to (M);
comparing, for each value of (i), the (i)th character of the (k)th email word with the (i)th character of the (j)th prohibited word, wherein (i) is greater than or equal to 1 and less than or equal to (A);
operative if the (i)th character of the (k)th email word is the same as the (i)th character of the (j)th prohibited word, incrementing said identity count;
operative if the (i)th character of the (k)th email word is not the same as the (i)th character of the (j)th prohibited word, incrementing said non-identity count;
calculating an actual identity count/non-identity count ratio;
determining if said actual identity count/non-identity count ratio is greater than or equal to said rejection identity count/non-identity count ratio;
operative if said actual identity count/non-identity count ratio is greater than or equal to said rejection identity count/non-identity count ratio, reporting said email as SPAM.
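The fuzzy word screen recited above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation. The names `identity_ratio` and `fuzzy_word_screen` are hypothetical. The claim does not say whether the counts are reset for each word pair or how a non-identity count of 0 is handled; the sketch assumes the counts are reset for each (k, j) pair and treats an exact match as an infinite ratio.

```python
import math


def identity_ratio(word: str, prohibited: str) -> float:
    """Identity count / non-identity count for two words of equal length (A)."""
    if len(word) != len(prohibited):
        raise ValueError("both words must comprise the same number of characters")
    # Compare the (i)th character of each word for every position i.
    identity = sum(a == b for a, b in zip(word, prohibited))
    non_identity = len(word) - identity
    # Assumption: with no mismatches the ratio is treated as infinite.
    return math.inf if non_identity == 0 else identity / non_identity


def fuzzy_word_screen(email_words, library, rejection_ratio: float) -> bool:
    """Return True if any email word is close enough to a prohibited word."""
    for word in email_words:  # the (k)th word
        # Retrieve the (M) prohibited words that also comprise (A) characters.
        candidates = [p for p in library if len(p) == len(word)]
        for prohibited in candidates:  # the (j)th prohibited word
            if identity_ratio(word, prohibited) >= rejection_ratio:
                return True  # report the email as SPAM
    return False
```

For example, with a rejection ratio of 3.0, the word "l0ttery" matches the prohibited word "lottery" in 6 of 7 positions, giving a ratio of 6/1, so the email would be reported as SPAM under these assumptions.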
Abstract
A method and apparatus to identify SPAM emails are disclosed. The method sets a misspelling rejection ratio. Upon receipt of an email comprising (X) words, the method determines the number (Y) of misspelled words in that email. The method then calculates a misspelling ratio by dividing (Y) by (X) and determines whether the misspelling ratio is greater than or equal to the misspelling rejection ratio. If it is, the method reports the email as SPAM. In certain embodiments, the words that trigger rejection as SPAM are detected by a fuzzy search over alternate spellings. These alternate spellings may come from a spell checker.
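The misspelling screen described in the abstract might look like the following minimal Python sketch. The function names, the use of a plain set of known words as the spelling reference, and the whitespace-and-punctuation tokenization are all assumptions made for illustration; the source does not specify how misspelled words are detected.

```python
import string


def misspelling_ratio(text: str, dictionary: set) -> float:
    """Return Y / X, where X is the word count and Y the misspelled-word count."""
    # Assumption: words are whitespace-separated, with surrounding punctuation dropped.
    tokens = (w.strip(string.punctuation).lower() for w in text.split())
    words = [t for t in tokens if t]
    if not words:
        return 0.0  # assumption: an email with no words has a ratio of 0
    misspelled = sum(1 for w in words if w not in dictionary)
    return misspelled / len(words)


def is_spam_by_misspelling(text: str, dictionary: set, rejection_ratio: float) -> bool:
    """Report SPAM when the misspelling ratio meets or exceeds the rejection ratio."""
    return misspelling_ratio(text, dictionary) >= rejection_ratio
```

With a dictionary of {"win", "a", "free", "prize", "now"}, the text "Win a fre prize nwo!" has X = 5 and Y = 2, so the ratio is 0.4 and the email is reported as SPAM at a rejection ratio of 0.4 but not at 0.5.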
8 Claims
1. A method to identify SPAM emails, comprising the steps recited in full under First Claim above. Dependent claims: 2, 3, 4.
5. A computer program encoded in an information storage medium and usable with a programmable computer processor to identify SPAM emails, comprising:
a library of prohibited words;
computer readable program code which causes said programmable computer processor to retrieve a pre-determined misspelling rejection ratio;
computer readable program code which causes said programmable computer processor to receive an email comprising (X) words;
computer readable program code which causes said programmable computer processor to determine a number (Y) of misspelled words comprising said email;
computer readable program code which causes said programmable computer processor to calculate a misspelling ratio by dividing (Y) by (X);
computer readable program code which causes said programmable computer processor to determine if said misspelling ratio is greater than or equal to said misspelling rejection ratio;
computer readable program code which, if said misspelling ratio is greater than or equal to said misspelling rejection ratio, causes said programmable computer processor to report said email as SPAM;
computer readable program code which, if said misspelling ratio is not greater than or equal to said misspelling rejection ratio, causes said programmable computer processor to provide said email to a user device;
computer readable program code which causes said programmable computer processor to maintain an identity count and a non-identity count, wherein said identity count and said non-identity count are initially set to 0;
computer readable program code which causes said programmable computer processor to retrieve a pre-determined rejection identity count/non-identity count ratio;
computer readable program code which causes said programmable computer processor to select the (k)th word from said email, wherein said (k)th word comprises (A) characters, and wherein (k) is greater than or equal to 1 and less than or equal to (X);
computer readable program code which causes said programmable computer processor to retrieve from said library (M) prohibited words comprising (A) characters;
computer readable program code which causes said programmable computer processor to select the (j)th prohibited word, wherein (j) is greater than or equal to 1 and less than or equal to (M);
computer readable program code which causes said programmable computer processor to compare, for each value of (i), the (i)th character of the (k)th email word with the (i)th character of the (j)th prohibited word, wherein (i) is greater than or equal to 1 and less than or equal to (A);
computer readable program code which, if the (i)th character of the (k)th email word is the same as the (i)th character of the (j)th prohibited word, causes said programmable computer processor to increment said identity count;
computer readable program code which, if the (i)th character of the (k)th email word is not the same as the (i)th character of the (j)th prohibited word, causes said programmable computer processor to increment said non-identity count;
computer readable program code which causes said programmable computer processor to calculate an actual identity count/non-identity count ratio;
computer readable program code which causes said programmable computer processor to determine if said actual identity count/non-identity count ratio is greater than or equal to said rejection identity count/non-identity count ratio;
computer readable program code which, if said actual identity count/non-identity count ratio is greater than or equal to said rejection identity count/non-identity count ratio, causes said programmable computer processor to report said email as SPAM.
Dependent claims: 6, 7, 8.
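Claim 5 retrieves from the library only those (M) prohibited words that have the same number of characters (A) as the email word under test. One way this retrieval could be organized is sketched below in Python. Grouping the library by word length ahead of time is an assumption of this sketch, not something the claim requires, and the names `build_length_index` and `candidates_for` are hypothetical.

```python
from collections import defaultdict


def build_length_index(prohibited_words):
    """Group prohibited words by character count so lookups by (A) are direct."""
    index = defaultdict(list)
    for word in prohibited_words:
        index[len(word)].append(word)
    return dict(index)


def candidates_for(email_word: str, index: dict) -> list:
    """Return the (M) prohibited words comprising the same (A) characters."""
    return index.get(len(email_word), [])
```

With a library of "lottery", "casino", and "prize", the email word "cas1no" has 6 characters, so only "casino" is retrieved for the character-by-character comparison.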
Specification