Method and system for detecting undesired email containing image-based messages
Abstract
A system and method of detecting undesired email containing image-based messages employs a statistical analysis process which identifies and assigns probability values to the presence of each of a pre-selected set of text-related characteristics of an email under consideration and to the presence of each of a pre-selected set of image-related characteristics of the email under consideration. The identified characteristics and their associated probability values are used to determine whether the email is undesired. In one embodiment, the identification and assignment of probability values is a Bayesian analysis and, preferably, a Statistical Token Analysis. The system and method can identify undesired emails which contain images having messages, generally in the form of text in the image.
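The abstract describes assigning a probability value to each identified characteristic and combining them in a Bayesian analysis. A minimal sketch of one common way to do this follows. It assumes the naive-Bayes combining rule P = prod(p) / (prod(p) + prod(1 - p)), which the text above does not specify; the token names, the clamping bounds, and the 0.9 threshold are likewise hypothetical.

```python
import math


def combine_token_probabilities(token_probs):
    """Combine per-token probabilities into one probability that the email is undesired.

    Assumed rule (not taken from the patent text): naive-Bayes combination,
    P = prod(p) / (prod(p) + prod(1 - p)), evaluated in log space.
    """
    log_spam = 0.0
    log_ham = 0.0
    for p in token_probs.values():
        # Clamp so that a single token can never force the result to exactly 0 or 1.
        p = min(max(p, 0.01), 0.99)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    diff = log_spam - log_ham
    # Numerically stable logistic function of the log-odds.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def is_undesired(token_probs, threshold=0.9):
    """Decide whether the email is undesired; the threshold is a hypothetical choice."""
    return combine_token_probabilities(token_probs) >= threshold


# Hypothetical tokens mixing text-related and image-related characteristics.
tokens = {
    "text:word=viagra": 0.95,
    "text:word=meeting": 0.10,
    "image:text_like_regions=high": 0.90,
    "image:format=gif": 0.70,
}
score = combine_token_probabilities(tokens)
print(round(score, 3), is_undesired(tokens))
```

Working in log space avoids underflow when an email yields many tokens, and an email with no tokens comes out at a neutral 0.5 under this rule.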
24 Citations
15 Claims
1. A computer-implemented method for detecting undesired email containing image-based messages, the method comprising the steps of:
(i) issuing, at an email security device, appropriate tokens and token values for each of a pre-selected set of text-related characteristics of an email under consideration;
(ii) issuing, at said email security device, appropriate tokens and token values for each of a pre-selected set of image-related characteristics of an image in the email under consideration, at least one token associated with an image-related characteristic being dependent upon a result of image processing analysis on data comprising the image;
(iii) performing, at said email security device, a statistical analysis of the tokens associated with text-related and image-related characteristics issued for the email under consideration and determining a probability that the email is undesired; and
(iv) employing, at said email security device, the determined probability to decide whether the email is undesired.
2. The method of claim 1 wherein step (ii) further comprises the step of, if the email under consideration includes an image, issuing an appropriate token and token value for each of a pre-selected set of email structure characteristics of the email under consideration.
3. The method of claim 1 wherein the issuance of at least one token in step (ii) is dependent upon the result of processing the image with a convolutional filter.
4. The method of claim 1 wherein the statistical analysis of step (iii) is performed by a first STA process for the tokens issued in step (i) and is performed by a second STA process for the tokens issued in step (ii) and the determined probability employs an output from both the first STA process and the second STA process.
5. A computer-implemented method for detecting undesired email containing image-based messages, the method comprising the steps of:
(i) determining, at an email security device, the presence of each of a pre-selected set of text-related characteristics of an email under consideration and assigning a probability value to each determined characteristic;
(ii) determining the presence of each of a pre-selected set of image-related characteristics of the email under consideration and assigning a probability value to each determined characteristic, at least one image-related characteristic being dependent upon a result of image processing analysis on data comprising the image;
(iii) performing, at said email security device, a statistical analysis of the email under consideration using the determined text-related and image-related characteristics and their associated probability values to determine a probability that the email is undesired; and
(iv) employing, at said email security device, the determined probability to decide whether the email is undesired.
6. The method of claim 5 wherein a Bayesian probability analysis is employed in step (iii).
7. The method of claim 6 wherein the Bayesian probability analysis is a Statistical Token Analysis and the determining of the presence of characteristics and assigning of probability values thereto comprises the issuance of appropriate tokens with corresponding values.
8. The method of claim 5 wherein step (ii) further comprises the step of, if the email under consideration includes an image, determining the presence of each of a pre-selected set of email structure characteristics of the email under consideration and assigning a probability value to each determined email structure characteristic.
9. The method of claim 5 wherein at least one of the pre-selected set of image-related characteristics is dependent upon the result of processing the image with a convolutional filter.
10. The method of claim 7 wherein the statistical analysis of step (iii) is performed by a first STA process for the tokens issued in step (i) and is performed by a second STA process for the tokens issued in step (ii) and the determined probability employs an output from both the first STA process and the second STA process.
11. A system for detecting undesired email containing image-based messages, the system comprising:
at least one incoming email server operable to receive emails;
an email security device connected to a network, the email security device being operable to analyze emails received over the network and to forward received emails to said at least one incoming email server; and
an undesired email detection process cooperating with the email security device and operable to:
determine the presence of each of a pre-selected set of text-related characteristics of an email under consideration and assign a probability value to each determined characteristic;
determine the presence of each of a pre-selected set of image-related characteristics of the email under consideration and assign a probability value to each determined characteristic, at least one image-related characteristic being dependent upon a result of image processing analysis on data comprising the image;
perform a statistical analysis of the email under consideration using the determined text-related and image-related characteristics and their associated probability values to determine a probability that the email is undesired; and
wherein the email security device employs the determined probability to decide whether the email is undesired.
12. The system of claim 11 where, when the email security device has decided an email is undesired, it forwards the undesired email to the incoming email server with an indication that the email is undesired.
13. The system of claim 11 wherein the statistical analysis is a Bayesian analysis.
14. The system of claim 13 wherein the Bayesian probability analysis is a Statistical Token Analysis (STA) and the determining of the presence of characteristics and assigning of probability values thereto comprises the issuance of appropriate tokens with corresponding values.
15. The system of claim 14 wherein a first STA process is performed for the pre-selected set of text-related characteristics and a second STA process is performed for the pre-selected set of image-related characteristics and wherein the email security device employs the determined probabilities from each of the first STA process and the second STA process to decide whether the email is undesired.
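Claims 3 and 9 make an image-related token depend on the result of a convolutional filter, and claims 4, 10 and 15 combine the outputs of a first (text) and a second (image) STA process. The sketch below illustrates both ideas under stated assumptions: the claims name neither the kernel nor the combining rule, so the 3x3 Laplacian kernel, the edge-density buckets, the token names, and the weighted average are all hypothetical choices, not the patented method.

```python
# Hypothetical kernel: the claims only say "a convolutional filter".
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def convolve3x3(image, kernel):
    """Apply a 3x3 kernel to the interior pixels of a grayscale image (list of rows).

    The kernel is applied without flipping; for a symmetric kernel such as
    the Laplacian this gives the same result as a true convolution.
    """
    h = len(image)
    w = len(image[0]) if h else 0
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
            row.append(acc)
        out.append(row)
    return out


def edge_density(image, threshold=128):
    """Fraction of interior pixels whose filter response magnitude exceeds threshold."""
    values = [abs(v) for row in convolve3x3(image, LAPLACIAN) for v in row]
    if not values:
        return 0.0
    return sum(v > threshold for v in values) / len(values)


def image_token(image):
    """Issue a token whose value depends on the filter result (bucket limits are assumed)."""
    d = edge_density(image)
    if d >= 0.25:
        bucket = "high"
    elif d >= 0.05:
        bucket = "medium"
    else:
        bucket = "low"
    return "image:edge_density=" + bucket


def combine_sta_outputs(p_text, p_image, image_weight=0.5):
    """Blend the text and image STA outputs; a weighted average is one assumed option."""
    return (1.0 - image_weight) * p_text + image_weight * p_image


# A flat image has no edges; sharp vertical stripes respond strongly everywhere.
flat = [[200] * 5 for _ in range(5)]
striped = [[0 if x % 2 == 0 else 255 for x in range(5)] for _ in range(5)]
print(image_token(flat), image_token(striped))
print(combine_sta_outputs(0.2, 0.9))
```

Text rendered inside an image tends to produce many sharp edges, which is why an edge-sensitive filter is a plausible source for such a token. Keeping the two STA processes separate lets each be trained on its own token population before the outputs are merged.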
Specification