IMAGE SPAM FILTERING BASED ON SENDERS' INTENTION ANALYSIS
Abstract
Systems and methods for an anti-spam detection module that can detect image spam are provided. According to one embodiment, an image spam detection process involves determining and measuring various characteristics of images that may be embedded within or otherwise associated with an electronic mail (email) message. An approximate display location of the embedded images is determined. The existence of one or more abnormal factors associated with the embedded images is identified. A quantity of text included in the one or more embedded images is determined and measured by analyzing one or more blocks of binarized representations of the one or more embedded images. Finally, the likelihood that the email message is spam is determined based on one or more of the approximate display location, the existence of one or more abnormal factors and the quantity and location of text measured.
18 Claims
1. A method comprising:
converting an embedded image of an electronic mail (email) message to a binarized representation by performing thresholding on a grayscale representation of the embedded image;
determining and measuring a quantity of text included in the embedded image by analyzing one or more blocks of the binarized representation; and
classifying the email message as spam or clean based at least in part on the quantity of text measured.
Dependent claims: 2, 3, 4, 5
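The three steps of claim 1 (thresholding, block-wise text measurement, classification) can be illustrated with a minimal sketch. It is not the patented implementation: the fixed threshold of 128, the 8-pixel block size, the density and transition heuristic for deciding that a block looks like text, and the 0.3 spam ratio are all assumptions chosen for illustration, as are the function names.

```python
from typing import List

Gray = List[List[int]]    # rows of grayscale intensities, 0 (black) to 255 (white)
Binary = List[List[int]]  # rows of 0/1 values


def binarize(gray: Gray, threshold: int = 128) -> Binary:
    """Threshold a grayscale image: 1 = dark (ink), 0 = light (background).
    The fixed threshold is an assumption; the claim only requires thresholding."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def block_is_textlike(block: Binary, min_density: float = 0.05,
                      max_density: float = 0.6, min_transitions: int = 2) -> bool:
    """Hypothetical heuristic: text tends to have moderate ink density and
    several light/dark transitions along each row."""
    pixels = [px for row in block for px in row]
    if not pixels:
        return False
    density = sum(pixels) / len(pixels)
    transitions = sum(1 for row in block for a, b in zip(row, row[1:]) if a != b)
    return (min_density <= density <= max_density
            and transitions >= min_transitions * len(block))


def text_quantity(binary: Binary, block_size: int = 8) -> float:
    """Measure text as the fraction of blocks judged text-like."""
    height = len(binary)
    width = len(binary[0]) if binary else 0
    total = textlike = 0
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in binary[top:top + block_size]]
            total += 1
            textlike += block_is_textlike(block)
    return textlike / total if total else 0.0


def classify(gray: Gray, spam_ratio: float = 0.3) -> str:
    """Classify as spam when the measured text quantity reaches spam_ratio."""
    return "spam" if text_quantity(binarize(gray)) >= spam_ratio else "clean"
```

In this sketch a blank image yields no text-like blocks, while an image with dense alternating dark and light columns registers as text in every block. A real detector would need a far more careful test for text than this heuristic.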
6. A method comprising:
determining an approximate display location of one or more embedded images within an electronic mail (email) message;
identifying existence of one or more abnormal factors associated with the one or more embedded images;
determining and measuring a quantity of text included in the one or more embedded images by analyzing one or more blocks of binarized representations of the one or more embedded images; and
classifying the email message as spam or clean based on the approximate display location, the existence of one or more abnormal factors and the quantity of text measured.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
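Claim 6 classifies on three signals together. One simple way to combine them is a weighted score, sketched below. The claim does not say how the signals are combined, so the weights, the 0.5 threshold, the reduction of display location to a `near_top` flag, and every name here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImageFeatures:
    near_top: bool              # assumed proxy for the approximate display location
    abnormal_factor_count: int  # number of abnormal factors identified
    text_ratio: float           # measured quantity of text, expected in 0.0..1.0


def spam_score(f: ImageFeatures, w_location: float = 0.3,
               w_abnormal: float = 0.3, w_text: float = 0.4) -> float:
    """Combine the three signals into one score in 0.0..1.0 (weights are assumptions)."""
    location = 1.0 if f.near_top else 0.0
    abnormal = 1.0 if f.abnormal_factor_count > 0 else 0.0
    text = min(max(f.text_ratio, 0.0), 1.0)  # clamp out-of-range measurements
    return w_location * location + w_abnormal * abnormal + w_text * text


def classify_email(f: ImageFeatures, threshold: float = 0.5) -> str:
    """Label the message spam when the combined score reaches the threshold."""
    return "spam" if spam_score(f) >= threshold else "clean"
```

With these assumed weights no single signal can push a message over the threshold on its own, so a text-heavy image that is displayed normally and shows no abnormal factors is still classified as clean.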
Specification