Detecting image spam
First Claim
1. A computer implemented method comprising:
- receiving an incoming communication associated with a particular entity;
- identifying that the communication contains one or more images;
- determining whether one or more of the images includes a graphic encoding of a textual spam message, wherein the determining includes, for each of the one or more images;
  - normalizing the image to generate a normalized image, wherein the normalized image is a representation of the image with at least some noise removed from the image and the normalized image comprises overlapping sub-regions having image data;
  - determining whether the image includes graphical encoding of text corresponding to text in known spam;
  - determining whether aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam;
  - generating fingerprints of the image data from the overlapping sub-regions of the normalized image, wherein the fingerprints specify attributes of the normalized image;
  - comparing the fingerprints from the normalized image with fingerprints of known spam images, wherein the comparison comprises for at least one of the fingerprints;
    - determining a first measure that the at least one of the fingerprints is similar to at least one fingerprint of a known spam image;
    - determining a second measure that the at least one of the fingerprints is similar to at least one fingerprint of a known non-spam image; and
  - classifying the image as a spam image or a non-spam image based at least in part on the determination of whether the image includes graphical encoding of text corresponding to text in known spam, the determination of whether the aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam, and the comparison of the fingerprints; and
- updating a reputation score for the particular entity based at least in part upon a result of the classification, wherein the reputation score for the particular entity is further based on a strength of relationship between the particular entity and a first entity having a reputable reputation score and the strength of relationship is based on similarities between content in messages sent by the particular entity and content of messages sent by the first entity.
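The normalizing, fingerprinting, and comparing steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how noise is removed or how a fingerprint is computed, so thresholding to black and white and hashing each tile with SHA-256 are assumptions, as are the function names, the tile size, and the stride. Images are modeled as lists of rows of 0-255 grayscale values.

```python
import hashlib

def normalize(image, threshold=128):
    """Binarize a grayscale image; an assumed stand-in for noise removal."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def sub_regions(image, size=4, stride=2):
    """Yield size x size tiles; a stride smaller than size makes tiles overlap."""
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield [row[left:left + size] for row in image[top:top + size]]

def fingerprint(tile):
    """Hash one tile of a normalized image (hypothetical fingerprint function)."""
    flat = bytes(px for row in tile for px in row)
    return hashlib.sha256(flat).hexdigest()[:16]

def fingerprints(image, size=4, stride=2):
    """Fingerprint every overlapping sub-region of the normalized image."""
    return {fingerprint(t) for t in sub_regions(normalize(image), size, stride)}

def similarity(candidate, known):
    """Fraction of candidate fingerprints that also occur in the known set."""
    return len(candidate & known) / len(candidate) if candidate else 0.0

def classify_by_fingerprints(candidate, spam_prints, ham_prints):
    """Return a label plus the first (spam) and second (non-spam) measures."""
    first = similarity(candidate, spam_prints)
    second = similarity(candidate, ham_prints)
    return ("spam" if first > second else "non-spam"), first, second

# A known spam image, a lightly perturbed copy of it, and an unrelated image.
spam_image = [[255 if (x + y) % 3 == 0 else 0 for x in range(8)] for y in range(8)]
noisy_copy = [[10 if px == 0 else 245 for px in row] for row in spam_image]
ham_image = [[255 if x < 4 else 0 for x in range(8)] for y in range(8)]

label, first, second = classify_by_fingerprints(
    fingerprints(noisy_copy), fingerprints(spam_image), fingerprints(ham_image))
print(label, first, second)
```

Because the perturbed copy normalizes to the same black-and-white image as the original, its tiles hash to the same fingerprints, which is the property the normalization step is there to provide.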
11 Assignments
0 Petitions
Abstract
Methods and systems for operation upon one or more data processors for detecting image spam by detecting an image and analyzing the content of the image to determine whether the incoming communication comprises an unwanted communication.
699 Citations
40 Claims
1. A computer implemented method comprising:
- receiving an incoming communication associated with a particular entity;
- identifying that the communication contains one or more images;
- determining whether one or more of the images includes a graphic encoding of a textual spam message, wherein the determining includes, for each of the one or more images;
  - normalizing the image to generate a normalized image, wherein the normalized image is a representation of the image with at least some noise removed from the image and the normalized image comprises overlapping sub-regions having image data;
  - determining whether the image includes graphical encoding of text corresponding to text in known spam;
  - determining whether aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam;
  - generating fingerprints of the image data from the overlapping sub-regions of the normalized image, wherein the fingerprints specify attributes of the normalized image;
  - comparing the fingerprints from the normalized image with fingerprints of known spam images, wherein the comparison comprises for at least one of the fingerprints;
    - determining a first measure that the at least one of the fingerprints is similar to at least one fingerprint of a known spam image;
    - determining a second measure that the at least one of the fingerprints is similar to at least one fingerprint of a known non-spam image; and
  - classifying the image as a spam image or a non-spam image based at least in part on the determination of whether the image includes graphical encoding of text corresponding to text in known spam, the determination of whether the aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam, and the comparison of the fingerprints; and
- updating a reputation score for the particular entity based at least in part upon a result of the classification, wherein the reputation score for the particular entity is further based on a strength of relationship between the particular entity and a first entity having a reputable reputation score and the strength of relationship is based on similarities between content in messages sent by the particular entity and content of messages sent by the first entity.
Dependent claims: 2-24
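The classifying step of claim 1 draws on three determinations: whether the image's text matches known spam text, whether its aspect ratio matches a known spam aspect ratio, and how its fingerprints compare with known spam and non-spam fingerprints. The Python sketch below shows one way such signals might be combined. The claim does not specify a combination rule, so the weights, the tolerance, the sample ratios and phrases, and every name are hypothetical. The text is assumed to have been extracted from the image by some upstream step that is not shown.

```python
from dataclasses import dataclass

# Hypothetical reference data; real deployments would learn these from known spam.
KNOWN_SPAM_RATIOS = (4 / 3, 16 / 9)
KNOWN_SPAM_PHRASES = ("buy now", "hot stock")

def aspect_ratio_matches(width, height, known=KNOWN_SPAM_RATIOS, tolerance=0.02):
    """True if width/height is within a relative tolerance of a known ratio."""
    ratio = width / height
    return any(abs(ratio - k) <= tolerance * k for k in known)

def text_matches_spam(extracted_text, phrases=KNOWN_SPAM_PHRASES):
    """True if the (already extracted) image text contains a known spam phrase."""
    lowered = " ".join(extracted_text.lower().split())  # collapse whitespace
    return any(phrase in lowered for phrase in phrases)

@dataclass
class Signals:
    text_match: bool        # text corresponds to text in known spam
    ratio_match: bool       # aspect ratio corresponds to known spam
    spam_similarity: float  # first measure, in [0, 1]
    ham_similarity: float   # second measure, in [0, 1]

def classify(signals, threshold=0.5):
    """Weighted vote over the three determinations (weights are assumptions)."""
    fingerprint_evidence = max(0.0, signals.spam_similarity - signals.ham_similarity)
    score = (0.4 * signals.text_match
             + 0.2 * signals.ratio_match
             + 0.4 * fingerprint_evidence)
    return "spam" if score >= threshold else "non-spam"

signals = Signals(
    text_match=text_matches_spam("BUY   NOW while supplies last"),
    ratio_match=aspect_ratio_matches(640, 480),
    spam_similarity=0.9,
    ham_similarity=0.1,
)
print(classify(signals))
```

Subtracting the second measure from the first keeps an image that resembles known non-spam as closely as it resembles known spam from scoring on fingerprints alone.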
25. An image spam detection system, comprising:
- at least one processor;
- at least one memory element;
- a communications interface operable to receive a communication via a network, wherein the communication is associated with a particular entity;
- a detector operable, when executed by the at least one processor, to identify whether the communication comprises an image;
- an analyzer operable, when executed by the at least one processor, to, in response to the identification that the communication comprises an image, determine whether the image corresponds to a graphic encoding of a textual spam message, wherein the determining includes;
  - normalizing the image to generate a normalized image, wherein the normalized image is a representation of the image with at least some noise removed from the image and the normalized image comprises overlapping sub-regions having image data;
  - determining whether the image includes graphical encoding of text corresponding to text in known spam;
  - determining whether aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam;
  - generating fingerprints of the image data from the overlapping sub-regions of the normalized image, wherein the fingerprints specify attributes of the normalized image;
  - comparing the fingerprints from the normalized image with fingerprints of known spam images, wherein the comparison comprises for at least one of the fingerprints;
    - determining a first measure that the at least one of the fingerprints is similar to at least one fingerprint of a known spam image;
    - determining a second measure that the at least one of the fingerprints is similar to at least one fingerprint of a known non-spam image; and
  - classifying the image as a spam image or a non-spam image based at least in part on the determination of whether the image includes graphical encoding of text corresponding to text in known spam, the determination of whether the aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam, and the comparison of the fingerprints; and
- a communications control engine operable to determine an action to perform with respect to the communication based at least in part upon results of the classification performed by the analyzer, wherein the action comprises updating a reputation score for the particular entity based at least in part upon a result of the classification, the reputation score for the particular entity is further based on a strength of relationship between the particular entity and a first entity having a reputable reputation score and the strength of relationship is based on similarities between content in messages sent by the particular entity and content of messages sent by the first entity.
Dependent claims: 26-39
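The division of labor among the detector, analyzer, and communications control engine in claim 25 might look like the Python sketch below. It uses the standard library `email` package to find image parts in a message. The analyzer and the reputation update are passed in as caller-supplied callables, and the "quarantine" and "deliver" outcomes are hypothetical, since the claim only requires that the action include a reputation update.

```python
from email.message import EmailMessage

def extract_images(message):
    """Detector: return the decoded bytes of every image part in the message."""
    return [part.get_payload(decode=True)
            for part in message.walk()
            if part.get_content_maintype() == "image"]

def handle(message, analyze, update_reputation):
    """Control engine: classify each image, report the result, choose an action.

    analyze(image_bytes) -> "spam" or "non-spam"      (hypothetical analyzer)
    update_reputation(sender, is_spam) -> None        (hypothetical callback)
    """
    sender = str(message["From"])
    labels = [analyze(data) for data in extract_images(message)]
    is_spam = "spam" in labels
    if labels:  # only messages that carried an image affect the sender's score
        update_reputation(sender, is_spam)
    return "quarantine" if is_spam else "deliver"

# Build a message with one image attachment and run it through the pipeline.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg.set_content("see attached")
msg.add_attachment(b"fake image bytes", maintype="image", subtype="png")

events = []
action = handle(
    msg,
    analyze=lambda data: "spam",  # stand-in analyzer that flags everything
    update_reputation=lambda sender, spam: events.append((sender, spam)),
)
print(action, events)
```

Passing the analyzer and the reputation update in as parameters keeps the three roles separable, mirroring how the claim lists them as distinct elements.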
40. At least one machine accessible storage medium having instructions stored thereon, the instructions when executed on a machine, cause the machine to:
- identify a communication that contains one or more images and is associated with a particular entity;
- for each of the one or more images;
  - normalize the image to generate a normalized image, wherein the normalized image is a representation of the image with at least some noise removed from the image and the normalized image comprises overlapping sub-regions having image data;
  - determine whether the image includes graphical encoding of text corresponding to text in known spam;
  - determine whether aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam;
  - generate fingerprints of the image data from the overlapping sub-regions of the normalized image, wherein the fingerprints specify attributes of the normalized image;
  - compare the fingerprints from the normalized image with fingerprints of known spam images, wherein the comparison comprises for at least one of the fingerprints;
    - determining a first measure that the at least one of the fingerprints is similar to at least one fingerprint of a known spam image;
    - determining a second measure that the at least one of the fingerprints is similar to at least one fingerprint of a known non-spam image; and
  - classify the image as a spam image or a non-spam image based at least in part on the determination of whether the image includes graphical encoding of text corresponding to text in known spam, the determination of whether the aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam, and the comparison of the fingerprints; and
- perform an action on the communication based at least in part upon a result of the classifications, wherein the action comprises updating a reputation score for the particular entity based at least in part upon a result of the classification, wherein the reputation score for the particular entity is further based on a strength of relationship between the particular entity and a strength of relationship between the particular entity and a first entity having a reputable reputation score and the strength of relationship is based on similarities between content in messages sent by the particular entity and content of messages sent by the first entity.
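The reputation update recited in the independent claims depends on two inputs: the classification result, and the strength of relationship between the sender and an entity that already has a reputable score, where that strength reflects similarity between the content of their messages. The Python sketch below is one possible reading. The claims give no formula, so the Jaccard word overlap used as the similarity measure, the [0, 1] score range, the step size, and the blending rule are all assumptions.

```python
def content_similarity(messages_a, messages_b):
    """Jaccard overlap of the word sets of two message collections (assumed metric)."""
    words_a = {word for msg in messages_a for word in msg.lower().split()}
    words_b = {word for msg in messages_b for word in msg.lower().split()}
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def updated_reputation(current, image_is_spam, strength, reputable_score, step=0.25):
    """Return a new score in [0, 1]; higher means more reputable.

    current          sender's score before this message
    image_is_spam    result of the image classification
    strength         relationship strength in [0, 1], e.g. content_similarity(...)
    reputable_score  score of the reputable first entity
    """
    adjusted = current - step if image_is_spam else current + step
    adjusted = min(1.0, max(0.0, adjusted))
    # Pull the score toward the reputable entity in proportion to the strength.
    return (1 - strength) * adjusted + strength * reputable_score

strength = content_similarity(
    ["quarterly report attached"],  # messages from the particular entity
    ["report attached"],            # messages from the reputable first entity
)
print(strength, updated_reputation(0.5, True, strength, 1.0))
```

With a strength of zero the score moves only on the classification result, and as the strength approaches one the sender's score converges on that of the reputable entity.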
Specification