
Detecting image spam

  • US 8,763,114 B2
  • Filed: 01/24/2007
  • Issued: 06/24/2014
  • Est. Priority Date: 01/24/2007
  • Status: Active Grant
First Claim

1. A computer implemented method comprising:

  • receiving an incoming communication associated with a particular entity;

    identifying that the communication contains one or more images;

    determining whether one or more of the images includes a graphic encoding of a textual spam message, wherein the determining includes, for each of the one or more images:

    normalizing the image to generate a normalized image, wherein the normalized image is a representation of the image with at least some noise removed from the image and the normalized image comprises overlapping sub-regions having image data;

    determining whether the image includes graphical encoding of text corresponding to text in known spam;

    determining whether aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam;

    generating fingerprints of the image data from the overlapping sub-regions of the normalized image, wherein the fingerprints specify attributes of the normalized image;

    comparing the fingerprints from the normalized image with fingerprints of known spam images, wherein the comparison comprises for at least one of the fingerprints:

    determining a first measure that the at least one of the fingerprints is similar to at least one fingerprint of a known spam image;

    determining a second measure that the at least one of the fingerprints is similar to at least one fingerprint of a known non-spam image; and

    classifying the image as a spam image or a non-spam image based at least in part on the determination of whether the image includes graphical encoding of text corresponding to text in known spam, the determination of whether the aspect ratio of the image corresponds to a known aspect ratio corresponding to known spam, and the comparison of the fingerprints; and

    updating a reputation score for the particular entity based at least in part upon a result of the classification, wherein the reputation score for the particular entity is further based on a strength of relationship between the particular entity and a first entity having a reputable reputation score and the strength of relationship is based on similarities between content in messages sent by the particular entity and content of messages sent by the first entity.
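
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the binarizing threshold, the window size and stride, the SHA-1 hashing of sub-regions, the scoring weights, and every function name below are assumptions chosen for illustration. The text-matching step is reduced to a boolean supplied by the caller, and the relationship-strength component of the reputation score is omitted.

```python
import hashlib

def normalize(image, threshold=128):
    """Binarize a grayscale image (rows of 0-255 ints) to suppress faint noise (assumed method)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def fingerprints(norm, win=4, stride=2):
    """Hash each win x win window; stride < win makes neighbouring windows overlap."""
    height, width = len(norm), len(norm[0])
    prints = set()
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            block = bytes(norm[r][c]
                          for r in range(top, top + win)
                          for c in range(left, left + win))
            prints.add(hashlib.sha1(block).hexdigest())
    return prints

def similarity(prints, known):
    """Fraction of this image's fingerprints that also appear in a known set."""
    return len(prints & known) / len(prints) if prints else 0.0

def aspect_matches(image, known_ratios, tol=0.05):
    """True if width/height is within tol of any aspect ratio seen in known spam."""
    ratio = len(image[0]) / len(image)
    return any(abs(ratio - known) <= tol for known in known_ratios)

def classify(image, spam_prints, ham_prints, known_ratios, text_matches_spam):
    """Combine the three signals named in the claim; the weights are illustrative only."""
    prints = fingerprints(normalize(image))
    spam_sim = similarity(prints, spam_prints)  # first measure
    ham_sim = similarity(prints, ham_prints)    # second measure
    score = spam_sim - ham_sim
    score += 0.25 if aspect_matches(image, known_ratios) else 0.0
    score += 0.25 if text_matches_spam else 0.0
    return "spam" if score > 0.5 else "non-spam"

def update_reputation(reputation, label, step=0.1):
    """Nudge a sender's reputation in [0, 1] down for spam, up otherwise."""
    delta = -step if label == "spam" else step
    return max(0.0, min(1.0, reputation + delta))
```

Because fingerprints are taken from the normalized image, a copy of a known spam image with small pixel-level perturbations still yields the same fingerprints in this sketch, provided each perturbed pixel stays on the same side of the threshold.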

  • 11 Assignments