
System and method for identifying text-based SPAM in rasterized images

  • US 7,706,613 B2
  • Filed: 08/23/2007
  • Issued: 04/27/2010
  • Est. Priority Date: 08/23/2007
  • Status: Active Grant
First Claim

1. A computer implemented method for identifying spam in an image, the method comprising:

    (a) identifying a plurality of contours in the image, the contours corresponding to probable symbols;

    (b) ignoring contours that are too small or too large;

    (c) identifying text lines in the image, based on the remaining contours;

    (d) parsing the text lines into words;

    (e) ignoring words that are too short or too long from the identified text lines;

    (f) ignoring text lines that are too short;

    (g) verifying that the image contains text by comparing a number of pixels of a symbol color within remaining contours to a total number of pixels of the symbol color in the image; and

    (h) if the image contains text, rendering a spam/no spam verdict based on a contour representation of the text that remains after step (f).
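Steps (a) through (g) can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the image is already binarized into a list of rows where 1 marks a symbol-colored pixel, it stands in 4-connected components with bounding boxes for "contours", and every name and threshold (`min_h`, `max_h`, `max_gap`, `min_word`, `max_word`, `min_line`, `threshold`) is a hypothetical choice, since the claim does not fix any values. Step (h), the spam verdict itself, is left out because the claim does not say how the contour representation is scored.

```python
from collections import deque

def find_components(img):
    """Step (a): 4-connected components of symbol pixels (value 1).
    Returns bounding boxes as (x0, y0, x1, y1, pixel_count)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            count = 0
            while queue:
                cx, cy = queue.popleft()
                count += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1, count))
    return boxes

def extract_text_lines(img, min_h=3, max_h=40, max_gap=2,
                       min_word=2, max_word=15, min_line=2):
    """Steps (b)-(f). Returns a list of lines; each line is a list of words;
    each word is a list of symbol bounding boxes. Thresholds are assumed."""
    # (b) ignore components whose height is too small or too large
    boxes = [b for b in find_components(img) if min_h <= b[3] - b[1] + 1 <= max_h]
    # (c) group boxes into lines when their vertical extents overlap
    lines = []
    for b in sorted(boxes, key=lambda b: b[1]):
        for line in lines:
            if b[1] <= line["y1"] and b[3] >= line["y0"]:
                line["boxes"].append(b)
                line["y0"] = min(line["y0"], b[1])
                line["y1"] = max(line["y1"], b[3])
                break
        else:
            lines.append({"y0": b[1], "y1": b[3], "boxes": [b]})
    result = []
    for line in lines:
        # (d) split the line into words at horizontal gaps wider than max_gap
        words, current = [], []
        for b in sorted(line["boxes"], key=lambda b: b[0]):
            if current and b[0] - current[-1][2] - 1 > max_gap:
                words.append(current)
                current = []
            current.append(b)
        if current:
            words.append(current)
        # (e) ignore words with too few or too many symbols
        words = [word for word in words if min_word <= len(word) <= max_word]
        # (f) ignore lines with too few remaining symbols
        if sum(len(word) for word in words) >= min_line:
            result.append(words)
    return result

def contains_text(img, lines, threshold=0.5):
    """Step (g): compare symbol pixels inside the remaining components
    with all symbol pixels in the image."""
    total = sum(row.count(1) for row in img)
    kept = sum(b[4] for words in lines for word in words for b in word)
    return total > 0 and kept / total >= threshold
```

A caller would run `extract_text_lines` first and pass its result to `contains_text`; only when that returns `True` would the remaining lines be handed to whatever classifier produces the verdict in step (h).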
