Identification of content by metadata

  • US 8,578,485 B2
  • Filed: 03/01/2010
  • Issued: 11/05/2013
  • Est. Priority Date: 12/31/2008
  • Status: Active Grant
First Claim

1. A method for identifying content in electronic messages, the method comprising:

  • receiving information regarding an electronic message having been classified as spam, the classification based on one or more thumbprints, wherein deriving each thumbprint comprises:

    extracting metadata characterizing one or more images in the electronic message,

    generating a numerical signature based on the extracted metadata characterizing the one or more images in the classified message, the numerical signature comprising a plurality of numerical values, each numerical value characterizing a different aspect of the one or more images, the different aspects including dimension, location of color, and intensity of color in a color table, and

    compressing the numerical signature;

    executing instructions stored in memory, wherein execution of the instructions by a processor searches for matches to the thumbprints of the classified message, wherein the search comprises:

    comparing the thumbprints of the classified message to thumbprints associated with a plurality of other electronic messages;

    identifying a match between at least one of the thumbprints of the classified message and at least one of the plurality of thumbprints, the at least one thumbprint associated with at least one of the other messages;

    retrieving the at least one other electronic message associated with the at least one thumbprint matching the thumbprint of the classified message;

    classifying the at least one retrieved electronic message as spam based on the identified match; and

    identifying a spam outbreak based on a number of matches identified between the thumbprints of the classified message and thumbprints of the other messages.
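
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything named here is an assumption made for illustration: `ImageMeta`, `numerical_signature`, `thumbprint`, `ThumbprintIndex`, and the `outbreak_threshold` parameter are hypothetical. In particular, "location of color" is modelled as the index of the brightest color-table entry, "intensity of color" as that entry's summed RGB value, and the compression step is stood in for by a truncated SHA-256 digest rather than a true compression scheme.

```python
import hashlib
import struct
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMeta:
    """Hypothetical container for metadata extracted from one image."""
    width: int
    height: int
    color_table: tuple  # tuple of (r, g, b) entries, e.g. a palette


def numerical_signature(meta):
    """Return numerical values for dimension, color location, color intensity."""
    intensities = [r + g + b for r, g, b in meta.color_table]
    if intensities:
        # Assumption: "location" is the index of the brightest table entry.
        peak = max(range(len(intensities)), key=intensities.__getitem__)
        peak_intensity = intensities[peak]
    else:
        peak, peak_intensity = 0, 0
    return [meta.width, meta.height, peak, peak_intensity]


def thumbprint(meta):
    """Reduce the signature to a short fixed-length string.

    Assumption: a truncated SHA-256 digest stands in for "compressing".
    """
    sig = numerical_signature(meta)
    packed = struct.pack(f">{len(sig)}I", *sig)
    return hashlib.sha256(packed).hexdigest()[:16]


class ThumbprintIndex:
    """Maps thumbprints to the ids of messages that contain them."""

    def __init__(self, outbreak_threshold=3):
        self._by_print = defaultdict(set)
        self.spam = set()
        self.outbreak_threshold = outbreak_threshold  # hypothetical cutoff

    def add_message(self, msg_id, images):
        """Record the thumbprints of a message that has not been classified."""
        for meta in images:
            self._by_print[thumbprint(meta)].add(msg_id)

    def report_spam(self, msg_id, images):
        """Handle a message classified as spam.

        Compares its thumbprints with the stored ones, classifies every
        matching message as spam, and flags an outbreak when the number
        of matches reaches the threshold.
        """
        self.spam.add(msg_id)
        matched = set()
        for meta in images:
            matched |= self._by_print.get(thumbprint(meta), set())
        matched.discard(msg_id)
        self.spam |= matched
        return matched, len(matched) >= self.outbreak_threshold
```

In use, a caller would register incoming messages with `add_message` and call `report_spam` once a message has been classified as spam; the returned pair gives the newly matched message ids and whether the match count reached the assumed outbreak threshold. Exact-match lookup on a digest is a design simplification: it only catches images whose extracted metadata is identical.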
