Identification of content by metadata
First Claim
1. A method for identifying content in electronic messages, the method comprising:
receiving an electronic message over a communication network; and
executing instructions stored in memory, wherein execution of the instructions by a processor:
determines one or more image-based content types of one or more images contained within the electronic message;
associates each of the one or more image-based content types with one or more metadata extraction routines;
extracts metadata from each of the one or more images using the one or more metadata extraction routines;
generates a numerical signature based on the extracted metadata from each of the one or more images, the numerical signature comprising a plurality of numerical values, each numerical value characterizing a different aspect of the one or more images;
generates one or more thumbprints using the numerical signature;
compares the one or more thumbprints to a plurality of thumbprints stored in a thumbprint database, the plurality of thumbprints associated with one or more other electronic messages that have previously been classified as spam;
classifies the electronic message as spam when at least one of the one or more thumbprints matches one of the plurality of thumbprints in the thumbprint database; and
identifies a spam outbreak based on a number of matches identified between the thumbprints of the classified message and thumbprints of the one or more other electronic messages previously classified as spam.
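The extraction steps of the claim above (content type, extraction routine, metadata, numerical signature, thumbprint) might be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the choice of GIF and PNG header fields as metadata, and the use of SHA-256 for the thumbprint are all hypothetical, since the claim names neither specific metadata fields nor a hash function.

```python
import hashlib
import struct
from typing import Callable, Dict, List, Tuple

Signature = Tuple[int, ...]
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def detect_image_type(data: bytes) -> str:
    """Determine the image-based content type from the leading magic bytes."""
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:8] == PNG_MAGIC:
        return "png"
    return "unknown"


def gif_metadata(data: bytes) -> Dict[str, int]:
    # GIF logical screen descriptor: little-endian width and height at bytes 6..9.
    width, height = struct.unpack("<HH", data[6:10])
    return {"width": width, "height": height, "size": len(data)}


def png_metadata(data: bytes) -> Dict[str, int]:
    # PNG IHDR payload starts at byte 16: big-endian width, height, then bit depth.
    width, height, bit_depth = struct.unpack(">IIB", data[16:25])
    return {"width": width, "height": height, "bit_depth": bit_depth, "size": len(data)}


# Hypothetical association of each content type with its extraction routines.
EXTRACTION_ROUTINES: Dict[str, List[Callable[[bytes], Dict[str, int]]]] = {
    "gif": [gif_metadata],
    "png": [png_metadata],
}


def numerical_signature(data: bytes) -> Signature:
    """Build a tuple of numbers, each describing a different aspect of the image."""
    metadata: Dict[str, int] = {}
    for routine in EXTRACTION_ROUTINES.get(detect_image_type(data), []):
        metadata.update(routine(data))
    # Fixed field order, so equal metadata always yields an equal signature.
    fields = ("width", "height", "bit_depth", "size")
    return tuple(metadata.get(name, 0) for name in fields)


def thumbprint(signature: Signature) -> str:
    """Reduce the numerical signature to a compact, comparable thumbprint."""
    return hashlib.sha256(repr(signature).encode("ascii")).hexdigest()


if __name__ == "__main__":
    # A 13-byte stand-in for a GIF: header, 320x240 screen size, 3 padding bytes.
    sample = b"GIF89a" + struct.pack("<HH", 320, 240) + b"\x00" * 3
    print(numerical_signature(sample), thumbprint(numerical_signature(sample)))
```

Hashing the whole signature means only images with identical metadata share a thumbprint. A real system might instead quantize some values first so that slightly altered images still match, but the claim leaves that choice open.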
Abstract
Systems and methods for identifying content in electronic messages are provided. An electronic message may include certain content. The content is detected and analyzed to identify any metadata. The metadata may include a numerical signature characterizing the content. A thumbprint is generated based on the numerical signature. The thumbprint may then be compared to thumbprints of previously received messages. The comparison allows for classification of the electronic message as spam or not spam.
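The comparison and classification described in the abstract, together with the outbreak step recited in claim 1, could look roughly like the sketch below. The class name `ThumbprintDatabase`, its methods, and the outbreak threshold are assumptions made for illustration. The source does not say how many matches constitute an outbreak, nor whether matches are counted per thumbprint (as here) or in some other way.

```python
from collections import Counter
from typing import Iterable, List, Set


class ThumbprintDatabase:
    """Hypothetical in-memory store of thumbprints from messages classified as spam."""

    def __init__(self, spam_thumbprints: Iterable[str] = (), outbreak_threshold: int = 3) -> None:
        self._known: Set[str] = set(spam_thumbprints)
        self._match_counts: Counter = Counter()
        # Assumed value: the source gives no number for what counts as an outbreak.
        self._outbreak_threshold = outbreak_threshold

    def classify(self, message_thumbprints: Iterable[str]) -> bool:
        """Return True (spam) when at least one thumbprint matches a stored one."""
        matches = self._known.intersection(set(message_thumbprints))
        for matched in matches:
            self._match_counts[matched] += 1  # tally matches for outbreak detection
        return bool(matches)

    def outbreak_thumbprints(self) -> List[str]:
        """List the thumbprints whose match count has reached the threshold."""
        return sorted(
            tp for tp, count in self._match_counts.items()
            if count >= self._outbreak_threshold
        )


if __name__ == "__main__":
    db = ThumbprintDatabase(["aa11", "bb22"])
    for incoming in (["aa11", "zz99"], ["aa11"], ["cc33"], ["aa11"]):
        print(incoming, "spam" if db.classify(incoming) else "not spam")
    print("outbreak:", db.outbreak_thumbprints())
```

Using a set for the stored thumbprints makes each lookup roughly constant-time, which matters if every incoming message is checked against a large database.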
16 Claims
1. A method for identifying content in electronic messages, the method comprising the steps recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6
7. A non-transitory computer readable storage medium having embodied thereon a program executable by a processor to perform a method for identifying content in electronic messages, the method comprising:
receiving an electronic message over a communication network;
determining one or more image-based content types of one or more images contained within the electronic message;
associating each of the one or more image-based content types with one or more metadata extraction routines;
extracting metadata from each of the one or more images using the one or more metadata extraction routines;
generating a numerical signature based on the extracted metadata from each of the one or more images, the numerical signature comprising a plurality of numerical values, each numerical value characterizing a different aspect of the one or more images;
generating one or more thumbprints using the numerical signature;
comparing the thumbprints to a plurality of thumbprints stored in a thumbprint database, the plurality of thumbprints associated with one or more other electronic messages that have previously been classified as spam;
classifying the electronic message as spam when at least one of the one or more thumbprints matches one of the plurality of thumbprints in the thumbprint database; and
identifying a spam outbreak based on a number of matches identified between the thumbprints of the classified message and thumbprints of the one or more other electronic messages previously classified as spam.
Dependent claims: 8, 9, 10, 11, 12
13. A system for identifying content in electronic messages, the system comprising:
a communication interface that receives an electronic message over a communication network;
a processor; and
a memory, wherein the processor executes instructions stored in memory, wherein execution of the instructions by the processor:
determines one or more image-based content types of one or more images contained within the electronic message;
associates each of the one or more image-based content types with one or more metadata extraction routines;
extracts metadata from each of the one or more images using the one or more metadata extraction routines;
generates a numerical signature based on the extracted metadata from each of the one or more images, the numerical signature comprising a plurality of numerical values, each numerical value characterizing a different aspect of the one or more images;
generates one or more thumbprints using the numerical signature;
compares the one or more thumbprints to a plurality of thumbprints stored in a thumbprint database, the plurality of thumbprints associated with one or more other electronic messages that have previously been classified as spam;
classifies the electronic message as spam when at least one of the one or more thumbprints matches one of the plurality of thumbprints in the thumbprint database; and
identifies a spam outbreak based on a number of matches identified between the thumbprints of the classified message and thumbprints of the one or more other messages previously classified as spam.
Dependent claims: 14, 15, 16
Specification