Bulk electronic message detection by header similarity analysis
Abstract
Methods, apparatuses, and computer-readable media for detecting bulk electronic messages using header similarity analysis. Bulk electronic messages can be detected by parsing (115) header fields of an electronic message; associating (120) at least one constituent unit with each header field defining a set of constituent units for each header field; ascertaining (230) a feature vector for each set of constituent units; forming (240) a collection of feature vectors; and computing (250) an inner product from a set of constituent units from an additional electronic message and the collection of feature vectors from the initial electronic message resulting in a measure of similarity between the initial electronic message and the additional electronic message.
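The feature-vector variant described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the patent: a "constituent unit" is treated as a lowercase alphanumeric token, a "feature vector" as a sparse count of those tokens per header field, and the measure of similarity as the sum of per-field inner products. The function names are hypothetical.

```python
import re
from collections import Counter
from email import message_from_string


def constituent_units(value):
    # Assumption: a constituent unit is a lowercase alphanumeric token.
    return re.findall(r"[a-z0-9]+", value.lower())


def feature_vectors(raw_message):
    """Map each header field name to a sparse count vector of its units."""
    msg = message_from_string(raw_message)
    vectors = {}
    for name, value in msg.items():
        vectors.setdefault(name.lower(), Counter()).update(constituent_units(value))
    return vectors


def inner_product(u, v):
    # Counter returns 0 for missing keys, so absent units contribute nothing.
    return sum(count * v[unit] for unit, count in u.items())


def header_similarity(initial, additional):
    """Sum the inner products over header fields present in both messages."""
    a, b = feature_vectors(initial), feature_vectors(additional)
    return sum(inner_product(a[f], b[f]) for f in a.keys() & b.keys())


initial = "From: deals@example.com\nSubject: Big sale today\n\nbody"
additional = "From: deals@example.com\nSubject: Big sale tomorrow\n\nbody"
print(header_similarity(initial, additional))  # 5: three shared From units, two shared Subject units
```

A raw inner product grows with header length, so a practical system would likely normalize it; the sketch leaves that out to stay close to the wording of the abstract.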
18 Claims
1. A computer-implemented method for detecting bulk electronic messages, comprising:
parsing, by a computer system, header fields of at least one electronic message;
associating, by the computer system, at least one constituent unit with each header field to define a set of constituent units for each field;
forming, by the computer system, a collection of constituent unit sets;
finding, by the computer system, cardinality of an intersection between an additional set of constituent units from an additional electronic message and the collection of constituent unit sets, to determine a measure of similarity; and
responsive to the measure of similarity exceeding a predetermined level, concluding, by the computer system, that the electronic messages are bulk electronic messages.
(Dependent claims: 2, 3, 4, 5, 6)
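The steps of claim 1 can be sketched as follows. This is an illustrative reading under stated assumptions, not the patented implementation: constituent units are taken to be lowercase alphanumeric tokens, the collection is built from a single known message, the measure of similarity is the sum of per-field intersection cardinalities, and the threshold value of 4 is arbitrary. All function names are hypothetical.

```python
import re
from email import message_from_string


def unit_sets(raw_message):
    """Parse header fields and return one set of constituent units per field."""
    msg = message_from_string(raw_message)
    sets = {}
    for name, value in msg.items():
        # Assumption: a constituent unit is a lowercase alphanumeric token.
        units = set(re.findall(r"[a-z0-9]+", value.lower()))
        sets.setdefault(name.lower(), set()).update(units)
    return sets


def similarity(collection, additional):
    """Sum of intersection cardinalities over fields present in both."""
    shared_fields = collection.keys() & additional.keys()
    return sum(len(collection[f] & additional[f]) for f in shared_fields)


def is_bulk(known_message, new_message, threshold=4):
    # "Exceeding a predetermined level" is read as a strict comparison.
    return similarity(unit_sets(known_message), unit_sets(new_message)) > threshold


known = "From: deals@example.com\nSubject: Big sale today\n\nbody"
new = "From: deals@example.com\nSubject: Big sale tomorrow\n\nbody"
print(is_bulk(known, new))  # True: 5 shared units exceed the threshold of 4
```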
7. At least one computer-readable storage medium containing computer program instructions for detecting bulk electronic messages, the computer program instructions performing steps comprising:
determining a measure of similarity between headers of at least two electronic messages;
calculating a measure of similarity between message content of the at least two electronic messages; and
classifying the at least two messages as bulk electronic messages when the measure of similarity between the headers and the measure of similarity between the message content exceed a predetermined level.
(Dependent claims: 8, 9, 10, 11, 12)
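Claim 7 requires both a header measure and a content measure to exceed the predetermined level. The sketch below shows one way that conjunction might look. The claim does not name specific measures, so the choices here are assumptions: Jaccard overlap of header tokens for the headers, `difflib.SequenceMatcher` ratio for the body, single-part text messages only, and a level of 0.6.

```python
import re
from difflib import SequenceMatcher
from email import message_from_string


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def header_similarity(m1, m2):
    """Jaccard overlap of tokens drawn from all header values (an assumption)."""
    h1 = tokens(" ".join(value for _, value in m1.items()))
    h2 = tokens(" ".join(value for _, value in m2.items()))
    union = h1 | h2
    return len(h1 & h2) / len(union) if union else 0.0


def content_similarity(m1, m2):
    """Character-level ratio of the bodies; assumes single-part text messages."""
    return SequenceMatcher(None, m1.get_payload(), m2.get_payload()).ratio()


def classify_bulk(raw1, raw2, level=0.6):
    m1, m2 = message_from_string(raw1), message_from_string(raw2)
    # Both measures must exceed the same predetermined level.
    return header_similarity(m1, m2) > level and content_similarity(m1, m2) > level


first = "From: deals@example.com\nSubject: Big sale today\n\nBuy now and save 50 percent.\n"
second = "From: deals@example.com\nSubject: Big sale tomorrow\n\nBuy now and save 60 percent.\n"
print(classify_bulk(first, second))
```

Requiring both measures means a message that reuses a body but arrives with unrelated headers is not classified as bulk under this reading.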
13. An apparatus having a computer-readable storage medium having computer program instructions embodied therein for detecting bulk electronic messages, the computer program instructions comprising:
an examination module adapted to:
determine a measure of similarity between headers of at least two electronic messages;
calculate a measure of similarity between message content of the at least two electronic messages; and
classify the at least two messages as bulk electronic messages when the measure of similarity between the headers and the measure of similarity between the message content exceed a predetermined level; and
coupled communicatively to the examination module, a blocking module adapted to block the transmission of the at least two electronic messages responsive to the at least two electronic messages being classified as bulk electronic messages.
(Dependent claims: 14, 15, 16, 17, 18)
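The two-module arrangement of claim 13 could be organized along these lines. This is a structural sketch only: the class names, the `deliver` method, and the injected similarity callables are hypothetical, messages are plain strings, and a `difflib` ratio stands in for both similarity measures.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Callable, List

Similarity = Callable[[str, str], float]


@dataclass
class ExaminationModule:
    header_similarity: Similarity
    content_similarity: Similarity
    level: float

    def is_bulk(self, first: str, second: str) -> bool:
        # Classify as bulk only when both measures exceed the level.
        return (self.header_similarity(first, second) > self.level
                and self.content_similarity(first, second) > self.level)


@dataclass
class BlockingModule:
    examiner: ExaminationModule
    blocked: List[str] = field(default_factory=list)

    def deliver(self, first: str, second: str) -> List[str]:
        """Return the messages allowed through; record any that are blocked."""
        if self.examiner.is_bulk(first, second):
            self.blocked.extend([first, second])
            return []
        return [first, second]


def ratio(a: str, b: str) -> float:
    # Stand-in measure used for both headers and content in this sketch.
    return SequenceMatcher(None, a, b).ratio()


gate = BlockingModule(ExaminationModule(ratio, ratio, level=0.8))
print(gate.deliver("win a prize now", "win a prize now!"))  # [] because both are blocked
```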
Specification