Bulk electronic message detection by header similarity analysis
Abstract
Methods, apparatuses, and computer-readable media for detecting bulk electronic messages using header similarity analysis. Bulk electronic messages can be detected by parsing (115) header fields of an initial electronic message; associating (120) at least one constituent unit with each header field to define a set of constituent units for each header field; ascertaining (230) a feature vector for each set of constituent units; forming (240) a collection of feature vectors; and computing (250) an inner product from a set of constituent units of an additional electronic message and the collection of feature vectors of the initial electronic message. The inner product gives a measure of similarity between the initial electronic message and the additional electronic message.
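The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices here are assumptions that the text above does not specify: constituent units are taken to be overlapping character trigrams, feature vectors are sparse counts of those units, and each per-field inner product is normalized (cosine form) and averaged so the result falls between 0 and 1. The function names `constituent_units`, `header_vectors`, `inner_product`, and `header_similarity` are hypothetical.

```python
import math
from collections import Counter
from email import message_from_string


def constituent_units(value, n=3):
    """Split a header value into overlapping character n-grams (assumed unit)."""
    text = " ".join(value.lower().split())
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def header_vectors(raw_message, n=3):
    """Parse header fields and build one sparse count vector per field name."""
    msg = message_from_string(raw_message)
    vectors = {}
    for name, value in msg.items():
        vec = vectors.setdefault(name.lower(), Counter())
        vec.update(constituent_units(str(value), n))
    return vectors  # the "collection of feature vectors" for this message


def inner_product(a, b):
    """Inner product of two sparse count vectors."""
    return sum(count * b[unit] for unit, count in a.items() if unit in b)


def header_similarity(known, candidate):
    """Average the normalized per-field inner products (normalization is assumed)."""
    fields = set(known) | set(candidate)
    if not fields:
        return 0.0
    total = 0.0
    for field in fields:
        a = known.get(field, Counter())
        b = candidate.get(field, Counter())
        norm = math.sqrt(inner_product(a, a) * inner_product(b, b))
        if norm:
            total += inner_product(a, b) / norm
    return total / len(fields)
```

In this sketch, a header field that appears in only one of the two messages contributes zero, so messages with different header layouts score lower even when the shared fields match.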
14 Claims
1. A method for detecting bulk electronic messages, comprising:
parsing, by a computer, header fields of at least one electronic message;
associating, by the computer, at least one constituent unit with each header field to define a set of constituent units for each header field;
ascertaining, by the computer, a feature vector for each set of constituent units;
forming, by the computer, a collection of feature vectors; and
computing, by the computer, an inner product of an additional feature vector from an additional electronic message and the collection of feature vectors to determine similarity of the additional electronic message to the at least one electronic message.
Dependent claims: 2, 3, 4, 5, 6
7. A method for detecting bulk electronic messages, comprising:
determining, by a computer, a measure of similarity between headers of at least two electronic messages, wherein determining the measure of similarity between headers comprises:
  parsing, by the computer, header fields of the at least two electronic messages;
  associating, by the computer, at least one constituent unit with each header field to define a set of constituent units for each header field;
  ascertaining, by the computer, a feature vector for each set of constituent units;
  forming, by the computer, a collection of feature vectors for each electronic message; and
  computing, by the computer, an inner product of the collections of feature vectors to determine the measure of similarity between headers;
calculating, by the computer, a measure of similarity between message content of the at least two electronic messages; and
classifying, by the computer, the at least two messages as bulk electronic messages when the measure of similarity between the headers and the measure of similarity between the message content exceed a predetermined level.
Dependent claims: 8, 9
10. At least one computer-readable medium containing computer program instructions for detecting bulk electronic messages, the computer program instructions performing steps comprising:
parsing header fields of at least one electronic message;
associating at least one constituent unit with each header field to define a set of constituent units for each header field;
ascertaining a feature vector for each set of constituent units;
forming a collection of feature vectors; and
computing an inner product of an additional feature vector from an additional electronic message and the collection of feature vectors to determine similarity of the additional electronic message to the at least one electronic message.
Dependent claims: 11
12. An apparatus interposed between a client computer and an electronic message server for detecting bulk electronic messages, the apparatus comprising:
a computer processor; and
a computer-readable storage medium storing instructions that, when executed by the computer processor, configure the processor to:
  parse header fields of at least one electronic message;
  associate at least one constituent unit with each header field, defining a set of constituent units for each header field;
  ascertain a feature vector for each set of constituent units;
  form a collection of feature vectors; and
  compute an inner product of an additional feature vector from an additional electronic message and the collection of feature vectors to determine similarity of the additional electronic message to the at least one electronic message.
Dependent claims: 13, 14
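Claim 7 combines a header similarity measure with a content similarity measure before classifying a pair of messages as bulk. The following Python sketch shows one way that decision could look. It rests on assumptions the claim does not state: both measures are cosine-normalized inner products over character trigram counts, each header unit is keyed by its field name so the per-field vectors form one flat collection, the messages are single-part plain text, and the predetermined level defaults to 0.8. The names `ngrams`, `cosine`, `message_vectors`, and `is_bulk_pair` are hypothetical.

```python
import math
from collections import Counter
from email import message_from_string


def ngrams(text, n=3):
    """Overlapping character n-grams of whitespace-normalized, lowercased text."""
    text = " ".join(text.lower().split())
    return [text[i:i + n] for i in range(max(len(text) - n + 1, 0))]


def cosine(a, b):
    """Inner product of two sparse count vectors, normalized to the range 0..1."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def message_vectors(raw):
    """Return (header vector, body vector) for one raw message."""
    msg = message_from_string(raw)
    headers = Counter()
    for name, value in msg.items():
        # Key each unit by its field so units from different fields never match.
        headers.update((name.lower(), unit) for unit in ngrams(str(value)))
    # Assumption: only single-part text bodies are compared in this sketch.
    body = "" if msg.is_multipart() else msg.get_payload()
    return headers, Counter(ngrams(body))


def is_bulk_pair(raw_a, raw_b, level=0.8):
    """Classify the pair as bulk when both similarity measures exceed the level."""
    head_a, body_a = message_vectors(raw_a)
    head_b, body_b = message_vectors(raw_b)
    return cosine(head_a, head_b) > level and cosine(body_a, body_b) > level
```

Requiring both measures to exceed the level means that two messages with near-identical headers but unrelated bodies, or the reverse, are not flagged by this sketch.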
Specification