Detecting unwanted electronic mail messages based on probabilistic analysis of referenced resources
Abstract
In one embodiment, detecting unwanted electronic mail messages based on probabilistic analysis of referenced resources comprises receiving a whitelist and a blocklist each having a plurality of network resource identifiers that have appeared in prior messages; retrieving a particular network resource identifier; generating a list of properties for the particular network resource identifier; training a probabilistic filter using the properties; and repeating the retrieving, generating and training for all the network resource identifiers in the whitelist and blocklist. Thereafter, when an electronic mail message is received and contains a URL or other network resource identifier, a spam score or threat score can be generated for the message by testing properties of the network resource identifier using the trained probabilistic filter.
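The training and scoring flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract says "probabilistic filter" without naming one, so a two-class naive Bayes model with add-one smoothing is assumed here, and the "properties" of a network resource identifier are assumed to be simple tokens taken from the URL's scheme, host labels and path segments. The names `url_properties`, `UrlFilter`, `train_list` and `score` are hypothetical and do not come from the patent.

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit


def url_properties(url):
    """Generate a list of properties for one URL (assumed: simple tokens)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    props = ["scheme:" + parts.scheme.lower()]
    props += ["host:" + label for label in host.split(".") if label]
    props += ["path:" + seg.lower()
              for seg in re.split(r"[/._\-]+", parts.path) if seg]
    return props


class UrlFilter:
    """Hypothetical two-class naive Bayes filter over URL properties."""

    def __init__(self):
        self.counts = {"good": Counter(), "bad": Counter()}
        self.docs = {"good": 0, "bad": 0}

    def train(self, props, label):
        # One training step: count the properties under the given label.
        self.counts[label].update(props)
        self.docs[label] += 1

    def train_list(self, urls, label):
        # Repeat the retrieve / generate / train steps for every entry.
        for url in urls:
            self.train(url_properties(url), label)

    def _log_likelihood(self, props, label):
        vocab = set(self.counts["good"]) | set(self.counts["bad"])
        total = sum(self.counts[label].values())
        denom = total + len(vocab) + 1  # add-one smoothing, +1 for unseen
        prior = (self.docs[label] + 1) / (sum(self.docs.values()) + 2)
        return math.log(prior) + sum(
            math.log((self.counts[label][p] + 1) / denom) for p in props
        )

    def score(self, url):
        """Return an estimate of P(bad | properties), between 0 and 1."""
        props = url_properties(url)
        diff = (self._log_likelihood(props, "good")
                - self._log_likelihood(props, "bad"))
        diff = max(min(diff, 700.0), -700.0)  # keep exp() in range
        return 1.0 / (1.0 + math.exp(diff))


# Made-up example lists; real lists would come from past messages.
whitelist = ["http://www.example.com/news/index.html",
             "https://docs.example.org/guide/intro"]
blocklist = ["http://cheap-pills.example.net/buy/now",
             "http://win-prize.example.net/claim/now"]

url_filter = UrlFilter()
url_filter.train_list(whitelist, "good")
url_filter.train_list(blocklist, "bad")
print(url_filter.score("http://cheap-pills.example.net/buy/pills"))
```

A score near 1 suggests the URL resembles blocklist entries, and a score near 0 suggests it resembles whitelist entries. How such a value would be combined into a message-level spam or threat score is not specified in the abstract and is left out of this sketch.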
39 Claims
1. A method, comprising:
retrieving a whitelist comprising a plurality of first network resource identifiers that have been included in past electronic mail messages;
retrieving a particular first network resource identifier from the whitelist;
generating a first list of properties for the particular first network resource identifier;
training, using the properties, a probabilistic filter;
repeating the extracting, retrieving and training for all the first network resource identifiers in the whitelist;
retrieving a blocklist comprising a plurality of second network resource identifiers that have been included in past electronic mail messages associated with spam or threats;
retrieving a particular second network resource identifier from the blocklist;
generating a second list of properties for the particular second network resource identifier;
training, using the properties, the probabilistic filter;
repeating the extracting, retrieving and training for all the second network resource identifiers in the blocklist.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 37
10. A computer-readable tangible storage medium carrying one or more sequences of instructions which, when executed by one or more processors, cause the one or more processors to perform:
retrieving a whitelist comprising a plurality of first network resource identifiers that have been included in past electronic mail messages;
retrieving a particular first network resource identifier from the whitelist;
generating a first list of properties for the particular first network resource identifier;
training, using the properties, a probabilistic filter;
repeating the extracting, retrieving and training for all the first network resource identifiers in the whitelist;
retrieving a blocklist comprising a plurality of second network resource identifiers that have been included in past electronic mail messages associated with spam or threats;
retrieving a particular second network resource identifier from the blocklist;
generating a second list of properties for the particular second network resource identifier;
training, using the properties, the probabilistic filter;
repeating the extracting, retrieving and training for all the second network resource identifiers in the blocklist.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 39
19. An apparatus, comprising:
means for retrieving a whitelist comprising a plurality of first network resource identifiers that have been included in past electronic mail messages;
means for retrieving a particular first network resource identifier from the whitelist;
means for generating a first list of properties for the particular first network resource identifier;
means for training, using the properties, a probabilistic filter;
means for repeating execution of the extracting, retrieving and training means for all the first network resource identifiers in the whitelist;
means for retrieving a blocklist comprising a plurality of second network resource identifiers that have been included in past electronic mail messages associated with spam or threats;
means for retrieving a particular second network resource identifier from the blocklist;
means for generating a second list of properties for the particular second network resource identifier;
means for training, using the properties, the probabilistic filter;
means for repeating the extracting, retrieving and training for all the second network resource identifiers in the blocklist.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 38
28. An electronic mail server, comprising:
one or more processors;
logic encoded in one or more media for execution and when executed operable to cause the one or more processors to perform:
retrieving a whitelist comprising a plurality of first network resource identifiers that have been included in past electronic mail messages;
retrieving a particular first network resource identifier from the whitelist;
generating a first list of properties for the particular first network resource identifier;
training, using the properties, a probabilistic filter;
repeating the extracting, retrieving and training for all the first network resource identifiers in the whitelist;
retrieving a blocklist comprising a plurality of second network resource identifiers that have been included in past electronic mail messages associated with spam or threats;
retrieving a particular second network resource identifier from the blocklist;
generating a second list of properties for the particular second network resource identifier;
training, using the properties, the probabilistic filter;
repeating the extracting, retrieving and training for all the second network resource identifiers in the blocklist.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36
Specification