Identifying IP addresses for spammers
Abstract
Detecting and blocking spam messages using statistical analysis of the distribution of message sizes for a given IP address. Mail volumes are examined to model a distribution of volumes, which is used to cluster IP addresses. The message sizes may be distributed across ranges of message sizes, and the resulting distribution is then used to determine an entropy of message sizes for the given IP address. The entropy of the given IP address may be compared to entropies of known good IP addresses, and if a difference between the entropies is statistically significant, then the given IP address may be determined to be an IP spammer. User feedback may also be employed to further characterize an IP address. For example, a number of messages from the IP address may be sent to intended recipients. User feedback may then be monitored to determine whether the IP address should be reclassified.
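The size-entropy step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the bucket boundaries in `BUCKET_EDGES`, the function name `size_entropy`, and the use of the natural logarithm are all assumptions, since the text above does not specify them.

```python
import math
from bisect import bisect_right
from collections import Counter

# Hypothetical bucket boundaries in bytes; the source does not give the ranges.
BUCKET_EDGES = [1_000, 5_000, 20_000, 100_000]

def size_entropy(sizes, edges=BUCKET_EDGES):
    """Entropy (natural log) of the message-size distribution for one IP address."""
    if not sizes:
        raise ValueError("need at least one message size")
    # bisect_right maps each size to the index of the size range it falls in.
    counts = Counter(bisect_right(edges, s) for s in sizes)
    total = len(sizes)
    # Fraction of this IP's messages in each non-empty bucket.
    fractions = [c / total for c in counts.values()]
    # Negative sum of fraction times log(fraction) over the buckets.
    return -sum(p * math.log(p) for p in fractions)
```

Under these assumptions, a sender whose messages all land in one bucket gets an entropy of zero, while a sender whose sizes are spread evenly over several buckets gets a higher value.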
17 Claims
1. A method for use in managing delivery of content over a network, comprising:
determining a plurality of IP address groups from a plurality of messages received from a plurality of different IP addresses, the plurality of messages being grouped based on message volumes received during a given time period for each of the different IP addresses;
determining an entropy for each IP address group determined to be associated with non-spam messages;
determining a distribution of messages for a plurality of messages received from a given IP address by distributing the messages from the given IP address across a plurality of message buckets based on a size of each message;
determining a percentage of messages within each message bucket within the plurality of message buckets by determining a number of messages in a given message bucket divided by a total number of messages for the given IP address;
determining a message entropy for the given IP address by summing the percentages of messages times a log of the percentages of messages over all message buckets;
selecting an entropy as a threshold value from an IP address group associated with non-spam messages based on a comparison of a message volume for the given IP address;
if the message entropy for the given IP address is determined to be statistically significant from the selected entropy threshold value, then classifying the given IP address as a potential spammer; and
selectively inhibit sending of messages from the given IP address to message recipients.
Dependent claims: 2, 3, 4.
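The threshold selection and comparison in claim 1 could look roughly like the sketch below. The claim does not say how volume groups are formed or which statistical test is used, so two assumptions are made here: groups are orders of magnitude of message volume, and "statistically significant" is approximated by a z-score against the group's mean and standard deviation. All function names and the `z_cutoff` parameter are hypothetical.

```python
from statistics import mean, stdev

def volume_group(volume):
    """Assumed grouping: number of decimal digits in the message volume, minus one."""
    return len(str(max(int(volume), 1))) - 1

def build_baselines(good_ips):
    """good_ips: iterable of (volume, entropy) pairs for known non-spam IP addresses.

    Returns {group: (mean_entropy, stdev_entropy)} for groups with >= 2 members.
    """
    by_group = {}
    for volume, entropy in good_ips:
        by_group.setdefault(volume_group(volume), []).append(entropy)
    return {g: (mean(v), stdev(v)) for g, v in by_group.items() if len(v) >= 2}

def is_potential_spammer(volume, entropy, baselines, z_cutoff=3.0):
    """Assumed test: flag when entropy is more than z_cutoff deviations from the group mean."""
    baseline = baselines.get(volume_group(volume))
    if baseline is None:
        return False  # no comparable non-spam group, so make no claim
    mu, sigma = baseline
    if sigma == 0:
        return entropy != mu
    return abs(entropy - mu) / sigma > z_cutoff
```

Selecting the baseline by volume means a low-volume sender is only compared with other low-volume senders, which is the point of grouping before comparing entropies.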
5. A network device for managing delivery of messages over a network, comprising:
a transceiver to send and receive data over the network; and
a processor that is operative to perform actions, including;
determining a plurality of IP address groups from a plurality of messages received from a plurality of different IP addresses, the plurality of messages being grouped based on message volumes received during a given time period for each of the different IP addresses;
determining an entropy for each IP address group determined to be associated with non-spam messages;
receiving a plurality of messages, each message being from a given IP address of a message sender;
determining a distribution of messages for the plurality of messages received from the given IP address by distributing the messages from the given IP address across a plurality of message buckets based on a size of each message;
determining a percentage of messages within each message bucket within the plurality of message buckets by determining a number of messages in a given message bucket divided by a total number of messages for the given IP address;
determining a message entropy for the given IP address by summing the percentages of messages times a log of the percentages of messages over all message buckets;
selecting an entropy as a threshold from an IP address group associated with non-spam messages based on a comparison of message volumes for the given IP address and the IP address group; and
if the message entropy indicates that the given IP address is a bulk mailer based on a comparison of the selected entropy threshold value, selectively inhibit sending of a subsequent message from the given IP address to a message recipient.
Dependent claims: 6, 7, 8, 9, 10, 11.
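Both the grouping step and the threshold selection depend on a per-IP message volume "during a given time period". A minimal way to compute that from a receive log is sketched below; the log format (pairs of timestamp and IP address) and the function name `volumes_in_window` are assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime, timedelta

def volumes_in_window(log, start, period):
    """Count messages per sender IP received in the half-open window [start, start + period).

    log: iterable of (received_at, ip) pairs, where received_at is a datetime.
    """
    end = start + period
    return Counter(ip for received_at, ip in log if start <= received_at < end)
```

The resulting counts can then be used to assign each IP address to a volume group before any entropies are compared.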
12. A system for use in managing delivery of messages over a network, comprising:
a message server that is configured to receive messages from a plurality of different IP addresses, each message being destined to at least one message recipient; and
a message spam detector configured and arranged to communicate with the message server and to perform actions, including;
determining a plurality of IP address groups from a plurality of messages received from the plurality of different IP addresses, the plurality of messages being grouped based on message volumes received during a given time period for each of the different IP addresses;
determining an entropy for each IP address group determined to be associated with non-spam messages;
receiving a plurality of messages from a given IP address within the plurality of IP addresses;
determining a distribution of messages for the plurality of messages from the given IP address by distributing the messages from the given IP address across a plurality of message buckets based on ranges of sizes of messages;
determining a percentage of messages within each message bucket within the plurality of message buckets by determining a number of messages in a given message bucket divided by a total number of messages for the given IP address;
determining a message entropy for the given IP address as a negative sum of a percentage of messages times a log of the percentage of messages over all of the portions of the distribution;
selecting an entropy as a threshold from an IP address group associated with non-spam messages based on a comparison of a message volume for the given IP address; and
if the message entropy indicates the given IP address is a bulk mailer based on a comparison of the message entropy for the given IP address to the selected entropy threshold, selectively inhibit sending of a subsequent message from the given IP address to a message recipient.
Dependent claims: 13, 14.
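The "negative sum of a percentage of messages times a log of the percentage" in claim 12 has the form of the standard Shannon entropy. Written out, with symbols chosen here for illustration (n_i for the number of messages in bucket i, N for the total number of messages from the IP address, B for the number of buckets) and the usual convention that an empty bucket contributes zero:

```latex
p_i = \frac{n_i}{N}, \qquad
H = -\sum_{i=1}^{B} p_i \log p_i

% Worked example, natural log: N = 8 messages split 4, 2, 2 over three buckets
p = \left(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}\right), \qquad
H = \tfrac{1}{2}\log 2 + 2 \cdot \tfrac{1}{4}\log 4 = \tfrac{3}{2}\log 2 \approx 1.04
```

The base of the logarithm is not stated in the claim; changing it only rescales H, so comparisons stay consistent as long as the same base is used for the given IP address and for the non-spam groups.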
15. A mobile device for managing received messages, comprising:
a transceiver to send and receive data over the network; and
a processor that is operative to perform actions, including;
determining a plurality of IP address groups from a plurality of messages received from the plurality of different IP addresses, the plurality of messages being grouped based on message volumes received during a given time period for each of the different IP addresses;
determining an entropy for each IP address group determined to be associated with non-spam messages;
receiving a plurality of messages, each message being from a given IP address of a message sender;
determining a distribution of messages for the plurality of messages received from the given IP address by distributing the messages from the given IP address across a plurality of message buckets based on a size of each message;
determining a percentage of messages within each message bucket within the plurality of message buckets by determining a number of messages in a given message bucket divided by a total number of messages for the given IP address;
determining a message entropy for the given IP address by summing the percentages of messages times a log of the percentages of messages over all message buckets;
selecting an entropy as a threshold from an IP address group associated with non-spam messages based on a comparison of a message volume for the given IP address;
if the message entropy indicates that the given IP address is a bulk mailer based on a comparison of the message entropy for the given IP address to the selected entropy threshold, selectively inhibiting messages from the given IP address to be moved to a message inbox; and
if the message entropy indicates that the given IP address is a non-bulk mailer, moving the messages to a message inbox.
Dependent claims: 16, 17.
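The final two steps of claim 15 amount to a routing decision per sender. The sketch below is one possible reading, not the claimed implementation: `route_messages` and its `sample_every` parameter are hypothetical, and letting every Nth message through is an assumed way to realize "selectively" so that user feedback on a suspected bulk sender can still be collected, as the abstract suggests.

```python
def route_messages(messages, sender_is_bulk, sample_every=0):
    """Return (inbox, held) lists for the messages from one sender IP address.

    sample_every is an assumed knob: when > 0, every Nth message from a bulk
    sender still reaches the inbox; when 0, all of its messages are held.
    """
    if not sender_is_bulk:
        # Non-bulk mailer: everything is moved to the inbox.
        return list(messages), []
    inbox, held = [], []
    for i, msg in enumerate(messages, start=1):
        if sample_every and i % sample_every == 0:
            inbox.append(msg)
        else:
            held.append(msg)
    return inbox, held
```

For example, `route_messages(list("abcdef"), True, sample_every=3)` delivers the third and sixth messages and holds the other four.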
Specification