Classification and cluster analysis spam detection and reduction
Abstract
Multiple features of email traffic are analyzed and extracted. Feature vectors comprising the multiple features are created and cluster analysis is utilized to track spam generation even from dynamically changing or aliased IP addresses.
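As a rough illustration of why grouping by extracted features can follow a sender across changing IP addresses, the sketch below summarizes a batch of messages using behavior only. The `Message` fields and the three chosen features are hypothetical and are not taken from the patent; the point is only that the source IP never enters the vector.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender_ip: str    # recorded, but deliberately NOT used as a feature
    hour: int         # hour of day the message was sent (0-23)
    recipients: int   # number of recipients on the message
    has_link: bool    # whether the body contains a URL


def feature_vector(messages):
    """Summarize behavior as [mean recipients, link ratio, peak-hour share]."""
    n = len(messages)
    mean_recipients = sum(m.recipients for m in messages) / n
    link_ratio = sum(m.has_link for m in messages) / n
    peak_share = max(Counter(m.hour for m in messages).values()) / n
    return [mean_recipients, link_ratio, peak_share]


# Two batches with identical behavior sent from different (documentation) IPs.
batch_a = [Message("203.0.113.5", 3, 50, True), Message("203.0.113.5", 3, 50, True)]
batch_b = [Message("198.51.100.9", 3, 50, True), Message("198.51.100.9", 3, 50, True)]

same_behavior = feature_vector(batch_a) == feature_vector(batch_b)
```

Because both batches produce the same vector, a clustering step run on such vectors would place them together regardless of the address they were sent from.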
24 Claims
1. A computer-implemented method for managing email users and email traffic of an email system, comprising:
- collecting usage data of email traffic handled by the email system;
- generating time series data from the collected usage data;
- analyzing the time series data;
- analyzing geographic data of the usage data;
- analyzing sending data of the usage data, the sending data comprising information related to the senders of the email traffic;
- analyzing content features of the email traffic;
- creating a plurality of feature vectors comprising indications of: the analyzed time series data, analyzed geographic data, analyzed sending data, and analyzed content features, the feature vectors including a first vector comprising an indication of a first feature and a second vector comprising an indication of a second feature; and
- performing cluster analysis on the plurality of feature vectors and clustering groups of vectors into a plurality of categories, the categories comprising a first clustered group of users corresponding to a first natural cluster in the usage data and a second clustered group of users corresponding to a second natural cluster in the usage data, the first clustered group comprising at least one first user associated with the first feature, the first feature corresponding to a center of the first clustered group, the second clustered group comprising at least one second user associated with the second feature, and the second feature corresponding to a center of the second clustered group.
Dependent claims: 2, 3, 4
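The steps recited in claim 1 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not name a clustering algorithm, so plain k-means is swapped in here, and the per-user fields (`send_hours`, `countries`, `mean_recipients`, `link_ratio`) and the sample values are hypothetical. Feature scaling is omitted for brevity.

```python
import math


def hourly_series(send_hours, n_hours=24):
    """Time series data: message count for each hour of the day."""
    series = [0] * n_hours
    for hour in send_hours:
        series[hour % n_hours] += 1
    return series


def feature_vector(user):
    """Combine time-series, geographic, sending, and content indications."""
    series = hourly_series(user["send_hours"])
    burstiness = max(series) / sum(series)  # share of mail sent in the peak hour
    return [burstiness, user["countries"], user["mean_recipients"], user["link_ratio"]]


def kmeans(vectors, initial_centers, iterations=10):
    """Plain Lloyd's k-means (an assumed choice); returns (labels, centers)."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(v, centers[c]))
            for v in vectors
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(len(centers)):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical usage data for four users.
users = {
    "alice":   {"send_hours": [9, 10, 14, 16], "countries": 1,  "mean_recipients": 1.5,  "link_ratio": 0.1},
    "bob":     {"send_hours": [8, 9, 13, 17],  "countries": 2,  "mean_recipients": 2.0,  "link_ratio": 0.2},
    "mallory": {"send_hours": [3, 3, 3, 3],    "countries": 12, "mean_recipients": 80.0, "link_ratio": 0.9},
    "trudy":   {"send_hours": [3, 3, 3, 4],    "countries": 10, "mean_recipients": 75.0, "link_ratio": 1.0},
}

vectors = [feature_vector(u) for u in users.values()]
labels, centers = kmeans(vectors, initial_centers=[vectors[0], vectors[-1]])
groups = dict(zip(users, labels))
```

In this toy data the two low-volume users fall into one group and the two bursty, high-recipient users into the other, and each entry of `centers` plays the role of the feature at the center of its clustered group. A production system would likely normalize each feature before measuring distance so that no single dimension dominates.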
5. A computer-implemented method for managing email users and email traffic of an email system, comprising:
- collecting usage data of individual and aggregate usage of the email system;
- generating time series data from the collected data;
- analyzing the time series data;
- analyzing geographic data of the individual and aggregate usage;
- analyzing sending data of the individual and aggregate usage;
- analyzing content features of the individual and aggregate usage;
- creating a plurality of feature vectors comprising indications of: the analyzed time series data, analyzed geographic data; and analyzed content features;
- performing cluster analysis on the plurality of feature vectors and clustering groups of vectors into categories;
- assigning a set of permissions and policies to each of the categories of clustered groups, wherein each category is assigned a set of permissions and policies and wherein there are two or more sets of permissions and policies; and
- applying a first set of permissions and policies to a first clustered group of users and applying a second set of permissions and policies to a second clustered group of users.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14
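The last two steps of claim 5, assigning a set of permissions and policies to each category and applying it to the users in that category, might look like the sketch below. The `PolicySet` fields, the limits, and the cluster labels are hypothetical; the claim does not say what a permission or policy contains, only that there are at least two distinct sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicySet:
    max_messages_per_hour: int   # hypothetical sending limit
    require_captcha: bool        # hypothetical extra verification step


# One policy set per category produced by the cluster analysis (labels assumed).
POLICIES = {
    0: PolicySet(max_messages_per_hour=500, require_captcha=False),
    1: PolicySet(max_messages_per_hour=20, require_captcha=True),
}


def apply_policies(user_labels, policies):
    """Map every user to the policy set assigned to the user's cluster."""
    return {user: policies[label] for user, label in user_labels.items()}


# Cluster label per user, as a clustering step might have produced.
user_labels = {"alice": 0, "bob": 0, "mallory": 1}
applied = apply_policies(user_labels, POLICIES)
```

Keying policies by cluster label rather than by user means that a user who is re-clustered after new usage data arrives picks up the other group's policy set without any per-user configuration.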
15. An email delivery and management system, configured to:
- collect usage data of email traffic handled by the email system;
- generate time series data from the collected usage data;
- analyze the time series data;
- analyze geographic data of the usage data;
- analyze sending data of the usage data, the sending data comprising information related to the senders of the email traffic;
- analyze content features of the email traffic;
- create a plurality of feature vectors comprising indications of: the analyzed time series data, analyzed geographic data, analyzed sending data, and analyzed content features; the feature vectors including a first vector comprising an indication of a first feature and a second vector comprising an indication of a second feature; and
- perform cluster analysis on the plurality of feature vectors and cluster groups of vectors into a plurality of categories, the categories comprising a first clustered group of users corresponding to a first natural cluster in the usage data and a second clustered group of users corresponding to a second natural cluster in the usage data, the first clustered group comprising at least one first user associated with the first feature, the first feature corresponding to a center of the first clustered group, the second clustered group comprising at least one second user associated with the second feature, and the second feature corresponding to a center of the second clustered group.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24
Specification