System for determining degrees of similarity in email message information
Abstract
Similarity of email message characteristics is used to detect bulk and spam email. A determination of “sameness” for purposes of both bulk and spam classifications can use any number and type of evaluation modules. Each module can include one or more rules, tests, processes, algorithms, or other functionality. For example, one type of module may be a word count of email message text. Another module can use a weighting factor based on groups of multiple words and their perceived meanings. In general, any type of module that performs a similarity analysis can be used. A preferred embodiment of the invention uses statistical analysis, such as Bayesian analysis, to measure the performance of different modules against a known standard, such as human manual matching. Modules that are performing worse than other modules can be valued less than modules having better performance. In this manner, a high degree of reliability can be achieved. To improve performance, if a message is determined to be the same as a previous message, the previous computations and results for that previous message can be re-used. Users can be provided with options to customize or regulate bulk and spam classification and subsequent actions on how to handle the classified email messages.
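As a rough illustration of the idea in the abstract, the following Python sketch combines two similarity modules into a single weighted "sameness" score. Everything named here is an assumption made for illustration, not the patent's implementation: the function names, the use of cosine similarity over word counts, adjacent word pairs as a stand-in for "groups of multiple words", and the 0.8 threshold are all hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def _words(text):
    """Lowercase word tokens; punctuation is ignored."""
    return re.findall(r"[a-z0-9']+", text.lower())


def _cosine(a, b):
    """Cosine similarity of two count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def word_count_module(msg, prior):
    """Hypothetical module: similarity of single-word counts."""
    return _cosine(Counter(_words(msg)), Counter(_words(prior)))


def word_group_module(msg, prior):
    """Hypothetical module: similarity of adjacent word pairs."""
    def pairs(text):
        w = _words(text)
        return Counter(zip(w, w[1:]))
    return _cosine(pairs(msg), pairs(prior))


MODULES = {"word_count": word_count_module, "word_group": word_group_module}


def sameness(msg, prior, weights):
    """Weighted average of the module outputs, in the range [0, 1]."""
    total = sum(weights[name] for name in MODULES)
    return sum(weights[name] * fn(msg, prior) for name, fn in MODULES.items()) / total


def classify(msg, prior_messages, weights, threshold=0.8):
    """Label a message 'bulk' if it is close enough to any prior message."""
    best = max((sameness(msg, p, weights) for p in prior_messages), default=0.0)
    return "bulk" if best >= threshold else "not_bulk"
```

With equal weights, a message that differs from a prior message only in case and punctuation scores a sameness of about 1.0 and is labeled "bulk", while a message sharing no words with any prior message scores 0.0.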
24 Claims
1. A method for classifying email messages, the method comprising:
using a plurality of modules to determine a level of sameness of a particular email message with one or more prior email messages, wherein the level of sameness is derived for the particular email message from a weighting of the outputs of the modules;
determining a performance level for each of the modules;
comparing performance levels;
adjusting a weighting of at least one module in response to comparing performance levels; and
using the level of sameness for the particular email message to classify the particular email message into a category.
22. An apparatus for classifying email messages, the apparatus comprising:
a processor for executing instructions included in a machine-readable medium, the machine-readable medium including one or more instructions for using a plurality of modules to determine a level of sameness of a particular email message with one or more prior email messages, wherein the level of sameness is derived for the particular email message from a weighting of the outputs of the modules;
one or more instructions for determining a performance level for each of the modules;
one or more instructions for comparing performance levels;
one or more instructions for adjusting a weighting of at least one module in response to comparing performance levels; and
one or more instructions for using the level of sameness for the particular email message to classify the particular email message into a category.
23. A machine-readable medium including instructions executable by a processor for classifying email messages, the machine-readable medium including:
one or more instructions for using a plurality of modules to determine a level of sameness of a particular email message with one or more prior email messages, wherein the level of sameness is derived for the particular email message from a weighting of the outputs of the modules;
one or more instructions for determining a performance level for each of the modules;
one or more instructions for comparing performance levels;
one or more instructions for adjusting a weighting of at least one module in response to comparing performance levels; and
one or more instructions for using the level of sameness for the particular email message to classify the particular email message into a category.
24. An apparatus for classifying email messages, the apparatus comprising:
means for using a plurality of modules to determine a level of sameness of a particular email message with one or more prior email messages, wherein the level of sameness is derived for the particular email message from a weighting of the outputs of the modules;
means for determining a performance level for each of the modules;
means for comparing performance levels;
means for adjusting a weighting of at least one module in response to comparing performance levels; and
means for using the level of sameness for the particular email message to classify the particular email message into a category.
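The determining, comparing, and adjusting steps recited in the independent claims can be sketched in Python as follows. This is only an illustrative reading under stated assumptions: performance is measured here as simple agreement with human "same / not same" labels (a stand-in for the statistical analysis, such as Bayesian analysis, that the abstract mentions), and the function names, the 0.5 agreement threshold, the 0.1 adjustment step, and the 0.6 category cutoff are all hypothetical.

```python
def performance_level(module_outputs, human_labels, threshold=0.5):
    """Fraction of message pairs on which a module agrees with the human label."""
    hits = sum((score >= threshold) == label
               for score, label in zip(module_outputs, human_labels))
    return hits / len(human_labels)


def adjust_weights(weights, performance, step=0.1):
    """Compare performance levels and shift weight from the worst module to the best."""
    best = max(performance, key=performance.get)
    worst = min(performance, key=performance.get)
    new_weights = dict(weights)
    if performance[best] > performance[worst]:
        moved = min(step, new_weights[worst])  # never drive a weight below zero
        new_weights[worst] -= moved
        new_weights[best] += moved
    return new_weights


def level_of_sameness(outputs, weights):
    """Weighted average of per-module outputs for one message."""
    total = sum(weights.values())
    return sum(weights[m] * outputs[m] for m in weights) / total


def categorize(outputs, weights, cutoff=0.6):
    """Use the level of sameness to place the message into a category."""
    return "bulk" if level_of_sameness(outputs, weights) >= cutoff else "not_bulk"
```

In this sketch a module that disagrees with the human labels more often than its peers gradually loses influence over the final score, which is one simple way to realize "valued less" as described in the abstract.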
Specification