Incremental anti-spam lookup and update service
Abstract
The present invention provides a unique system and method that facilitates incrementally updating spam filters in near real time or real time. Incremental updates can be generated in part by difference learning. Difference learning involves training a new spam filter based on new data and then looking for the differences between the new spam filter and the existing spam filter. Differences can be determined at least in part by comparing the absolute values of parameter changes (weight changes of a feature between the two filters). Other factors, such as the frequency of parameters, can be employed as well. In addition, available updates with respect to particular features or messages can be looked up using one or more lookup tables or databases. When incremental and/or feature-specific updates are available, they can be downloaded, for example by a client. Incremental updates can be provided automatically or on request, according to client or server preferences.
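The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes a filter is represented as a plain mapping from feature name to weight, that a feature missing from a filter has weight 0, and that "satisfying the threshold" means the absolute weight change is at least the threshold. The function names `weight_differences` and `apply_update` are hypothetical and do not come from the patent.

```python
def weight_differences(old_weights, new_weights, threshold):
    """Return {feature: new_weight} for features whose weight changed enough.

    Assumption: a feature absent from a filter has weight 0.0, and a change
    counts as significant when its absolute value is >= threshold.
    """
    diffs = {}
    for feature in set(old_weights) | set(new_weights):
        old = old_weights.get(feature, 0.0)
        new = new_weights.get(feature, 0.0)
        if abs(new - old) >= threshold:
            diffs[feature] = new
    return diffs


def apply_update(old_weights, diffs):
    """Return a copy of the existing filter with the selected weights replaced."""
    updated = dict(old_weights)
    updated.update(diffs)
    return updated
```

Only the entries returned by `weight_differences` would need to be shipped to a client, which is what keeps such an update small relative to sending a whole new filter.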
196 Citations
41 Claims
1. A computer-implemented anti-spam update system comprising following components stored in a computer memory:
- a spam filter trained to distinguish between spam and good messages;
- an update component that incrementally augments or replaces at least a portion of the spam filter with updated information to facilitate spam prevention, the update component is built at least in part by using a machine learning component; and
- the machine learning component trains a first new filter using data extracted from one or more newly received messages, determines differences between the first new filter and the spam filter that satisfy a threshold value, trains a second new filter constrained to maintain weights of features of the spam filter corresponding to differences between the first new filter and the spam filter that did not satisfy the threshold value, and determines differences between the spam filter and the second new filter which satisfy one or more thresholds for augmenting or replacing at least a portion of the spam filter, wherein the updated information utilized by the update component is based at least in part on the determined differences between the spam filter and the second new filter.
Dependent claims: 2-15
16. A computer-implemented anti-spam query system comprising following components stored in a computer memory:
- a machine learning spam filter trained to distinguish between spam and good messages;
- a lookup component that receives queries for feature-related information as a message arrives to facilitate updating the spam filter, the lookup component is built at least in part by using a lookup database; and
- the lookup database trains a first new filter using one or more features extracted from one or more recently received messages, determines differences between the first new filter and the spam filter that satisfy a threshold value, trains a second new filter constrained to maintain weights of features of the spam filter that did not satisfy the threshold value, and determines differences between the spam filter and the second new filter which satisfy one or more thresholds for augmenting or replacing at least a portion of the spam filter, wherein the lookup component updates the spam filter based at least in part on the determined differences between the spam filter and the second new filter.
Dependent claims: 17-28
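As an illustration of the lookup idea in claim 16, the sketch below shows one way a server-side table could answer feature-level queries as a message arrives. It is an assumption-laden toy rather than the patented design: the class `UpdateLookup`, its `publish` and `query` methods, and the integer version counter are all hypothetical names and mechanisms introduced here.

```python
class UpdateLookup:
    """Toy lookup table mapping a feature to its latest published weight.

    Assumption: each published batch of differences gets an increasing
    integer version, and a client reports the last version it applied.
    """

    def __init__(self):
        self._table = {}   # feature -> (version, weight)
        self._version = 0

    def publish(self, diffs):
        """Record a batch of weight differences and return its version."""
        self._version += 1
        for feature, weight in diffs.items():
            self._table[feature] = (self._version, weight)
        return self._version

    def query(self, features, client_version):
        """Return updated weights, newer than client_version, for the given features."""
        return {
            f: self._table[f][1]
            for f in features
            if f in self._table and self._table[f][0] > client_version
        }
```

A client holding version 0 that receives a message containing the features `free` and `hello` would call `query(["free", "hello"], 0)` and merge whatever weights come back into its local filter.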
29. A computer-implemented anti-spam update method comprising the following operations to build filters which minimize a number of differences:
- providing an existing trained spam filter stored in a computer memory;
- discriminatively training a first new spam filter using machine learning and data from one or more new messages;
- determining a first set of differences between the existing spam filter and the first new spam filter that satisfy a threshold or heuristic;
- training a second new spam filter using the new message data subject to a constraint that parameter changes between the first new spam filter and the existing spam filter that did not satisfy the threshold or heuristic have same value in the second new filter and the existing filter;
- determining a second set of differences between the second new spam filter and the existing spam filter; and
- incrementally updating the existing spam filter with at least a subset of the second set of the differences.
Dependent claims: 30-38
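The operations of claim 29 can be sketched end to end under some simplifying assumptions. The sketch uses a small logistic regression trained by stochastic gradient ascent as the discriminative learner, represents each message as a tuple of binary features, starts both training passes from the existing weights, and reuses a single threshold for both difference sets. None of these choices is stated in the claim, and the function names and hyperparameters are hypothetical.

```python
import math


def train_logistic(messages, labels, init_weights, frozen=frozenset(),
                   epochs=200, lr=0.5):
    """Train logistic-regression weights over binary features.

    Weights of features listed in `frozen` are never modified, which is how
    this sketch models the constraint on the second filter.
    """
    weights = dict(init_weights)
    for _ in range(epochs):
        for features, label in zip(messages, labels):
            score = sum(weights.get(f, 0.0) for f in features)
            score = max(-30.0, min(30.0, score))  # avoid overflow in exp
            prob = 1.0 / (1.0 + math.exp(-score))
            error = label - prob                  # gradient of log-likelihood
            for f in features:
                if f not in frozen:
                    weights[f] = weights.get(f, 0.0) + lr * error
    return weights


def incremental_update(existing, messages, labels, threshold):
    """Return the weight changes to ship, following the two-pass scheme."""
    # Pass 1: unconstrained training on the new messages.
    first = train_logistic(messages, labels, existing)
    all_features = set(existing) | set(first)
    # Features whose change did NOT satisfy the threshold stay pinned.
    pinned = {f for f in all_features
              if abs(first.get(f, 0.0) - existing.get(f, 0.0)) < threshold}
    # Pass 2: retrain with the pinned features held at their existing values.
    second = train_logistic(messages, labels, existing, frozen=pinned)
    # Second set of differences: only changes that satisfy the threshold.
    return {f: w for f, w in second.items()
            if abs(w - existing.get(f, 0.0)) >= threshold}
```

Retraining with the small changes pinned lets the remaining weights compensate for the features that were not allowed to move, so the shipped update stays small without simply discarding the minor changes from the first pass.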
39. A computer-implemented anti-spam update system comprising following components stored in a computer memory:
- means for providing an existing trained spam filter;
- means for discriminatively training a first new spam filter using machine learning and new data;
- means for determining a first set of differences between the existing spam filter and the first new spam filter that satisfy a threshold or heuristic;
- means for training a second new spam filter using the new data subject to a constraint that parameter changes between the first new spam filter and the existing spam filter that did not satisfy the threshold or heuristic have same values in the second filter as they do in the existing filter;
- means for determining a second set of differences between the second new spam filter and the existing spam filter; and
- means for incrementally updating the existing spam filter with at least a subset of the second set of the differences.
Dependent claims: 40
41. A computer readable medium having stored thereon computer-executable code for facilitating incremental updates to spam filters comprising:
- code for obtaining information associated with a first set of differences resulting from comparing an existing spam filter to a first newly trained spam filter;
- code for comparing absolute values of the first set of differences to one or more threshold values;
- code for training a second newly trained spam filter such that elements of the second newly trained spam filter respectively corresponding to differences in the first set of differences having absolute values less than the threshold values are unchanged from the existing filter;
- code for obtaining information regarding a second set of differences associated with comparing the existing filter and the second newly trained filter; and
- code for updating the existing filter based on the second set of differences.
Specification