ORIGINATION/DESTINATION FEATURES AND LISTS FOR SPAM PREVENTION
Abstract
The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with origination information as well as other information embedded in the body of the message that allows a recipient of the message to contact and/or respond to the sender of the message can be extracted as features. The features, or a subset thereof, can be normalized and/or deobfuscated prior to being employed as features of the machine learning systems. The (deobfuscated) features can be employed to populate a plurality of feature lists that facilitate spam detection and prevention. Exemplary features include an email address, an IP address, a URL, an embedded image pointing to a URL, and/or portions thereof.
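The kinds of features named in the abstract (an email address, an IP address, a URL) can be illustrated with a minimal sketch. This is not the patented implementation: the function name `extract_features`, the feature-string prefixes, and the two regular expressions are hypothetical choices made for illustration, and normalization here is limited to lowercasing host and domain names.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from urllib.parse import urlsplit

# Illustrative patterns only; real messages need more robust parsing.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_features(raw_message: str) -> set[str]:
    """Collect origination and contact features from a raw message (hypothetical helper)."""
    msg = message_from_string(raw_message)
    features: set[str] = set()

    # Origination information: IP addresses recorded in Received headers.
    for received in msg.get_all("Received", []):
        for ip in IPV4_RE.findall(received):
            features.add(f"received_ip:{ip}")

    # Origination information: domain of the claimed sender, lowercased.
    _, address = parseaddr(msg.get("From", ""))
    if "@" in address:
        features.add(f"from_domain:{address.rsplit('@', 1)[1].lower()}")

    # Contact information in the body: hosts of embedded URLs.
    payload = msg.get_payload()
    body = payload if isinstance(payload, str) else ""  # skip multipart in this sketch
    for url in URL_RE.findall(body):
        host = urlsplit(url).hostname  # hostname is returned lowercased
        if host:
            features.add(f"url_host:{host}")

    return features
```

Each feature is emitted as a prefixed string so that the same value seen in different roles (for example, a domain in the From line versus in a URL) remains a distinct feature.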
20 Claims
1. A system that facilitates extracting data in connection with spam processing, comprising:
a component that receives an item and extracts a set of features associated with an origination of a message or part thereof and/or information that enables an intended recipient to contact, respond or receive in connection with the message;
and a component that employs a subset of the extracted features in connection with building a filter, wherein the filter determines a probability that the message is spam.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.
15. A method that facilitates extracting data in connection with spam processing, comprising:
receiving a message;
extracting a set of features associated with an origination of the message or part thereof and/or information that enables an intended recipient to contact, respond or receive in connection with the message; and
employing a subset of the extracted features in connection with building a filter, wherein the filter determines a probability of the message being spam.
Dependent claims: 16, 17, 18, 19.
20. A system that facilitates extracting data in connection with spam processing, comprising:
a means for receiving a message;
a means for extracting a set of features associated with an origination of the message or part thereof and/or information that enables an intended recipient to contact, respond or receive in connection with the message; and
a means for employing a subset of the extracted features in connection with building a filter, wherein the filter determines a probability of the message being spam.
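The final step recited in the independent claims, building a filter that determines a probability that a message is spam, can be sketched as follows. The claims do not name a learning algorithm; this sketch assumes a naive Bayes model over binary features with add-one smoothing, chosen only because it is compact. The class name `FeatureFilter` and its methods are hypothetical, and for brevity the score uses only the features present in a message.

```python
import math
from collections import Counter


class FeatureFilter:
    """Toy naive Bayes filter over sets of extracted features (illustrative assumption)."""

    def __init__(self) -> None:
        # Per-class count of training messages containing each feature.
        self.counts = {True: Counter(), False: Counter()}
        # Number of training messages seen per class (True = spam).
        self.totals = {True: 0, False: 0}

    def train(self, features, is_spam: bool) -> None:
        self.totals[is_spam] += 1
        self.counts[is_spam].update(set(features))

    def spam_probability(self, features) -> float:
        # Smoothed prior log-odds of spam versus non-spam.
        log_odds = math.log((self.totals[True] + 1) / (self.totals[False] + 1))
        for feature in set(features):
            # Add-one smoothed estimate of P(feature present | class).
            p_spam = (self.counts[True][feature] + 1) / (self.totals[True] + 2)
            p_ham = (self.counts[False][feature] + 1) / (self.totals[False] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Convert log-odds to a probability.
        return 1.0 / (1.0 + math.exp(-log_odds))
```

In use, each training message would first be reduced to a set of feature strings, and `train` would be called once per labeled message; `spam_probability` then returns a value between 0 and 1 for a new message's features.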
Specification