Origination/destination features and lists for spam prevention
Abstract
The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with origination information as well as other information embedded in the body of the message that allows a recipient of the message to contact and/or respond to the sender of the message can be extracted as features. The features, or a subset thereof, can be normalized and/or deobfuscated prior to being employed as features of the machine learning systems. The (deobfuscated) features can be employed to populate a plurality of feature lists that facilitate spam detection and prevention. Exemplary features include an email address, an IP address, a URL, an embedded image pointing to a URL, and/or portions thereof.
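The extraction and normalization described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the regular expressions, the function names `normalize_url` and `extract_contact_features`, and the choice of lowercasing plus percent-decoding as the normalization step are all assumptions made for illustration.

```python
import re
from urllib.parse import unquote, urlsplit

# Illustrative patterns only (assumed, not from the patent); real mail needs
# far more robust MIME and HTML parsing.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IMG_SRC_RE = re.compile(r"<img[^>]+src=[\"']([^\"']+)[\"']", re.IGNORECASE)


def normalize_url(url):
    """Lowercase scheme and host and percent-decode the path.

    The query string and fragment are dropped in this sketch.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URL (for example an unbalanced IPv6 bracket): fall back.
        return url.lower()
    host = (parts.hostname or "").lower()
    return f"{parts.scheme.lower()}://{host}{unquote(parts.path)}"


def extract_contact_features(body):
    """Return a set of (feature_type, value) pairs found in a message body."""
    features = set()
    for address in EMAIL_RE.findall(body):
        features.add(("email", address.lower()))
    for url in IMG_SRC_RE.findall(body):
        features.add(("image_url", normalize_url(url)))
    # Note: URLs inside <img> tags are also matched here, so an embedded
    # image contributes both an "image_url" and a "url" feature.
    for url in URL_RE.findall(body):
        features.add(("url", normalize_url(url)))
    return features
```

Normalizing before the features reach the learner means that trivially obfuscated variants such as `http://EXAMPLE.com/%62uy` and `http://example.com/buy` collapse to the same feature value.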
50 Claims
1. A system implemented on one or more computers that facilitates extracting data in connection with spam processing, comprising:
- a component implemented on one or more processors that receives an item and extracts a set of features associated with an origination of a message or part thereof and/or information that enables an intended recipient to contact, respond or receive in connection with the message, wherein the set of features comprises a host name and a domain name; and
- a component that employs a subset of the extracted features in connection with building a filter, wherein the filter is at least one of stored on a computer readable storage medium, displayed on a display device, or employed by a component executing on one or more processors.
Dependent claims: 2 to 29.
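As an illustration of the host name and domain name features recited in claim 1, the following Python sketch derives both from a URL. The function name `host_and_domain_features` is hypothetical, and the rule that the domain name is the last two labels of the host name is a simplifying assumption that the claim does not state; it is wrong for suffixes such as `co.uk`, where a public-suffix list would be needed.

```python
import ipaddress
from urllib.parse import urlsplit


def host_and_domain_features(url):
    """Return host-name and domain-name features for a URL.

    Returns an empty list when the URL has no host part.
    """
    host = (urlsplit(url).hostname or "").rstrip(".")
    if not host:
        return []
    try:
        # A literal IP address has a host feature but no domain name.
        ipaddress.ip_address(host)
        return [("host", host)]
    except ValueError:
        pass
    labels = host.split(".")
    # ASSUMPTION: treat the last two labels as the domain name.
    domain = ".".join(labels[-2:])
    return [("host", host), ("domain", domain)]
```

Emitting both features lets a filter generalize from one host (`www.cheap-offers.example`) to others under the same domain (`cheap-offers.example`).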
30. A method that facilitates extracting data in connection with spam processing, comprising:
- receiving a message;
- extracting a set of features associated with an origination of the message or part thereof and/or information that enables an intended recipient to contact, respond or receive in connection with the message, wherein the set of features comprises at least a portion of an IP address; wherein extracting at least a portion of the IP address comprises performing at least one of consulting a block ID directory to determine at least one block ID corresponding to the IP address such that the block ID is extracted as an additional feature or extracting each of at least a first 1 bit up to a first 31 bits from the IP address; and
- employing a subset of the extracted features in connection with building a filter.
Dependent claims: 31 to 49.
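The second alternative in claim 30, extracting each of the first 1 bit up to the first 31 bits of the IP address, can be sketched in Python as follows. This is a minimal illustration limited to IPv4; the function name `ip_prefix_features` and the CIDR-style string encoding of each prefix are assumptions, and the block ID directory alternative is not shown.

```python
import ipaddress


def ip_prefix_features(address):
    """Return one feature for each prefix length 1..31 of an IPv4 address."""
    value = int(ipaddress.IPv4Address(address))
    features = []
    for bits in range(1, 32):
        shift = 32 - bits
        # Keep the first `bits` bits and zero the remaining low-order bits.
        prefix = (value >> shift) << shift
        # ASSUMPTION: encode each prefix as "network/length" text.
        features.append(f"{ipaddress.IPv4Address(prefix)}/{bits}")
    return features
```

For example, `ip_prefix_features("192.0.2.77")` yields 31 features, including `192.0.0.0/8` and `192.0.2.0/24`, so a learner can weight whichever prefix length best separates spam sources from legitimate ones.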
50. A system that facilitates extracting data in connection with spam processing, comprising:
- a means for receiving a message;
- a means for extracting a set of features associated with an origination of the message or part thereof and/or information that enables an intended recipient to contact, respond or receive in connection with the message, wherein the set of features comprises a host name and a domain name; and
- a means for employing a subset of the extracted features in connection with building a filter.
Specification