System and method for detecting phishers by analyzing website referrals
First Claim
1. A computer-implemented method of facilitating detection of phishing-related websites from among a plurality of referring websites, the method comprising:
accessing a referral list of websites that link to a legitimate website;
calculating statistical outliers within the referral list of websites based on historical patterns to produce a dataset of suspect websites from the referral list of websites;
creating an array of relevant points wherein each relevant point corresponds to a defined HTML tag to construct a referring site fingerprint for each of the suspect websites in the dataset based on content of a suspect website;
comparing each referring site fingerprint to a fingerprint for the legitimate website by determining the number of matches between the array of relevant points and a second array forming the fingerprint for the legitimate website to calculate a relevance score using a percentage match between the referring site fingerprint and the fingerprint for the legitimate website, the relevance score indicating a likelihood that the suspect website is a phishing-related website; and
presenting the relevance score for each of the suspect websites.
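The fingerprinting and comparison steps of the claim can be illustrated with a minimal sketch. The claim does not enumerate the defined HTML tags or say what a relevant point records, so the tag set in `TRACKED_TAGS`, the choice of one sampled attribute per tag, and the function names `fingerprint` and `relevance_score` are all assumptions made for illustration. The percentage here is taken over the legitimate website's points, which is one plausible reading of "percentage match", not the only one.

```python
from html.parser import HTMLParser

# Assumed "defined HTML tags" and the attribute sampled for each one.
# The claim does not list them; these are illustrative choices only.
TRACKED_TAGS = {"form": "action", "input": "name", "img": "src", "link": "href"}


class FingerprintParser(HTMLParser):
    """Collects one relevant point per occurrence of a tracked tag."""

    def __init__(self):
        super().__init__()
        self.points = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag in TRACKED_TAGS:
            value = dict(attrs).get(TRACKED_TAGS[tag]) or ""
            self.points.append((tag, value))


def fingerprint(html):
    """Return the array of relevant points for a page's HTML."""
    parser = FingerprintParser()
    parser.feed(html)
    parser.close()
    return parser.points


def relevance_score(suspect_fp, legit_fp):
    """Percentage of the legitimate site's points also found in the suspect."""
    if not legit_fp:
        return 0.0
    suspect_points = set(suspect_fp)
    matches = sum(1 for point in legit_fp if point in suspect_points)
    return 100.0 * matches / len(legit_fp)
```

Under these assumptions, a page that copies a login form's fields and logo but posts to a different address still matches most of the legitimate site's points, so it receives a high score.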
Abstract
System and method for identifying phishers by analyzing website referrals. A combination of statistical analysis and fingerprinting can be used to assign a relevance score to a referring website that indicates the likelihood that the referring website is a phishing-related website. A fingerprint as used herein with respect to example embodiments is an array of relevant points corresponding to defined HTML tags. The relevance score can be determined at least in part by comparing the fingerprint of a suspect website with that of a base website. The number of matches in relevant points between the two websites determines the relevance score. Provisions can be made for displaying, reporting, and tracking relevance scores so that appropriate actions can be taken as phishing is detected. Additionally, a known-good list of websites can be used to reduce the number of false positives.
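The statistical analysis and the known-good list mentioned in the abstract might be combined along the following lines. The abstract does not say which statistic is used, so this sketch assumes that the historical pattern is a list of past daily referral counts per site and that an outlier is a count more than `z_threshold` standard deviations from the historical mean. The function name `suspect_referrers`, its parameters, and the rule that a site with no usable history is treated as suspect are hypothetical.

```python
from statistics import mean, pstdev


def suspect_referrers(today_hits, history, known_good=frozenset(), z_threshold=3.0):
    """Return referrers whose hit count today is an outlier against history.

    today_hits: dict mapping site -> referral count for the current period.
    history:    dict mapping site -> list of past referral counts (assumed shape).
    known_good: sites skipped outright to reduce false positives.
    """
    suspects = []
    for site, hits in today_hits.items():
        if site in known_good:
            continue
        past = history.get(site, [])
        if len(past) < 2:
            # No baseline to compare against: treated as suspect (an assumption).
            suspects.append(site)
            continue
        mu, sigma = mean(past), pstdev(past)
        if sigma == 0:
            # Perfectly flat history: any change counts as an outlier.
            if hits != mu:
                suspects.append(site)
        elif abs(hits - mu) / sigma > z_threshold:
            suspects.append(site)
    return suspects
```

Only the sites returned here would go on to be fingerprinted, which keeps the more expensive page comparison limited to a small dataset.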
18 Claims
1. The computer-implemented method set out in full under First Claim above.
Dependent claims: 2, 3, 4
5. A computer program product including at least one of a magnetic, optical and semiconductor computer-readable storage medium comprising a computer program for facilitating detection of phishing-related websites from among a plurality of referring websites, the computer program further comprising:
instructions for accessing a referral list of websites that link to a legitimate website;
instructions for calculating statistical outliers within the referral list of websites based on historical patterns to produce a dataset of suspect websites from the referral list of websites;
instructions for creating an array of relevant points wherein each relevant point corresponds to a defined HTML tag to construct a referring site fingerprint for a suspect website, the referring site fingerprint based on content of the suspect website;
instructions for comparing the referring site fingerprint to a fingerprint for the legitimate website by determining the number of matches between the array of relevant points and a second array forming the fingerprint for the legitimate website to calculate a relevance score using a percentage match between the referring site fingerprint and the fingerprint for the legitimate website, the relevance score indicating a likelihood that the suspect website is a phishing-related website; and
instructions for presenting the relevance score for each of the suspect websites.
Dependent claims: 6, 7, 8
9. Apparatus for facilitating detection of phishing-related websites from among a plurality of referring websites, the apparatus comprising:
means for accessing a referral list of websites that link to a legitimate website;
means for calculating statistical outliers within the referral list of websites based on historical patterns to produce a dataset of suspect websites from the referral list of websites;
means for creating an array of relevant points wherein each relevant point corresponds to a defined HTML tag to construct a referring site fingerprint for a suspect website, the referring site fingerprint based on content of the suspect website;
means for comparing each referring site fingerprint to a fingerprint for the legitimate website by determining the number of matches between the array of relevant points and a second array forming the fingerprint for the legitimate website to calculate a relevance score using a percentage match between the referring site fingerprint and the fingerprint for the legitimate website, the relevance score indicating a likelihood that the suspect website is a phishing-related website; and
means for presenting the relevance score for each of the suspect websites.
Dependent claims: 10, 11, 12
13. A system for facilitating detection of phishing-related websites from among a plurality of referring websites, the system comprising:
a data reduction function to access a referral log of websites that link to a legitimate website and to discard known good websites;
a data repository to store information on historical patterns of website access;
a data qualification function linked to the data repository and to the data reduction function, the data qualification function to compute statistical outliers from the referral log to produce a dataset of suspect websites; and
a prioritization and comparison function linked to the data reduction function and the data qualification function to construct a referring site fingerprint for a suspect website and to compare the referring site fingerprint to a fingerprint for the legitimate website by determining the number of matches between the array of relevant points and a second array forming the fingerprint for the legitimate website to calculate a relevance score using a percentage match between the referring site fingerprint and the fingerprint for the legitimate website and to present the relevance score for each of the suspect websites indicating a likelihood that the suspect website is a phishing-related website.
Dependent claims: 14, 15, 16, 17, 18
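The way the functions of the system claim link together can be sketched as a small pipeline. Everything named here is hypothetical: `run_pipeline`, `percentage_match`, the `is_outlier` callable standing in for the data qualification function and its repository lookup, and the `fetch_fingerprint` callable standing in for fingerprint construction. The sketch also reads "prioritization" as ordering suspects by descending relevance score, which is an assumption rather than something the claim states.

```python
from collections import Counter


def percentage_match(suspect_fp, legit_fp):
    """Percentage of the legitimate fingerprint's points found in the suspect's."""
    if not legit_fp:
        return 0.0
    suspect_points = set(suspect_fp)
    matches = sum(1 for point in legit_fp if point in suspect_points)
    return 100.0 * matches / len(legit_fp)


def run_pipeline(referral_log, known_good, is_outlier, fetch_fingerprint, legit_fp):
    """Wire the claimed functions together and return (site, score) pairs.

    referral_log:      iterable of referring site names, one per referral.
    known_good:        set of sites discarded by the data reduction step.
    is_outlier:        callable (site, hits) -> bool; assumed to consult the
                       data repository of historical access patterns.
    fetch_fingerprint: callable site -> list of relevant points.
    legit_fp:          list of relevant points for the legitimate website.
    """
    # Data reduction: count referrals per site, discarding known good sites.
    counts = Counter(site for site in referral_log if site not in known_good)
    # Data qualification: keep only statistical outliers as suspects.
    suspects = [site for site, hits in counts.items() if is_outlier(site, hits)]
    # Prioritization and comparison: score each suspect, highest score first.
    scored = [(site, percentage_match(fetch_fingerprint(site), legit_fp))
              for site in suspects]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Passing the outlier test and the fingerprint lookup in as callables keeps each claimed function replaceable on its own, so a different statistic or tag set can be swapped in without touching the wiring.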
Specification