Automatically verifying that anti-phishing URL signatures do not fire on legitimate web sites
First Claim
1. A method implemented in a computer system for detecting false positives among a plurality of search patterns of web sites that include illegitimate content comprising:
accessing a first page of a legitimate web site;
obtaining all links included in the first page;
for each link included in the first page that points to a page on the web site, determining whether the link matches at least one of the plurality of search patterns;
for each link that matches the search pattern, indicating that the search pattern is a false positive; and
for each link that points to a page on the web site, recursively:
accessing a page pointed to by the link;
obtaining all links in the page;
for each link included in the page that points to a page on the web site, determining whether the link matches at least one of the plurality of search patterns; and
for each link that matches the search pattern, indicating that the search pattern is a false positive;
wherein each search pattern is a regular expression.
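The recursive steps of claim 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: `find_false_positives` and `fetch_links` are hypothetical names, the site is simulated by an in-memory dictionary rather than real HTTP requests, "points to a page on the web site" is approximated by comparing host names, and the `visited` set is an addition (not recited in the claim) so that the recursion terminates when pages link to each other.

```python
import re
from urllib.parse import urljoin, urlparse


def find_false_positives(start_url, patterns, fetch_links):
    """Return the search patterns that match any on-site link of a legitimate site.

    fetch_links(url) is a hypothetical callable that returns the raw href
    values found on the page at url.
    """
    site = urlparse(start_url).netloc
    compiled = [re.compile(p) for p in patterns]
    false_positives = set()
    visited = set()  # assumption: avoid revisiting pages so cycles terminate

    def visit(page_url):
        if page_url in visited:
            return
        visited.add(page_url)
        for href in fetch_links(page_url):
            link = urljoin(page_url, href)  # resolve relative links
            if urlparse(link).netloc != site:
                continue  # skip links that leave the legitimate site
            for rx in compiled:
                if rx.search(link):
                    # A pattern that fires on a legitimate link is a false positive.
                    false_positives.add(rx.pattern)
            visit(link)  # recurse into the linked page

    visit(start_url)
    return false_positives


# Simulated legitimate site: page URL -> hrefs found on that page (made-up data).
SITE = {
    "https://bank.example/": ["/login", "/about", "https://other.example/x"],
    "https://bank.example/login": ["/secure/update-account.php", "/"],
    "https://bank.example/about": [],
    "https://bank.example/secure/update-account.php": [],
}
patterns = [r"update-account\.php", r"paypa1\.com"]
result = find_false_positives(
    "https://bank.example/", patterns, lambda url: SITE.get(url, [])
)
print(result)
```

For a large site, an explicit work queue would avoid Python's recursion limit; the recursive form is kept here only because it mirrors the wording of the claim.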
Abstract
A method and computer program product reduce or prevent false positives, in which legitimate web site content triggers matches against phishing black lists, while saving time and cost compared with manual review of the black lists. A method implemented in a computer system for detecting false positives among a plurality of search patterns of web sites that include illegitimate content comprises: accessing a first page of a legitimate web site; obtaining all links included in the first page; for each link included in the first page that points to a page on the web site, determining whether the link matches at least one of the plurality of search patterns; and, for each link that matches the search pattern, indicating that the search pattern is a false positive.
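As a rough illustration of the single-page step summarized in the abstract (obtain the links on a page, keep those that point to the same site, and test each against the search patterns), the following Python sketch uses only the standard library. The names `LinkCollector` and `check_page`, the sample HTML, and the sample patterns are all hypothetical, and "same site" is again approximated by comparing host names.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def check_page(page_url, html, patterns):
    """Return (pattern, link) pairs for on-site links that match a pattern."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    hits = []
    for href in collector.hrefs:
        link = urljoin(page_url, href)  # resolve relative links
        if urlparse(link).netloc != site:
            continue  # off-site links are not evidence of a false positive
        for pattern in patterns:
            if re.search(pattern, link):
                hits.append((pattern, link))
    return hits


# Made-up page content and patterns, for illustration only.
html = (
    '<a href="/signin/verify.html">Sign in</a> '
    '<a href="http://evil.example/signin/verify.html">Mirror</a> '
    '<a href="/help">Help</a>'
)
patterns = [r"/signin/verify\.html$", r"\.ru/"]
hits = check_page("http://shop.example/index.html", html, patterns)
print(hits)
```

Each returned pair names a pattern that fired on a link of the legitimate site, which is the condition the abstract describes as a false positive.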
19 Claims
1. A method implemented in a computer system for detecting false positives among a plurality of search patterns of web sites that include illegitimate content comprising:
accessing a first page of a legitimate web site;
obtaining all links included in the first page;
for each link included in the first page that points to a page on the web site, determining whether the link matches at least one of the plurality of search patterns;
for each link that matches the search pattern, indicating that the search pattern is a false positive; and
for each link that points to a page on the web site, recursively:
accessing a page pointed to by the link;
obtaining all links in the page;
for each link included in the page that points to a page on the web site, determining whether the link matches at least one of the plurality of search patterns; and
for each link that matches the search pattern, indicating that the search pattern is a false positive;
wherein each search pattern is a regular expression.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer program product for detecting false positives among a plurality of search patterns of web sites that include illegitimate content comprising:
a computer readable storage medium;
computer program instructions, recorded on the computer readable storage medium, executable by a processor, for performing the steps of:
accessing a first page of a web site;
obtaining all links included in the first page;
for each link pointing to a page on the web site, determining whether the link matches at least one search pattern;
for each link that matches the search pattern, indicating that the search pattern is a false positive; and
for each link that points to a page on the web site, recursively:
accessing a page pointed to by the link;
obtaining all links in the page;
for each link included in the page that points to a page on the web site, determining whether the link matches at least one of the plurality of search patterns; and
for each link that matches the search pattern, indicating that the search pattern is a false positive;
wherein each search pattern is a regular expression.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
Specification