Anti-spam tool for browser
First Claim
1. A method for resisting spam webpages on a computing device installed with a web browser, the method comprising:
receiving at the web browser a URL of a webpage;
determining by a spam detection module installed on the computing device whether the webpage is spam by comparing the URL of the webpage with a spam list including spam URLs, the spam list being created by;
dividing the spam URLs of the spam list into a plurality of sub chunks of spam URLs;
indexing the spam URLs into a first level index and a second level index, the first level index maps a first set of hash values to ranges of sub chunks of spam URLs, and the second level index maps a second set of hash values to the remaining sub chunks of spam URLs in the plurality of sub chunks;
the first set of hash values are created using a first hash function and the second set of hash values are created using a second hash function; and
performing an anti-spam action on the computing device if the webpage is determined to be spam;
wherein comparing the URL of the webpage with the spam comprises;
computing a hash value for the URL of the webpage using a hash function; and
matching the hash value of the webpage with the set of hash values of the spam URLs;
wherein the spam list is further created by;
computing the first set of hash values and the second set of hash values;
sorting the spam URLs by their computed hash values;
wherein each sub chunk having a sequential range of hash values defined by a lower bound and an upper bound.
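The list-creation steps recited above (hash, sort, divide into sub chunks bounded by a lower and an upper hash value) can be illustrated with a minimal sketch. It is not the patented implementation. The names `url_hash`, `SubChunk`, `build_sub_chunks` and `is_spam`, the choice of a truncated SHA-256 as the hash function, and the fixed `chunk_size` are all assumptions made for illustration; the claim does not name a particular hash function or chunk size.

```python
import hashlib
from bisect import bisect_right
from dataclasses import dataclass


def url_hash(url: str) -> int:
    """Assumed hash function: first 8 bytes of SHA-256 as an unsigned integer."""
    return int.from_bytes(hashlib.sha256(url.encode("utf-8")).digest()[:8], "big")


@dataclass
class SubChunk:
    lower: int    # smallest hash value stored in this sub chunk
    upper: int    # largest hash value stored in this sub chunk
    hashes: list  # sorted hash values of the spam URLs in this sub chunk


def build_sub_chunks(spam_urls, chunk_size=2):
    """Hash the spam URLs, sort by hash, and cut the sorted run into sub chunks."""
    hashes = sorted({url_hash(u) for u in spam_urls})
    chunks = []
    for i in range(0, len(hashes), chunk_size):
        part = hashes[i:i + chunk_size]
        chunks.append(SubChunk(lower=part[0], upper=part[-1], hashes=part))
    return chunks


def is_spam(url, chunks):
    """Hash the visited URL and match it against the one sub chunk whose range could hold it."""
    h = url_hash(url)
    lowers = [c.lower for c in chunks]
    i = bisect_right(lowers, h) - 1  # last sub chunk whose lower bound is <= h
    if i < 0:
        return False
    chunk = chunks[i]
    return h <= chunk.upper and h in chunk.hashes
```

Because the hash values are sorted before being divided, the sub chunk ranges do not overlap, so a lookup only has to inspect one sub chunk rather than the whole list.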
2 Assignments
0 Petitions
Abstract
An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed or whether a link contained in the webpage being accessed is spam, by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.
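The client-side flow the abstract describes (a locally stored list, refreshed from time to time, checked against both the page URL and the links on the page) might look roughly like the sketch below. Everything named here is hypothetical: `SpamDetector`, `normalize`, `on_page_load`, the `fetch_list` callback standing in for the download from the remote server, and the returned dictionary standing in for an unspecified anti-spam action. The URL normalization step is also an assumption, not something the abstract states.

```python
from typing import Callable, Iterable
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Assumed normalization: lowercase scheme and host, drop the fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


class SpamDetector:
    """Holds a local copy of the spam list and answers membership queries."""

    def __init__(self, fetch_list: Callable[[], Iterable[str]]):
        self._fetch_list = fetch_list  # hypothetical stand-in for the server download
        self._spam = set()

    def update(self) -> None:
        """Refresh the locally stored list, e.g. on a timer."""
        self._spam = {normalize(u) for u in self._fetch_list()}

    def is_spam(self, url: str) -> bool:
        return normalize(url) in self._spam


def on_page_load(detector: SpamDetector, page_url: str, links: Iterable[str]) -> dict:
    """Check the page itself and every link it contains; report what was found."""
    return {
        "page_is_spam": detector.is_spam(page_url),
        "spam_links": [link for link in links if detector.is_spam(link)],
    }
```

A caller such as a browser plug-in could use the returned report to decide on an action, for example warning the user or marking the flagged links; which action to take is left open here.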
13 Citations
16 Claims
1. A method for resisting spam webpages on a computing device installed with a web browser, the method comprising:
receiving at the web browser a URL of a webpage;
determining by a spam detection module installed on the computing device whether the webpage is spam by comparing the URL of the webpage with a spam list including spam URLs, the spam list being created by;
dividing the spam URLs of the spam list into a plurality of sub chunks of spam URLs;
indexing the spam URLs into a first level index and a second level index, the first level index maps a first set of hash values to ranges of sub chunks of spam URLs, and the second level index maps a second set of hash values to the remaining sub chunks of spam URLs in the plurality of sub chunks;
the first set of hash values are created using a first hash function and the second set of hash values are created using a second hash function; and
performing an anti-spam action on the computing device if the webpage is determined to be spam;
wherein comparing the URL of the webpage with the spam comprises;
computing a hash value for the URL of the webpage using a hash function; and
matching the hash value of the webpage with the set of hash values of the spam URLs;
wherein the spam list is further created by;
computing the first set of hash values and the second set of hash values;
sorting the spam URLs by their computed hash values;
wherein each sub chunk having a sequential range of hash values defined by a lower bound and an upper bound.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. An anti-spam tool executable by a processor, wherein the anti-spam tool is embodied on computer-readable memory and co-installed with a web browser on a computing device, and interfaces with the web browser through a program interface, the anti-spam tool comprising:
a spam detection module determines whether a target webpage associated with a URL is spam;
receiving at the web browser a URL of a webpage;
determining by a spam detection module installed on the computing device whether the webpage is spam by comparing the URL of the webpage with a spam list including spam URLs, the spam list being created by;
dividing the spam URLs of the spam list into a plurality of sub chunks of spam URLs;
indexing the spam URLs into a first level index and a second level index, the first level index maps a first set of hash values to ranges of sub chunks of spam URLs, and the second level index maps a second set of hash values to the remaining sub chunks of spam URLs in the plurality of sub chunks;
the first set of hash values are created using a first hash function and the second set of hash values are created using a second hash function; and
performing an anti-spam action on the computing device if the webpage is determined to be spam;
wherein comparing the URL of the webpage with the spam comprises;
computing a hash value for the URL of the webpage using a hash function; and
matching the hash value of the webpage with the set of hash values of the spam URLs;
wherein the spam list is further created by;
computing the first set of hash values and the second set of hash values;
sorting the spam URLs by their computed hash values;
wherein each sub chunk having a sequential range of hash values defined by a lower bound and an upper bound;
a spam list indexer comprising an index of a spam list, the index including a plurality of groups that each include a plurality of spam URLs, wherein at least one group is assigned a hash value; and
an anti-spam controller assist performs an anti-spam action if the target webpage is determined to be a spam.
Dependent claims: 10, 11, 12, 13, 14, 15.
16. One or more computer readable memory having stored thereupon a plurality of instructions that, when executed by a processor, causes the processor to implement the instructions for a method comprising:
receiving at the web browser a URL of a webpage;
determining by a spam detection module installed on the computing device whether the webpage is spam by comparing the URL of the webpage with a spam list including spam URLs, the spam list being created by;
dividing the spam URLs of the spam list into a plurality of sub chunks of spam URLs;
indexing the spam URLs into a first level index and a second level index, the first level index maps a first set of hash values to ranges of sub chunks of spam URLs, and the second level index maps a second set of hash values to the remaining sub chunks of spam URLs in the plurality of sub chunks;
the first set of hash values are created using a first hash function and the second set of hash values are created using a second hash function; and
performing an anti-spam action on the computing device if the webpage is determined to be spam;
wherein comparing the URL of the webpage with the spam comprises;
computing a hash value for the URL of the webpage using a hash function; and
matching the hash value of the webpage with the set of hash values of the spam URLs;
wherein the spam list is further created by;
computing the first set of hash values and the second set of hash values;
sorting the spam URLs by their computed hash values;
wherein each sub chunk having a sequential range of hash values defined by a lower bound and an upper bound;
dividing a plurality of spam uniform resource locators (URLs) into a plurality of sub chunks of spam URLs;
indexing the spam URLs into a first level index and a second level index, the first level index maps a first set of hash values to ranges of sub chunks of spam URLs, and the second level index maps a second set of hash values to the remaining sub chunks of spam URLs in the plurality of sub chunks;
matching the hash value of the URL of the webpage with the hash values of the spam URLs of the spam list through the first level index followed by the second level index.
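The final step of claim 16, matching through the first level index followed by the second level index, admits more than one reading. The sketch below shows one possible interpretation only: a coarse first hash selects a contiguous range of sub chunks, and a finer second hash is then searched within that range. The functions `h1`, `h2`, `build_index` and `lookup`, the use of BLAKE2b and SHA-256, the 8-bit width of the first hash and the `chunk_size` are all assumptions for illustration; the claim text specifies none of them.

```python
import hashlib
from bisect import bisect_left, bisect_right


def h1(url: str) -> int:
    """Assumed first hash function: a coarse 8-bit bucket number."""
    return hashlib.blake2b(url.encode("utf-8"), digest_size=1).digest()[0]


def h2(url: str) -> int:
    """Assumed second hash function: a 64-bit value."""
    return int.from_bytes(hashlib.sha256(url.encode("utf-8")).digest()[:8], "big")


def build_index(spam_urls, chunk_size=2):
    """Return (first_level, second_level) for the given spam URLs."""
    groups = {}
    for u in set(spam_urls):
        groups.setdefault(h1(u), []).append(h2(u))
    first_level = {}   # h1 value -> (start, stop) range of sub chunk positions
    second_level = []  # sub chunks: sorted lists of h2 values
    for bucket in sorted(groups):
        values = sorted(groups[bucket])
        start = len(second_level)
        for i in range(0, len(values), chunk_size):
            second_level.append(values[i:i + chunk_size])
        first_level[bucket] = (start, len(second_level))
    return first_level, second_level


def lookup(url, first_level, second_level):
    """First level narrows to a range of sub chunks; second level finds the hash."""
    span = first_level.get(h1(url))
    if span is None:
        return False
    start, stop = span
    target = h2(url)
    # Each sub chunk covers [chunk[0], chunk[-1]]; pick the one that could hold target.
    lowers = [second_level[i][0] for i in range(start, stop)]
    pos = bisect_right(lowers, target) - 1
    if pos < 0:
        return False
    chunk = second_level[start + pos]
    if target > chunk[-1]:
        return False
    j = bisect_left(chunk, target)
    return j < len(chunk) and chunk[j] == target
```

Under this reading, most non-spam URLs are rejected after the first-level check or a single bounds comparison, which is consistent with the efficiency goal stated in the abstract.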
Specification