Categorizing web sites based on content-temporal locality
Abstract
A categorization server is coupled to multiple clients via a network. Each client has a security module that monitors web browsing performed on the client and reports a web site browsing stream to the categorization server. The categorization server identifies a site from the browsing stream of a client that is of a known category. The categorization server uses content-temporal locality to determine whether other sites in the browsing stream belong to the same category as the site having the known category. This determination can be performed by assigning probabilities to other sites in the browsing stream, and by considering probabilities assigned to sites in browsing streams of other clients. The categorization server provides categories of sites to the clients, and the client security modules can implement category-based security policies.
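The final aggregation step described in the abstract can be illustrated with a short Python sketch. The abstract does not say how probabilities from different clients are combined, so the averaging rule, the `threshold` and `min_clients` parameters, the function name, and the example domains are all assumptions made for illustration, not the patented method.

```python
from collections import defaultdict
from statistics import mean


def aggregate_category_estimates(reports, threshold=0.6, min_clients=2):
    """Combine per-client category probabilities into one category per site.

    reports: iterable of (client_id, site, category, probability) tuples.
    The averaging rule and both parameters are assumptions of this sketch.
    """
    per_site_category = defaultdict(dict)
    for client_id, site, category, probability in reports:
        clients = per_site_category[(site, category)]
        # Keep only the strongest estimate each client reported.
        clients[client_id] = max(clients.get(client_id, 0.0), probability)

    assigned = {}
    for (site, category), clients in per_site_category.items():
        if len(clients) < min_clients:
            continue  # too few independent browsing streams to trust
        score = mean(clients.values())
        best_so_far = assigned.get(site, (None, 0.0))[1]
        if score >= threshold and score > best_so_far:
            assigned[site] = (category, score)
    return assigned


# Hypothetical reports from two clients.
example_reports = [
    ("client-1", "a.example", "gambling", 0.8),
    ("client-2", "a.example", "gambling", 0.6),
    ("client-1", "b.example", "news", 0.9),
]
print(aggregate_category_estimates(example_reports))
```

In this sketch a site seen by only one client is left uncategorized, which is one possible way to keep a single unusual browsing session from driving a category-based security policy.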
20 Claims
1. A method of categorizing a web site on a network, comprising:
using a computer to perform steps comprising:
receiving a web site browsing stream describing temporal proximities of web sites browsed by a client, wherein a temporal proximity indicates a time interval between visits to browsed web sites;
identifying a category of a first web site in the web site browsing stream;
determining a temporal proximity of a second web site relative to the first web site in the web site browsing stream;
determining a probability that the second web site is of a same category as the first web site in the web site responsive at least in part to the temporal proximity of the second web site relative to the first web site and a degree of content-temporal locality for the category of the first web site; and
storing the determined category probability for the second web site.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8)
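The steps of claim 1 can be sketched in Python as follows. The claim only requires that the probability be responsive to the temporal proximity and to a degree of content-temporal locality for the category; the exponential decay, the `tau_seconds` parameter, the data layout, and the example domains below are assumptions chosen for illustration.

```python
import math


def categorize_stream(stream, known_categories, locality, tau_seconds=120.0):
    """Estimate category probabilities for uncategorized sites in one stream.

    stream: list of (timestamp_seconds, site) pairs for one client.
    known_categories: dict mapping site -> category for already-known sites.
    locality: dict mapping category -> degree of content-temporal locality
        in [0, 1] (assumed scale).
    """
    estimates = {}
    for t_first, first_site in stream:
        category = known_categories.get(first_site)
        if category is None:
            continue  # only sites of known category act as anchors
        strength = locality.get(category, 0.0)
        for t_second, second_site in stream:
            if second_site in known_categories:
                continue  # nothing to infer for already-categorized sites
            # Temporal proximity: time interval between the two visits.
            gap = abs(t_second - t_first)
            # Assumed model: closer visits and higher locality raise the probability.
            probability = strength * math.exp(-gap / tau_seconds)
            key = (second_site, category)
            # "Storing" here is an in-memory dict; keep the strongest estimate.
            estimates[key] = max(estimates.get(key, 0.0), probability)
    return estimates


# Hypothetical stream: one known site followed by two unknown ones.
example_stream = [(0, "known.example"), (60, "new.example"), (3600, "later.example")]
print(categorize_stream(example_stream, {"known.example": "gambling"}, {"gambling": 0.9}))
```

Under these assumptions a site visited a minute after the known site receives a much higher probability than one visited an hour later, and a category with low locality yields low probabilities regardless of timing.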
9. A computer-implemented system for categorizing a web site on a network, comprising:
a computer processor; and
a computer-readable storage medium storing computer program modules configured to execute on the computer processor, the computer program modules comprising:
a communication module configured to receive a web site browsing stream describing temporal proximities of web sites browsed by a client, wherein a temporal proximity indicates a time interval between visits to browsed web sites;
a web site categorization module configured to:
identify a category of a first web site in the web site browsing stream;
determine a temporal proximity of a second web site relative to the first web site in the web site browsing stream; and
determine a probability that the second web site is of a same category as the first web site responsive at least in part to the temporal proximity of the second web site relative to the first web site and a degree of content-temporal locality for the category of the first web site; and
a data store module configured to store the determined category probability for the second web site.
(Dependent claims: 10, 11, 12, 13, 14)
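The three modules named in claim 9 could be laid out roughly as in the Python sketch below. The class and method names, the `Visit` record, and the injected `score` callable (which stands in for whatever probability model is used) are hypothetical; the claim does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Visit:
    timestamp: float  # seconds (assumed unit)
    site: str


class CommunicationModule:
    """Receives a client's reported web site browsing stream."""

    def receive(self, payload: Iterable[Tuple[float, str]]) -> List[Visit]:
        visits = [Visit(float(t), site) for t, site in payload]
        return sorted(visits, key=lambda v: v.timestamp)


@dataclass
class DataStoreModule:
    """Stores determined category probabilities, keyed by (site, category)."""

    probabilities: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def store(self, site: str, category: str, probability: float) -> None:
        key = (site, category)
        # Keeping the maximum is a policy choice made for this sketch.
        self.probabilities[key] = max(self.probabilities.get(key, 0.0), probability)


@dataclass
class WebSiteCategorizationModule:
    known_categories: Dict[str, str]
    # score(gap_seconds, category) -> probability; supplied by the caller.
    score: Callable[[float, str], float]

    def categorize(self, visits: List[Visit], store: DataStoreModule) -> None:
        for first in visits:
            category = self.known_categories.get(first.site)
            if category is None:
                continue
            for second in visits:
                if second.site in self.known_categories:
                    continue
                gap = abs(second.timestamp - first.timestamp)
                store.store(second.site, category, self.score(gap, category))
```

Passing the scoring function in as a parameter keeps the module boundaries visible without committing the sketch to a particular model of content-temporal locality.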
15. A non-transitory computer-readable storage medium storing computer program modules for categorizing a web site on a network, the modules comprising:
a communication module configured to receive a web site browsing stream describing temporal proximities of web sites browsed by a client, wherein a temporal proximity indicates a time interval between visits to browsed web sites;
a web site categorization module configured to:
identify a category of a first web site in the web site browsing stream;
determine a temporal proximity of a second web site relative to the first web site in the web site browsing stream; and
determine a probability that the second web site is of a same category as the first web site responsive at least in part to the temporal proximity of the second web site relative to the first web site and a degree of content-temporal locality for the category of the first web site; and
a data store module configured to store the determined probability for the second web site.
(Dependent claims: 16, 17, 18, 19, 20)
Specification