Apparatus and methods for classification of web sites
Abstract
Apparatus and methods for classifying web sites are provided. With the apparatus and methods, traffic data is obtained for a plurality of web sites. Patterns, or templates, for each web site are generated based on this traffic data, and the patterns are clustered into classes of web sites using a clustering algorithm. The clusters, or classes, are then profiled to generate a template for each class. The template for each class is generated by first shifting the patterns for each web site that is part of the class to compensate for effects like time zone differences, if any, and then identifying a pattern that is most similar to all of the patterns in the class. Once the template for each class is generated, this template is then used with traffic data from a new web site to classify the new web site into one of the existing classes. In other words, when traffic data for a new web site is received, a pattern for the traffic data of the new web site is generated and compared to the templates for the various classes. If a matching class template is identified, the new web site is classified into the corresponding class. If the pattern for the new web site does not match any of the existing templates, a new template and class may be generated based on the pattern for the new web site.
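The classification step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the specification: the distance measure (Euclidean distance between volume-normalized patterns), the circular shift used to compensate for time zone offsets, the `threshold` value, and the names `normalize`, `shifted_distance`, `classify`, and `TEMPLATES` are all assumptions chosen for illustration.

```python
from math import sqrt

# Hypothetical class templates: traffic per time slot (illustrative values only).
TEMPLATES = {
    "daytime": [1, 5, 9, 5, 1, 0],
    "flat": [1, 1, 1, 1, 1, 1],
}

def normalize(pattern):
    """Scale a pattern so its values sum to 1, comparing shape rather than volume."""
    total = sum(pattern)
    if total == 0:
        raise ValueError("pattern has no traffic")
    return [v / total for v in pattern]

def shifted_distance(pattern, template):
    """Smallest Euclidean distance over all circular shifts of the pattern.

    Trying every shift is an assumed way to compensate for time zone offsets.
    """
    best = float("inf")
    for s in range(len(pattern)):
        shifted = pattern[s:] + pattern[:s]
        d = sqrt(sum((a - b) ** 2 for a, b in zip(shifted, template)))
        best = min(best, d)
    return best

def classify(pattern, templates, threshold=0.1):
    """Return the name of the closest template, or None if none is close enough."""
    p = normalize(pattern)
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        d = shifted_distance(p, normalize(template))
        if d < best_dist:
            best_name, best_dist = name, d
    # The threshold is an assumed cut-off for what counts as a "match".
    return best_name if best_dist <= threshold else None
```

In this sketch, `classify([0, 2, 10, 18, 10, 2], TEMPLATES)` selects the `"daytime"` template, because the input is a time-shifted, higher-volume copy of that template. A result of `None` corresponds to the case in the abstract where no existing template matches and a new template and class may be generated.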
41 Claims
1. A computer program product in a computer readable medium for characterizing a web site, comprising:
first instructions for receiving traffic data for the web site;
second instructions for matching the traffic data to a matching traffic template of a plurality of predefined traffic templates; and
third instructions for characterizing the web site based on characteristics of the matching traffic template.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method of characterizing a web site, comprising:
receiving traffic data for the web site;
matching the traffic data to a matching traffic template of a plurality of predefined traffic templates; and
characterizing the web site based on characteristics of the matching traffic template.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
27. An apparatus for characterizing a web site, comprising:
means for receiving traffic data for the web site;
means for matching the traffic data to a matching traffic template of a plurality of predefined traffic templates; and
means for characterizing the web site based on characteristics of the matching traffic template.
Dependent claims: 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
40. A method for determining characteristics of one or more web sites based on traffic data, comprising:
receiving traffic data for a plurality of web sites;
clustering the plurality of web sites into two or more clusters based on the traffic data;
profiling the two or more clusters based on traffic data for member web sites of each cluster, to generate a traffic template for each cluster;
identifying characteristics of the member web sites of each cluster; and
identifying characteristics of the cluster based on the characteristics of the member web sites of the cluster.
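The steps of claim 40 can be sketched in Python under stated assumptions. The claim does not name a clustering algorithm, so a simple greedy "leader" clustering stands in for it here. The template is taken as the member pattern closest to all others (a medoid), following the abstract's wording, and a cluster trait is kept when more than half of the members share it. The function names, the `radius` parameter, the majority rule, and the sample data are all hypothetical.

```python
from collections import Counter
from math import dist

# Hypothetical normalized traffic patterns and known site traits.
PATTERNS = {
    "a.example": [0.1, 0.4, 0.4, 0.1],
    "b.example": [0.1, 0.5, 0.3, 0.1],
    "c.example": [0.1, 0.3, 0.5, 0.1],
    "d.example": [0.4, 0.1, 0.1, 0.4],
    "e.example": [0.5, 0.1, 0.1, 0.3],
}
TRAITS = {
    "a.example": {"business"},
    "b.example": {"business", "news"},
    "c.example": {"business"},
    "d.example": {"entertainment"},
    "e.example": {"entertainment", "games"},
}

def leader_cluster(patterns, radius):
    """Greedy clustering: a site joins the first cluster whose leader pattern
    is within `radius`; otherwise it starts a new cluster as its leader."""
    clusters = []  # each cluster is a list of site names, leader first
    for site, pattern in patterns.items():
        for members in clusters:
            if dist(pattern, patterns[members[0]]) <= radius:
                members.append(site)
                break
        else:
            clusters.append([site])
    return clusters

def medoid(members, patterns):
    """Member whose pattern has the smallest total distance to all members."""
    return min(
        members,
        key=lambda m: sum(dist(patterns[m], patterns[o]) for o in members),
    )

def profile(clusters, patterns, site_traits):
    """Generate a traffic template and a trait summary for each cluster."""
    profiles = []
    for members in clusters:
        template = patterns[medoid(members, patterns)]
        counts = Counter(t for m in members for t in site_traits[m])
        # Assumed rule: keep traits shared by more than half of the members.
        shared = sorted(t for t, c in counts.items() if c > len(members) / 2)
        profiles.append({"members": members, "template": template, "traits": shared})
    return profiles
```

With the sample data and `radius=0.2`, `leader_cluster` groups the first three sites together and the last two together, and `profile` labels the first cluster with the trait `"business"`, using the pattern of `a.example` as its template.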
41. A method for deploying computing infrastructure, comprising integrating computer readable code into a computing system, wherein the code in combination with the computing system is capable of performing the following:
receiving traffic data for the web site;
matching the traffic data to a matching traffic template of a plurality of predefined traffic templates; and
characterizing the web site based on characteristics of the matching traffic template.
Specification