METHOD AND SYSTEM FOR CHARACTERIZING WEB CONTENT
Abstract
An exemplary embodiment of the present invention provides a method of processing Web activity data. The method includes obtaining a database of clickstream data comprising a user identifier (user ID) and a uniform resource locator (URL) corresponding with a Web page visited from the user ID. The method also includes generating a plurality of features based on the URL and generating a data structure comprising the user ID and the features. Further, the method includes generating segment information from the data structure based on a similarity of a URL visitation pattern across different user IDs, wherein each segment in the segment information comprises one or more user IDs and one or more features.
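The feature-generation and data-structure steps described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `url_features` and `build_user_features`, the choice of truncating at each path separator, and the sample clickstream records are all assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def url_features(url):
    """Return truncated forms of a URL: the host, then the host plus each path prefix.

    Assumption: "truncating" is modeled as cutting the URL at each "/" in its
    path and dropping the scheme, query string, and fragment.
    """
    parts = urlsplit(url)
    features = [parts.netloc.lower()]
    for segment in (s for s in parts.path.split("/") if s):
        features.append(features[-1] + "/" + segment)
    return features


def build_user_features(clickstream):
    """Map each user ID to the set of features of every URL visited from it."""
    table = defaultdict(set)
    for user_id, url in clickstream:
        table[user_id].update(url_features(url))
    return dict(table)


# Hypothetical clickstream records: (user ID, URL) pairs.
clicks = [
    ("u1", "http://example.com/sports/football/scores.html"),
    ("u2", "http://example.com/sports/tennis"),
    ("u3", "http://example.org/recipes/soup?id=7"),
]
table = build_user_features(clicks)
```

In this sketch every prefix of a URL becomes a feature, so two users who visit different pages under `example.com/sports` still share the coarser features `example.com` and `example.com/sports`.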
20 Claims
1. A method of processing Web activity data, comprising:
retrieving a database of clickstream data comprising a user identifier (user ID) and a uniform resource locator (URL) corresponding to a Web page;
truncating the URL to identify a feature of the URL;
building a data structure comprising the user ID and the feature; and
generating segment information from the data structure based on a similarity of a URL visitation pattern across different user IDs, wherein each segment in the segment information comprises one or more of the different user IDs and one or more features.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A computer system, comprising:
a processor that is adapted to execute machine-readable instructions;
a storage device that is adapted to store data, the data comprising a database of clickstream data; and
a memory device that stores instructions that are executable by the processor, the instructions comprising:
  a feature generator adapted to receive a URL from the database of clickstream data and generate one or more features based on the URL;
  a data structure builder adapted to analyze the clickstream data to identify a user ID and one or more features that correspond with the user ID and to enter the user ID and the one or more features into a data structure; and
  a segment information generator adapted to process the data structure to generate segments that group user IDs and the one or more features based on a similarity of a visitation pattern.
Dependent claims: 12, 13, 14, 15, 16, 17.
18. A tangible, computer-readable medium, comprising:
code adapted to receive a URL from a database of clickstream data and generate one or more features based on the URL;
code adapted to receive a user ID from the clickstream data and a plurality of features from the feature generator that correspond with the user ID and enter the user ID and features into a data structure; and
code adapted to process the data structure to generate groupings of user IDs and features based on a similarity of a visitation pattern.
Dependent claims: 19, 20.
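The independent claims all end with a step that groups user IDs and features by similarity of visitation pattern, without naming a particular algorithm in the text above. The Python sketch below shows one possible reading only: a greedy grouping by Jaccard similarity. The similarity measure, the `threshold` parameter, the function names, and the sample data are assumptions for illustration and are not drawn from the claims.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def segment_users(user_features, threshold=0.4):
    """Greedily group user IDs whose feature sets overlap.

    Each segment holds a set of user IDs and the union of their features.
    A user joins the most similar existing segment when the similarity
    reaches `threshold`; otherwise the user starts a new segment.
    """
    segments = []
    for user_id in sorted(user_features):
        feats = user_features[user_id]
        best, best_score = None, threshold
        for seg in segments:
            score = jaccard(feats, seg["features"])
            if score >= best_score:
                best, best_score = seg, score
        if best is None:
            segments.append({"users": {user_id}, "features": set(feats)})
        else:
            best["users"].add(user_id)
            best["features"] |= feats
    return segments


# Hypothetical data structure: user ID -> features of the URLs visited.
user_features = {
    "u1": {"example.com", "example.com/sports", "example.com/sports/football"},
    "u2": {"example.com", "example.com/sports", "example.com/sports/tennis"},
    "u3": {"example.org", "example.org/recipes"},
}
segments = segment_users(user_features)
```

With this sample data, `u1` and `u2` share two of four distinct features and land in one segment, while `u3` shares none and forms its own. Each resulting segment contains one or more user IDs and one or more features, which matches the shape of the segment information recited in claim 1. A greedy pass like this depends on processing order, so a production system would more likely use an established clustering method.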
Specification