Variable sampling rates for website visitation analysis
Abstract
A data set containing website traffic data or other data is sampled according to a variable sample rate. A target number of samples per time period is established, and a baseline sample rate is determined. Data items in the data set are sampled according to the baseline sample rate, to obtain a sample set. For time periods where the size of the resulting sample set exceeds the target number of samples, a new sample rate is established and the data items for the time period are resampled. Appropriate sampling capability can thus be provided for website traffic in normal time periods, while maintaining capability for handling spikes and other variations in website traffic as may take place in response to certain periodic or non-periodic events.
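The resampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `sample_period`, the use of independent random draws per item, and the rule for choosing the new rate (scaling it so the expected sample size equals the target) are all assumptions, since the abstract does not say how the new rate is established.

```python
import random


def sample_period(items, baseline_rate, target, rng):
    """Sample one time period; resample at a lower rate if the target is exceeded.

    Returns (sample, rate_used). Hypothetical helper for illustration only.
    """
    # First pass: keep each item with probability baseline_rate.
    first = [x for x in items if rng.random() < baseline_rate]
    if len(first) <= target:
        return first, baseline_rate

    # Assumption: pick the new rate so the expected sample size equals the target.
    new_rate = target / len(items)
    second = [x for x in items if rng.random() < new_rate]
    return second, new_rate


if __name__ == "__main__":
    rng = random.Random(0)
    spike = list(range(10_000))  # a time period with a traffic spike
    sample, rate = sample_period(spike, baseline_rate=0.5, target=100, rng=rng)
    print(len(sample), rate)
```

Under these assumptions a normal period keeps the baseline rate, while a spike period is resampled at a reduced rate so that its sample set stays near the target size.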
49 Claims
1. A method for sampling a data set comprising a plurality of data items, comprising:
establishing a target number of sample items for a time period;
establishing a first sampling rate;
applying the first sampling rate to data items corresponding to the time period to obtain a first sample set comprising a plurality of sample items; and
responsive to the first sample set containing a number of sample items for the time period substantially different from the target number, performing the steps of:
establishing a second sampling rate; and
obtaining a second sample set using the second sampling rate.

17. A method for sampling a data set comprising a plurality of data items, each data item associated with a time period, comprising:
establishing a target number of sample items per time period;
establishing a first sampling rate;
applying the first sampling rate to data items corresponding to each time period to obtain, for each time period, a first sample set comprising a plurality of sample items for the time period; and
responsive to the first sample set for a time period containing a number of sample items for the time period substantially different from the target number, performing the steps of:
establishing a second sampling rate for the time period; and
obtaining a second sample set for the time period using the second sampling rate.
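A per-time-period variant of the method in claim 17 might look like the sketch below. The names `sample_by_period` and `tolerance`, the reading of "substantially different" as "off by more than a fixed fraction of the target", and the rule for the second rate are assumptions made for illustration; the claim itself does not define them.

```python
import random
from collections import defaultdict


def sample_by_period(items, first_rate, target, tolerance=0.5, seed=0):
    """items: iterable of (period, payload) pairs.

    Returns {period: (rate_used, sample)}. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)

    # Group the data items by their associated time period.
    by_period = defaultdict(list)
    for period, payload in items:
        by_period[period].append(payload)

    result = {}
    for period, payloads in by_period.items():
        rate = first_rate
        sample = [p for p in payloads if rng.random() < rate]
        # Assumption: "substantially different" means the sample size is off
        # from the target by more than tolerance * target, in either direction.
        if abs(len(sample) - target) > tolerance * target:
            # Assumption: second rate aims for `target` items, capped at 1.0.
            rate = min(1.0, target / len(payloads))
            sample = [p for p in payloads if rng.random() < rate]
        result[period] = (rate, sample)
    return result
```

In this sketch each period gets its own second rate, so a spike in one period lowers the rate only for that period, and a sparse period can be sampled at a higher rate than the first one.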

18. A computer program product for sampling a data set comprising a plurality of data items, comprising:
a computer-readable medium; and
computer program code, encoded on the medium, for:
establishing a target number of sample items for a time period;
establishing a first sampling rate;
applying the first sampling rate to data items corresponding to the time period to obtain a first sample set comprising a plurality of sample items; and
responsive to the first sample set containing a number of sample items for the time period substantially different from the target number, performing the steps of:
establishing a second sampling rate; and
obtaining a second sample set using the second sampling rate.

34. A system for sampling a data set comprising a plurality of data items, comprising:
a log processing module, for:
establishing a target number of sample items for a time period;
establishing a first sampling rate;
applying the first sampling rate to data items corresponding to the time period to obtain a first sample set comprising a plurality of sample items; and
responsive to the first sample set containing a number of sample items for the time period substantially different from the target number, performing the steps of:
establishing a second sampling rate; and
obtaining a second sample set using the second sampling rate; and
a storage device, for storing at least one of the first and second sample sets.
Specification