Load-balancing and scaling for analytics data
Abstract
Load-balancing and scaling for analytics data may be provided. A logging system may receive data and select a stager database in which to store the data. The selection may be made according to an identifier associated with the data. The stored data may be processed and stored back to the stager database before being copied to a reporting database. The processed data may be aggregated with other data in the reporting database to provide an analytics report.
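As a rough illustration of the processing stage the abstract describes, and assuming the batch-wise processing recited in the claims, the sketch below splits stager rows into batches and runs each batch through a model. The names `batches`, `process_stager`, `insight_model`, and the default batch size are hypothetical; copying a batch to an analytics server is represented only by making a local copy of the list.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final batch, possibly shorter than batch_size
        yield batch


def process_stager(
    rows: Iterable[T],
    insight_model: Callable[[T], R],
    batch_size: int = 1000,  # assumed value; the source gives no batch size
) -> List[R]:
    """Analyze stager rows batch by batch and return the processed rows."""
    processed: List[R] = []
    for batch in batches(rows, batch_size):
        # Stand-in for copying the batch to an analytics server.
        copied = list(batch)
        processed.extend(insight_model(row) for row in copied)
    return processed
```

In this sketch the returned list stands for the processed data that would be stored back to the stager database before being copied to a reporting database.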
19 Claims
1. A method for providing data distribution, the method comprising:
  receiving at least one element of data;
  selecting one of a plurality of database servers, wherein each of the plurality of database servers is associated with a sequence number, according to an identifier associated with the at least one element of data wherein selecting the one of the plurality of database servers comprises computing a database sequence number by modulating an identifier associated with the at least one element of data by a total number of the plurality of database servers;
  storing the at least one element of data in the selected database server;
  processing a plurality of data elements stored in at least one of the plurality of database servers, wherein processing the plurality of data elements comprises:
    dividing the plurality of data elements into at least one batch,
    copying the at least one batch of data elements to an analytics server, and
    analyzing the at least one batch of data elements according to an insight model;
  determining whether an aggregation scope associated with the processed plurality of data elements is not assigned to at least one of a plurality of reporting databases; and
  in response to determining that the aggregation scope is not assigned to at least one of a plurality of reporting database servers:
    assigning the aggregation scope to at least one of the plurality of reporting database servers according to a potential data volume of the aggregation scope and a data capacity of each of the plurality of reporting database servers, and
    aggregating the processed plurality of data elements into the at least one of the plurality of reporting database servers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
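The selecting step of claim 1 can be read as a modulo operation over the number of database servers. The sketch below is one possible reading, not the patented implementation: it assumes servers are numbered from 0, and it assumes string identifiers are first reduced to an integer with CRC-32, a choice the claim does not make. The function name `database_sequence_number` is hypothetical.

```python
import zlib
from typing import Union


def database_sequence_number(identifier: Union[int, str], total_servers: int) -> int:
    """Map an identifier to a server sequence number in [0, total_servers)."""
    if total_servers < 1:
        raise ValueError("total_servers must be at least 1")
    if isinstance(identifier, str):
        # Assumption: a stable checksum turns a string ID into an integer.
        # Python's built-in hash() is avoided because it varies between runs.
        key = zlib.crc32(identifier.encode("utf-8"))
    else:
        key = identifier
    return key % total_servers


# Integer identifier 10 spread over 4 servers lands on sequence number 2.
print(database_sequence_number(10, 4))
```

Because the result depends only on the identifier and the server count, every element carrying the same identifier is routed to the same server.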
12. A method for load-balancing analytics data comprising:
  capturing a user behavior associated with a web site, wherein the user behavior is associated with a user behavior identifier;
  selecting a raw data database from among a plurality of raw data databases, wherein selecting the raw data database comprises:
    assigning a sequence number to each of the plurality of raw data databases,
    calculating a database identifier from the user behavior identifier modulated by a total number of the plurality of raw databases, and
    selecting the sequence number associated with the database identifier;
  storing the captured user behavior in the selected raw data database;
  processing the captured user behavior, wherein processing the captured behavior comprises:
    dividing a plurality of captured behaviors into at least one batch,
    copying the at least one batch of captured behaviors to an analytics server, and
    analyzing the at least one batch of captured behaviors according to an insight model;
  storing the processed user behavior in the selected raw data database;
  determining whether any of a plurality of aggregation scopes are not assigned to at least one of a plurality of reporting databases; and
  in response to determining that at least one of the plurality of aggregation scopes are not assigned to at least one of a plurality of reporting databases, assign the at least one unassigned scope to at least one of the plurality of reporting databases according to a potential data volume of the at least one unassigned scope and a data capacity of each of the plurality of reporting databases.
Dependent claims: 13, 14, 15, 16, 17, 18
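Claim 12 ends by assigning unassigned aggregation scopes to reporting databases according to potential data volume and data capacity, without fixing a particular rule. The sketch below assumes one simple rule for illustration: pick the reporting database with the most remaining capacity that can still hold the scope. The function `assign_scope` and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional


def assign_scope(
    scope: str,
    potential_volume: int,
    remaining_capacity: Dict[str, int],
    assignments: Dict[str, str],
) -> Optional[str]:
    """Return the reporting database for a scope, assigning one if needed.

    Returns None when no reporting database has enough remaining capacity.
    """
    if scope in assignments:
        # Already assigned: nothing to do.
        return assignments[scope]
    candidates = [
        db for db, free in remaining_capacity.items() if free >= potential_volume
    ]
    if not candidates:
        return None
    # Assumed policy: prefer the database with the most free capacity.
    chosen = max(candidates, key=lambda db: remaining_capacity[db])
    remaining_capacity[chosen] -= potential_volume
    assignments[scope] = chosen
    return chosen


capacity = {"reporting-0": 100, "reporting-1": 250}
assigned: Dict[str, str] = {}
print(assign_scope("site-a", 200, capacity, assigned))  # reporting-1
```

Other policies, such as best fit or spreading one scope across several databases, would also satisfy the wording "at least one of the plurality of reporting databases"; the greedy choice here is only the shortest to show.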
19. A system for providing load-balancing and scaling for data distribution, the system comprising:
  a memory storage; and
  a processing unit coupled to the memory storage, wherein the processing unit is configured to:
    capture a user behavior associated with a web site, wherein the user behavior is associated with an identifier comprising at least one of the following: a user ID and a browser session ID;
    select a raw database from among a plurality of raw databases, wherein being configured to select the raw database comprises being configured to:
      assign a sequence number to each of the plurality of raw databases,
      calculate a database identifier from the user behavior identifier modulated by a total number of the plurality of raw databases, and
      select the sequence number associated with the database identifier;
    store the captured user behavior in the selected raw database;
    process the captured user behavior, wherein being configured to process the user behavior comprises being configured to:
      divide a plurality of captured user behaviors into a batch,
      copy the batch of captured user behaviors to an analytics server, and
      analyze the batch of captured user behaviors according to a behavior insight model;
    store the processed user behavior in the selected raw data database;
    determine whether any of a plurality of aggregation scopes are not assigned to at least one of a plurality of reporting databases;
    in response to determining that at least one of the plurality of aggregation scopes are not assigned to at least one of a plurality of reporting databases, assign the at least one unassigned scope to at least one of the plurality of reporting databases according to a potential data volume of the at least one unassigned scope and a data capacity of each of the plurality of reporting databases;
    copy the processed user behavior to one of the plurality of reporting databases according to the scope assigned to the one of the reporting databases; and
    aggregate the copied user behavior with at least one other copied user behavior of the same type into a usage report wherein the user behavior type comprises at least one of the following: a click-through, a search, a unique visitor count, and a dwell time.
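The final step of claim 19 aggregates copied behaviors of the same type into a usage report. A minimal sketch of such an aggregation follows, assuming each behavior is a dictionary with hypothetical keys `type`, `user_id`, and, for dwell events, `seconds`; the `UsageReport` class and the type labels are illustrative, not taken from the patent.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class UsageReport:
    counts: Counter = field(default_factory=Counter)  # click-through / search totals
    visitors: Set[str] = field(default_factory=set)   # distinct user IDs seen
    dwell_seconds: float = 0.0                         # summed dwell time

    @property
    def unique_visitors(self) -> int:
        return len(self.visitors)


def aggregate(report: UsageReport, behaviors: Iterable[Dict]) -> UsageReport:
    """Fold behaviors into the report, grouping them by behavior type."""
    for behavior in behaviors:
        kind = behavior["type"]
        # Every behavior contributes its user to the unique visitor count.
        report.visitors.add(behavior["user_id"])
        if kind in ("click-through", "search"):
            report.counts[kind] += 1
        elif kind == "dwell":
            report.dwell_seconds += behavior["seconds"]
        # Other types are ignored in this sketch.
    return report


report = aggregate(UsageReport(), [
    {"type": "click-through", "user_id": "u1"},
    {"type": "search", "user_id": "u2"},
    {"type": "click-through", "user_id": "u1"},
    {"type": "dwell", "user_id": "u2", "seconds": 12.5},
])
print(report.counts["click-through"], report.unique_visitors, report.dwell_seconds)
```

Calling `aggregate` again with a later batch and the same report object extends the existing totals, which mirrors aggregating a newly copied behavior with previously copied ones.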
Specification