Multi-level data staging for low latency data access
Abstract
Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-end clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
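The fallback behavior described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the class names `Aggregator` and `FrontEnd`, the in-memory `deque` standing in for a local filer, and the use of `ConnectionError` to signal an outage are all assumptions made for illustration.

```python
from collections import deque


class Aggregator:
    """Hypothetical stand-in for the aggregating cluster."""

    def __init__(self):
        self.available = True
        self.received = []

    def send(self, entry):
        if not self.available:
            raise ConnectionError("aggregating cluster unavailable")
        self.received.append(entry)


class FrontEnd:
    """Hypothetical front-end server that stages log data during an outage."""

    def __init__(self, aggregator):
        self.aggregator = aggregator
        self.local_stage = deque()  # assumption: stands in for the local filer

    def log(self, entry):
        # Queue the new entry behind anything staged earlier, then try to
        # drain the queue so entries reach the aggregator in order.
        self.local_stage.append(entry)
        self.flush()

    def flush(self):
        while self.local_stage:
            try:
                self.aggregator.send(self.local_stage[0])
            except ConnectionError:
                return  # keep entries staged; retry on the next call
            self.local_stage.popleft()

    def read_staged(self):
        # Applications can read staged entries without waiting for recovery.
        return list(self.local_stage)


agg = Aggregator()
fe = FrontEnd(agg)
fe.log("click:1")          # delivered immediately
agg.available = False
fe.log("click:2")          # staged locally
agg.available = True
fe.flush()                 # staged entry is sent once the aggregator recovers
```

In this sketch an entry is removed from the local staging area only after the send succeeds, so an outage in the middle of a flush does not lose data.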
17 Claims
1. A method, comprising:
- producing, at a plurality of front end servers, log data based on real-time user activities;
- in an event that an aggregating server is unavailable, staging the log data at a front end staging area in at least one of the plurality of front end servers for providing a back end server real-time access to the log data;
- in an event that the aggregating server is available, transmitting the log data to the aggregating server;
- aggregating the log data at the aggregating server;
- staging the log data at the aggregating server and providing a back end server with real-time access to the log data;
- after providing the back end server real-time access to the log data, sending the log data from the aggregating server to a data warehouse; and
- processing the log data at the data warehouse so that the data warehouse can respond to a data query based on the processed log data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-implemented system, comprising:
- a plurality of front end servers configured to produce log data based on real-time user activities;
- at least one aggregating server configured to aggregate the log data received from at least some of the front end servers, the aggregating server being connected with at least some of the front end servers via a network;
- wherein the aggregating server includes a data staging area configured to stage the log data at the aggregating server and providing a back end server with real-time access to the log data;
- a data warehouse, wherein the aggregating server is further configured to periodically send the log data to the data warehouse after providing the back end server real-time access to the log data, and the data warehouse is configured to process the log data and to respond a data query based on the processed log data; and
- at least one second level aggregating server configured for aggregating the log data received from the aggregating server, the second level aggregating server being connected with the aggregating server, wherein the second level aggregating server includes a second level data staging area configured for staging the log data so that the back end server can access the log data in real time.
Dependent claims: 13, 14, 15
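As a rough illustration of the arrangement recited in claim 12, the sketch below stages log data at an aggregating server, lets a back end reader see it immediately, and only later forwards it to a data warehouse and a second level aggregating server. The names `Warehouse`, `StagingAggregator`, `receive_batch`, `read_staged`, and `forward` are hypothetical, and the periodic timer that would call `forward()` is assumed rather than shown.

```python
class Warehouse:
    """Hypothetical data warehouse that receives log data in batches."""

    def __init__(self):
        self.rows = []

    def receive_batch(self, batch):
        self.rows.extend(batch)


class StagingAggregator:
    """Hypothetical aggregating server with a data staging area."""

    def __init__(self, sinks):
        self.sinks = sinks   # downstream receivers (warehouse, next level)
        self.staged = []     # the data staging area

    def receive_batch(self, batch):
        self.staged.extend(batch)

    def read_staged(self):
        # Back end servers read from here without waiting for the warehouse.
        return list(self.staged)

    def forward(self):
        # Assumed to be invoked periodically by a scheduler (not shown).
        batch, self.staged = self.staged, []
        for sink in self.sinks:
            sink.receive_batch(batch)


warehouse = Warehouse()
level2 = StagingAggregator(sinks=[])                  # second level staging
level1 = StagingAggregator(sinks=[warehouse, level2])

level1.receive_batch(["view:1", "click:2"])
staged_now = level1.read_staged()   # visible before the warehouse has it
level1.forward()                    # later: hand off downstream
```

Because the second level aggregator exposes the same `read_staged` method, a back end server can read from either level in this sketch.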
16. An aggregating server, comprising:
- a processor;
- a network interface, coupled to the processor, through which the aggregating server can communicate with a plurality of front end servers;
- a data storage including a data staging area;
- a memory storing instructions which, when executed by the processor, cause the aggregating server to perform a process including;
  - receiving log data from the front end servers, wherein the front end servers produce the log data based on real-time user activities,
  - aggregating the log data, and
  - staging the log data at the data staging area of the aggregating server and providing a back end server with real-time access to the log data;
  - splitting the log data into multiple log data streams based on hash values calculated based on entries of the log data; and
  - after providing the back end server real-time access to the log data, sending the log data from the aggregating server to the data warehouse, wherein the data warehouse processes the log data and responds to data queries based on the processed log data.
Dependent claims: 17
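The splitting step in claim 16 can be sketched as follows. The claim only says that hash values are calculated from entries of the log data; the choice of CRC32, the function name `split_by_hash`, and the `key` parameter that picks which part of an entry is hashed are assumptions for illustration.

```python
import zlib


def split_by_hash(entries, num_streams, key=lambda entry: entry):
    """Split log entries into num_streams lists by a hash of key(entry).

    CRC32 is used here only because it is deterministic across runs;
    the claim does not name a hash function.
    """
    streams = [[] for _ in range(num_streams)]
    for entry in entries:
        bucket = zlib.crc32(key(entry).encode("utf-8")) % num_streams
        streams[bucket].append(entry)
    return streams


entries = ["u1 click", "u2 view", "u1 scroll", "u3 click", "u2 click"]
# Hash on the first field so all entries for one user land in one stream.
streams = split_by_hash(entries, 3, key=lambda e: e.split()[0])
```

Hashing on a field such as a user identifier keeps related entries together, so each stream can be consumed independently.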
Specification