COLLECTING AND AGGREGATING LOG DATA WITH FAULT TOLERANCE
First Claim
1. A system for collecting and aggregating log data with fault tolerance, the method, comprising:
one or more devices that generate log data, the one or more machines each associated with an agent node to collect the log data;
wherein, the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch;
the agent node further computes a checksum for the batch of multiple messages;
a collector device, the collector device being associated with a collector tier having a collector node to which the agent sends the log data;
wherein, the collector determines the checksum for the batch of multiple messages received from the agent node.
Abstract
Systems and methods of collecting and aggregating log data with fault tolerance are disclosed. One embodiment includes one or more devices that generate log data, each device associated with an agent node that collects the log data; the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch. In one embodiment, the agent node further computes a checksum for the batch of multiple messages. The system may further include a collector device associated with a collector tier having a collector node to which the agent node sends the log data; the collector determines the checksum for the batch of multiple messages received from the agent node.
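The batch, tag, and checksum flow summarized in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Batch` type, the use of a random UUID as the tag, and CRC-32 as the checksum are not specified by the abstract, which only says that a tag is assigned and a checksum is computed and later determined by the collector.

```python
import uuid
import zlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Batch:
    tag: str               # identifier assigned by the agent (hypothetical format)
    messages: List[bytes]  # the log messages grouped into this batch
    checksum: int          # CRC-32 over the framed messages (assumed algorithm)


def compute_checksum(messages: List[bytes]) -> int:
    """CRC-32 over length-prefixed messages, so message boundaries affect the result."""
    crc = 0
    for msg in messages:
        crc = zlib.crc32(len(msg).to_bytes(4, "big"), crc)
        crc = zlib.crc32(msg, crc)
    return crc


def make_batch(messages: List[bytes]) -> Batch:
    """Agent side: group messages, assign a tag, and compute the checksum."""
    return Batch(tag=uuid.uuid4().hex,
                 messages=list(messages),
                 checksum=compute_checksum(messages))


def collector_verify(batch: Batch) -> bool:
    """Collector side: recompute the checksum and compare it with the agent's value."""
    return compute_checksum(batch.messages) == batch.checksum
```

In this sketch a collector that receives a batch whose recomputed checksum differs from the agent's value would treat the batch as corrupted; what happens next (for example, a resend) is outside what the abstract states.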
21 Claims
1. A system for collecting and aggregating log data with fault tolerance, the method, comprising:
one or more devices that generate log data, the one or more machines each associated with an agent node to collect the log data;
wherein, the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch;
the agent node further computes a checksum for the batch of multiple messages;
a collector device, the collector device being associated with a collector tier having a collector node to which the agent sends the log data;
wherein, the collector determines the checksum for the batch of multiple messages received from the agent node.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
12. A method for collecting and aggregating datasets for storage in a file system with fault tolerance, the method, comprising:
collecting datasets from a data source on a machine where the datasets are generated;
generating, a batch comprising multiple messages from the datasets;
assigning a tag to the batch and computing a checksum for the batch;
writing the tag, the batch of multiple messages, and the checksum to an entry in a write-ahead-log (WAL) in storage;
sending the datasets to a receiving location;
in response to verifying the checksum of the batch of multiple messages at the receiving location, adding the tag to a map;
writing a file to destination location;
identifying, in the file, tags associated with the batches in the file that have been written to the destination location.
Dependent claims: 13, 14, 15, 16, 17, 18.
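The steps of claim 12 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the JSON-lines WAL format, the CRC-32 checksum, the `Receiver` class, and the `"pending"`/`"written"` map values are all hypothetical choices. The claim itself only requires a WAL entry holding the tag, batch, and checksum, a map of verified tags, and a way to identify which tags have reached the destination file.

```python
import json
import zlib
from typing import Dict, List


def checksum(messages: List[str]) -> int:
    # CRC-32 over a JSON encoding of the batch (assumed algorithm and encoding).
    return zlib.crc32(json.dumps(messages).encode("utf-8"))


def wal_append(wal_path: str, tag: str, messages: List[str]) -> dict:
    """Agent: persist tag, batch, and checksum as one WAL entry before sending."""
    entry = {"tag": tag, "messages": messages, "checksum": checksum(messages)}
    with open(wal_path, "a", encoding="utf-8") as wal:
        wal.write(json.dumps(entry) + "\n")
        wal.flush()
    return entry


class Receiver:
    """Receiving location: verifies batches and tracks their tags in a map."""

    def __init__(self) -> None:
        self.tag_map: Dict[str, str] = {}  # tag -> "pending" or "written"
        self.pending: List[dict] = []

    def receive(self, entry: dict) -> bool:
        """Add the tag to the map only if the checksum verifies."""
        if checksum(entry["messages"]) != entry["checksum"]:
            return False  # corrupted in transit; the agent still holds the WAL entry
        self.tag_map[entry["tag"]] = "pending"
        self.pending.append(entry)
        return True

    def write_file(self, dest_path: str) -> List[str]:
        """Write pending batches to the destination and return the tags written."""
        written = []
        with open(dest_path, "a", encoding="utf-8") as out:
            for entry in self.pending:
                out.write("\n".join(entry["messages"]) + "\n")
                written.append(entry["tag"])
        for tag in written:
            self.tag_map[tag] = "written"
        self.pending.clear()
        return written
```

The list of tags returned by `write_file` is what an agent could use, in this sketch, to decide which WAL entries no longer need to be retained for a resend.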
19. A method for collecting and aggregating datasets with fault tolerance using a store on failure mechanism, the method, comprising:
collecting a dataset from a data source on a machine where the dataset is generated;
wherein, the dataset is collected by an agent node executed on the machine;
sending the dataset to receiving location which aggregates the dataset;
in response to determining that receiving location which is mapped to receive the dataset has failed, storing, by the agent node, the dataset in persistent storage of the machine until the receiving location has been repaired or until another destination is identified.
Dependent claims: 20, 21.
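The store-on-failure mechanism of claim 19 might look something like the sketch below. The `StoreOnFailureAgent` class, the `send` callable, the use of `ConnectionError` to signal a failed receiving location, and the one-record-per-line spool file are assumptions introduced for illustration; the claim only states that the agent node stores the dataset in persistent storage on its machine until the receiving location is repaired or another destination is identified.

```python
import os
from typing import Callable, List


class StoreOnFailureAgent:
    def __init__(self, send: Callable[[str], None], spool_path: str) -> None:
        self.send = send              # hypothetical transport; raises ConnectionError on failure
        self.spool_path = spool_path  # persistent storage on the agent's own machine

    def deliver(self, record: str) -> bool:
        """Try to send; on failure keep the record on local disk. Returns True if sent."""
        try:
            self.send(record)
            return True
        except ConnectionError:
            # Records are assumed to be single-line strings.
            with open(self.spool_path, "a", encoding="utf-8") as spool:
                spool.write(record + "\n")
            return False

    def drain(self) -> int:
        """Resend spooled records once a receiver is reachable; return the count sent."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as spool:
            records: List[str] = [line.rstrip("\n") for line in spool if line.strip()]
        sent = 0
        for record in records:
            try:
                self.send(record)
            except ConnectionError:
                break  # receiver failed again; stop and keep the rest spooled
            sent += 1
        # Rewrite the spool so it holds only the records not yet delivered.
        with open(self.spool_path, "w", encoding="utf-8") as spool:
            for record in records[sent:]:
                spool.write(record + "\n")
        return sent
```

Pointing `send` at a different destination before calling `drain` would correspond, in this sketch, to the "another destination is identified" branch of the claim.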
Specification