Collecting and aggregating log data with fault tolerance
Abstract
Systems and methods of collecting and aggregating log data with fault tolerance are disclosed. One embodiment includes one or more devices that generate log data and one or more machines, each associated with an agent node to collect the log data, wherein the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch. In one embodiment, the agent node further computes a checksum for the batch of multiple messages. The system may further include a collector device associated with a collector tier having a collector node to which the agent node sends the log data, wherein the collector node determines the checksum for the batch of multiple messages received from the agent node.
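The agent-side batching and collector-side check described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the patent text does not specify a checksum algorithm, a tag format, or a wire encoding, so CRC-32, a random UUID tag, and a length-prefixed encoding are stand-ins, and `Batch`, `make_batch`, and `verify_batch` are hypothetical names.

```python
import uuid
import zlib
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Batch:
    """One aggregate event: several log messages plus a tag and a checksum."""
    tag: str
    messages: Tuple[str, ...]
    checksum: int


def _encode(messages: Iterable[str]) -> bytes:
    # Assumed encoding: length-prefix each message so that message
    # boundaries contribute to the checksum, not just the raw text.
    parts = []
    for message in messages:
        data = message.encode("utf-8")
        parts.append(len(data).to_bytes(4, "big") + data)
    return b"".join(parts)


def make_batch(messages: Iterable[str]) -> Batch:
    """Agent side: group messages, assign a tag, compute a checksum."""
    msgs = tuple(messages)
    # Assumption: a random UUID serves as the batch tag and CRC-32 as the checksum.
    return Batch(tag=uuid.uuid4().hex, messages=msgs, checksum=zlib.crc32(_encode(msgs)))


def verify_batch(batch: Batch) -> bool:
    """Collector side: recompute the checksum and compare with the one received."""
    return zlib.crc32(_encode(batch.messages)) == batch.checksum
```

In this sketch a collector would call `verify_batch` on each received batch and decline to store one whose recomputed checksum differs from the value the agent sent.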
20 Claims
1. A system for collecting and aggregating log data with fault tolerance, the system comprising:
one or more devices that generate log data;
one or more machines each having an agent node to collect the log data, wherein, at least one agent node;
generates a batch comprising multiple messages from the log data,
assigns a tag to the batch, and
computes a checksum for the batch of multiple messages;
a collector device associated with a collector tier and having a collector node,
wherein the at least one agent node sends the batch of multiple messages as a single aggregate event, the checksum and the tag to the collector node, and
wherein, the collector node verifies the checksum for the batch of multiple messages received from the at least one agent node,
at least one master in communication with the one or more machines and the collector device,
wherein, the at least one master, using tags that are assigned to batches, acknowledges that the collector node has written the batches into a file system and informs the at least one agent node that the batches have been safely stored in the file system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method of collecting and aggregating log data with fault tolerance, the method comprising:
accessing, by one or more devices, log data;
collecting, by one or more machines each having an agent node, the log data, wherein, at least one agent node;
generates a batch comprising multiple messages from the log data,
assigns a tag to the batch, and
computes a checksum for the batch of multiple messages;
sending, by the at least one agent node, the batch, the checksum and the tag to a collector device having a collector node,
wherein the batch is sent as a single aggregate event,
wherein the collector device is associated with a collector tier,
wherein the collector node verifies the checksum for the batch of multiple messages received from the at least one agent node,
acknowledging, by at least one master in communication with the one or more machines and the collector device, that the collector node has written the batches into a file system and informing the at least one agent node that the batches have been safely stored in the file system.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A system for collection and aggregation of log data with fault tolerance, the system comprising:
means for accessing, by one or more devices, log data;
means for collecting, by one or more machines each having an agent node, the log data, wherein, at least one agent node;
generates a batch comprising multiple messages from the log data,
assigns a tag to the batch, and
computes a checksum for the batch of multiple messages;
means for sending, by the at least one agent node, the batch, the checksum and the tag to a collector device having a collector node,
wherein the batch is sent as a single aggregate event,
wherein the collector device is associated with a collector tier,
wherein the collector node verifies the checksum for the batch of multiple messages received from the at least one agent node,
means for acknowledging, by at least one master in communication with the one or more machines and the collector device, that the collector node has written the batches into a file system and informing the at least one agent node that the batches have been safely stored in the file system.
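The tag-based acknowledgement path recited in the independent claims (collector writes a batch, master acknowledges it by tag, agent learns the batch is safely stored) can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: the claims do not say whether the master pushes acknowledgements or the agent polls for them, so polling is assumed; keeping unacknowledged batches in a pending map is likewise an assumption; a dictionary stands in for the file system; and all class and method names are hypothetical.

```python
import zlib
from typing import Dict, Set


class Master:
    """Tracks, by tag, which batches the collector has written."""

    def __init__(self) -> None:
        self._stored_tags: Set[str] = set()

    def report_stored(self, tag: str) -> None:
        self._stored_tags.add(tag)

    def is_stored(self, tag: str) -> bool:
        return tag in self._stored_tags


class Collector:
    """Verifies each batch, writes it, then reports its tag to the master."""

    def __init__(self, master: Master, file_system: Dict[str, bytes]) -> None:
        self.master = master
        self.file_system = file_system  # assumption: a dict stands in for the file system

    def receive(self, tag: str, payload: bytes, checksum: int) -> bool:
        if zlib.crc32(payload) != checksum:  # assumption: CRC-32 as the checksum
            return False  # corrupted batch: not written, not reported
        self.file_system[tag] = payload
        self.master.report_stored(tag)
        return True


class Agent:
    """Sends batches and keeps each one until the master confirms its tag."""

    def __init__(self, master: Master) -> None:
        self.master = master
        self.pending: Dict[str, bytes] = {}

    def send(self, collector: Collector, tag: str, payload: bytes) -> None:
        self.pending[tag] = payload
        collector.receive(tag, payload, zlib.crc32(payload))

    def check_acks(self) -> None:
        # Assumption: the agent polls the master; a push model would also fit the claims.
        for tag in list(self.pending):
            if self.master.is_stored(tag):
                del self.pending[tag]
```

In this sketch, a batch that remains in `pending` after `check_acks` has not been confirmed as stored, so an agent could choose to resend it. Resending is a design choice of the sketch rather than something the claim language above spells out.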
Specification