Quorum based distributed anomaly detection and repair using distributed computing by stateless processes
First Claim
1. A method comprising:
detecting, by a quorum, an anomaly among a group of collectors and aggregators in a network, the group of collectors configured to receive metric data from a plurality of agents on one or more remote servers in the computer network and the group of aggregators configured to receive one or more selected metrics from the one or more collectors;
responsive to detecting the anomaly, causing a plurality of producers to initiate a repair task generation process;
receiving, at the quorum, a query from at least one producer of the plurality of producers;
determining, by the quorum, whether the repair task has already been generated by another producer of the plurality of producers;
preventing, by the quorum, the at least one producer from generating the repair task when another producer of the plurality of producers has already generated the repair task;
causing, by the quorum, the at least one producer to generate the repair task when another producer has not already generated the repair task;
detecting that the repair task has been generated by at least one producer of the plurality of producers;
assigning the generated repair task to a worker; and
completing the generated repair task at the assigned worker.
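The query step in the claim above, in which the quorum lets only one producer generate the repair task, can be illustrated with a minimal sketch. The claim does not prescribe an implementation: the `Quorum` class, the `may_generate` method, the in-memory dictionary, and the anomaly and producer identifiers are all hypothetical, and a real quorum would be replicated across several nodes rather than held in one process. The sketch also assumes that the first producer to query is recorded as the generator at the moment of the query.

```python
import threading


class Quorum:
    """Hypothetical single-process stand-in for the quorum (assumption)."""

    def __init__(self):
        self._generated_by = {}  # anomaly_id -> producer_id that claimed it
        self._lock = threading.Lock()

    def may_generate(self, anomaly_id, producer_id):
        """Answer a producer's query: True only if no other producer has
        already generated the repair task for this anomaly."""
        with self._lock:
            if anomaly_id in self._generated_by:
                return False  # prevent duplicate generation
            self._generated_by[anomaly_id] = producer_id
            return True  # cause this producer to generate the task


def run_producers(quorum, anomaly_id, producer_ids):
    """Every producer starts the generation process, but each one queries
    the quorum before generating, so at most one task results."""
    tasks = []
    for producer_id in producer_ids:
        if quorum.may_generate(anomaly_id, producer_id):
            tasks.append({"anomaly": anomaly_id, "generated_by": producer_id})
    return tasks


tasks = run_producers(Quorum(), "collector-7-unreachable", ["p1", "p2", "p3"])
print(tasks)
```

Because the check and the record happen under one lock, producers that query later see the existing record and do not generate a second task.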
Abstract
Quorum based anomaly detection utilizes multiple entities to detect and attempt to configure a repair task for an anomaly. Once the repair task is generated, a system is used to assign the task to a worker entity while recording the responsibility of that task with the worker in a persistent storage. If the worker entity crashes, the degraded worker status will eventually be detected, and all tasks associated with that worker will be re-assigned. Once a worker finishes a task, the assignment information for the task is transitioned to a completed state.
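The assignment lifecycle described in the abstract (record the assignment, re-assign when a worker is degraded, mark completion) might be modeled as follows. This is a sketch under stated assumptions, not the patented implementation: `AssignmentStore`, its method names, the dictionary standing in for persistent storage, and the round-robin choice of replacement worker are all hypothetical. It also assumes that only unfinished tasks need re-assignment.

```python
from enum import Enum


class State(Enum):
    ASSIGNED = "assigned"
    COMPLETED = "completed"


class AssignmentStore:
    """Hypothetical stand-in for persistent storage, modeled as a dict."""

    def __init__(self):
        self.records = {}  # task_id -> {"worker": str, "state": State}

    def assign(self, task_id, worker):
        # Record which worker is responsible for the task.
        self.records[task_id] = {"worker": worker, "state": State.ASSIGNED}

    def complete(self, task_id):
        # Transition the assignment to the completed state.
        self.records[task_id]["state"] = State.COMPLETED

    def reassign_from(self, failed_worker, healthy_workers):
        """Move each unfinished task of a degraded worker to a healthy
        worker, chosen round-robin (assumption). Returns moved task ids."""
        pending = [
            task_id
            for task_id, rec in self.records.items()
            if rec["worker"] == failed_worker and rec["state"] is State.ASSIGNED
        ]
        for i, task_id in enumerate(pending):
            self.assign(task_id, healthy_workers[i % len(healthy_workers)])
        return pending


store = AssignmentStore()
store.assign("t1", "w1")
store.assign("t2", "w1")
store.assign("t3", "w2")
store.complete("t1")
moved = store.reassign_from("w1", ["w2", "w3"])
print(moved, store.records["t2"]["worker"])
```

Keeping the worker identity in each record is what makes recovery possible: once a worker is detected as degraded, its tasks can be found by a simple lookup.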
21 Claims
1. A method comprising:
detecting, by a quorum, an anomaly among a group of collectors and aggregators in a network, the group of collectors configured to receive metric data from a plurality of agents on one or more remote servers in the computer network and the group of aggregators configured to receive one or more selected metrics from the one or more collectors;
responsive to detecting the anomaly, causing a plurality of producers to initiate a repair task generation process;
receiving, at the quorum, a query from at least one producer of the plurality of producers;
determining, by the quorum, whether the repair task has already been generated by another producer of the plurality of producers;
preventing, by the quorum, the at least one producer from generating the repair task when another producer of the plurality of producers has already generated the repair task;
causing, by the quorum, the at least one producer to generate the repair task when another producer has not already generated the repair task;
detecting that the repair task has been generated by at least one producer of the plurality of producers;
assigning the generated repair task to a worker; and
completing the generated repair task at the assigned worker.
Dependent Claims (2, 3, 4, 5, 6, 7)
8. A non-transitory computer readable storage medium having embodied thereon a program, the program being executable by a processor to perform a method for processing metrics, the method comprising:
detecting, at a quorum, an anomaly among a group of collectors and aggregators in a network, the group of collectors configured to receive metric data from a plurality of agents on one or more remote servers in the computer network and the group of aggregators configured to receive one or more selected metrics from the one or more collectors;
responsive to detecting the anomaly, causing a plurality of producers to initiate a repair task generation process;
receiving, at the quorum, a query from at least one producer of the plurality of producers;
determining, at the quorum, whether the repair task has already been generated by another producer of the plurality of producers;
preventing, at the quorum, the at least one producer from generating the repair task when another producer of the plurality of producers has already generated the repair task;
causing, at the quorum, the at least one producer to generate the repair task when another producer has not already generated the repair task;
detecting that the repair task has been generated by at least one producer of the plurality of producers;
assigning the generated repair task to a worker; and
completing the generated repair task at the assigned worker.
Dependent Claims (9, 10, 11, 12, 13, 14)
15. A system, comprising:
one or more network interfaces to communicate in a computer network;
a processor coupled to the network interfaces and configured to execute one or more processes; and
a memory configured to store a process executable by the processor, the process when executed operable to:
detect, at a quorum, an anomaly among a group of collectors and aggregators in a network, the group of collectors configured to receive metric data from a plurality of agents on one or more remote servers in the computer network and the group of aggregators configured to receive one or more selected metrics from the one or more collectors;
cause a plurality of producers to initiate a repair task generation process responsive to detecting the anomaly;
receive, at the quorum, a query from at least one producer of the plurality of producers;
determine, at the quorum, whether the repair task has already been generated by another producer of the plurality of producers;
prevent, at the quorum, the at least one producer from generating the repair task when another producer of the plurality of producers has already generated the repair task;
cause the at least one producer to generate the repair task when another producer has not already generated the repair task;
detect that the repair task has been generated by at least one producer of the plurality of producers;
assign the generated repair task to a worker; and
complete the generated repair task at the assigned worker.
Dependent Claims (16, 17, 18, 19, 20, 21)
Specification