Cluster performance monitoring
Abstract
Embodiments are directed towards the visualization of machine data received from computing clusters. Embodiments may enable improved analysis of computing cluster performance, error detection, troubleshooting, error prediction, or the like. Individual cluster nodes may generate machine data that includes information and data regarding the operation and status of the cluster node. The machine data is received from each cluster node for indexing by one or more indexing applications. The indexed machine data including the complete data set may be stored in one or more index stores. A visualization application enables a user to select one or more analysis lenses that may be used to generate visualizations of the machine data. The visualization application employs the analysis lens to produce visualizations of the computing cluster machine data.
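The flow described in the abstract (index every record per node, then apply a user-selected analysis lens to produce a visualization) might be sketched as follows. This is only an illustration under stated assumptions: the `IndexStore` class, the `LENSES` table, the record fields `level` and `latency_ms`, and the text-bar `render` function are all hypothetical names invented here and are not taken from the patent.

```python
from collections import defaultdict


class IndexStore:
    """Hypothetical in-memory index store: keeps every received record, grouped by node."""

    def __init__(self):
        self._by_node = defaultdict(list)

    def index(self, node, record):
        # Store the complete record; nothing is summarised or dropped at index time.
        self._by_node[node].append(record)

    def nodes(self):
        return sorted(self._by_node)

    def records(self, node):
        return list(self._by_node.get(node, []))


# Assumption: an "analysis lens" is modelled as a function that maps
# one node's records to a single number.
LENSES = {
    "error_count": lambda recs: sum(1 for r in recs if r.get("level") == "ERROR"),
    "max_latency_ms": lambda recs: max((r.get("latency_ms", 0) for r in recs), default=0),
}


def render(store, lens_name, width=20):
    """Text 'visualization': one bar per node, scaled to the largest lens value."""
    lens = LENSES[lens_name]
    values = {n: lens(store.records(n)) for n in store.nodes()}
    top = max(values.values(), default=0) or 1  # avoid dividing by zero
    return [f"{n:<8} {'#' * round(width * v / top)} {v}" for n, v in values.items()]
```

Because the store keeps the full records, switching lenses only changes the function applied at render time; the indexed data does not need to be re-ingested.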
24 Claims
1. A computer-implemented method comprising:
receiving machine data from a computing cluster, the computing cluster including a plurality of computational cluster nodes coordinating in operation;
generating time stamped events from the received machine data, each time stamped event having a time stamp derived from time stamp data parsed from the received machine data;
analyzing, for each computational cluster node of the plurality of computational cluster nodes, a metric characterizing an aspect of computational performance of the computational cluster node, wherein the metric is analyzed based on values included in a set of time stamped events;
computing an event pattern using analyzed metrics;
monitoring whether the event pattern is indicative of a previously determined or known problem for operation of the computing cluster using a heuristic analysis; and
generating a notification when the event pattern is indicative of a previously determined or known problem for operation of the computing cluster.
Dependent claims: 2, 3, 4, 5, 6, 7
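The sequence of steps recited in claim 1 might be sketched in code as follows. This is a minimal illustration, not the patented implementation: the log line format, the choice of mean CPU utilisation as the metric, the 90% threshold, and the "load imbalance" heuristic are all assumptions made here for the example, and every function name is hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Assumed (hypothetical) line format: "<ISO timestamp> node=<name> cpu=<percent>"
LINE_RE = re.compile(r"^(?P<ts>\S+) node=(?P<node>\S+) cpu=(?P<cpu>\d+(?:\.\d+)?)$")


def make_events(raw_lines):
    """Generate time stamped events; the time stamp is parsed from the data itself."""
    events = []
    for line in raw_lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not match the assumed format
        events.append({
            "time": datetime.fromisoformat(m["ts"]),
            "node": m["node"],
            "cpu": float(m["cpu"]),
        })
    return events


def analyze_metric(events):
    """Per-node metric (assumed): mean CPU utilisation over the node's events."""
    by_node = defaultdict(list)
    for e in events:
        by_node[e["node"]].append(e["cpu"])
    return {node: mean(vals) for node, vals in by_node.items()}


def compute_pattern(metrics, hot=90.0):
    """Event pattern (assumed): sorted tuple of nodes whose metric exceeds `hot`."""
    return tuple(sorted(n for n, v in metrics.items() if v > hot))


def matches_known_problem(pattern, node_count):
    """Heuristic (assumed): exactly one hot node in a multi-node cluster."""
    if node_count > 1 and len(pattern) == 1:
        return "load imbalance"
    return None


def monitor(raw_lines):
    """Run the steps in order and return a notification string, or None."""
    events = make_events(raw_lines)
    metrics = analyze_metric(events)
    pattern = compute_pattern(metrics)
    problem = matches_known_problem(pattern, len(metrics))
    if problem:
        return f"ALERT: {problem} on {', '.join(pattern)}"
    return None
```

Each function corresponds to one claimed step, so a different metric or heuristic could be swapped in without touching the event-generation stage.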
8. A network device comprising:
a device, implemented at least partially in hardware, that receives machine data from a computing cluster, the computing cluster including a plurality of computational cluster nodes coordinating in operation;
a device, implemented at least partially in hardware, that generates time stamped events from the received machine data, each time stamped event having a time stamp derived from time stamp data parsed from the received machine data;
a device, implemented at least partially in hardware, that analyzes, for each computational cluster node of the plurality of computational cluster nodes, a metric characterizing an aspect of computational performance of the computational cluster node, wherein the metric is analyzed based on values included in a set of time stamped events;
a device, implemented at least partially in hardware, that computes an event pattern using analyzed metrics;
a device, implemented at least partially in hardware, that monitors whether the event pattern is indicative of a previously determined or known problem for operation of the computing cluster using a heuristic analysis; and
a device, implemented at least partially in hardware, that generates a notification when the event pattern is indicative of the previously determined or known problem for operation of the computing cluster.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitive storage medium that includes a plurality of instructions, wherein execution of at least a portion of the instructions by a processor device enables a plurality of actions, the actions comprising:
receiving machine data from a computing cluster, the computing cluster including a plurality of computational cluster nodes coordinating in operation;
generating time stamped events from the received machine data, each time stamped event having a time stamp derived from time stamp data parsed from the received machine data;
analyzing, for each computational cluster node of the plurality of computational cluster nodes, a metric characterizing an aspect of computational performance of the computational cluster node, wherein the metric is analyzed based on values included in a set of time stamped events;
computing an event pattern using analyzed metrics;
monitoring whether the event pattern is indicative of a previously determined or known problem for operation of the computing cluster using a heuristic analysis; and
generating a notification when the event pattern is indicative of a previously determined or known problem for operation of the computing cluster.
Dependent claims: 16, 17, 18, 19, 20, 21
22. A system comprising:
a plurality of nodes; and
a network device, including;
a memory device for storing instructions; and
a processor device that executes at least a portion of the stored instructions to enable a plurality of actions, the actions including;
receiving machine data from a computing cluster, the computing cluster including a plurality of computational cluster nodes coordinating in operation;
generating time stamped events from the received machine data, each time stamped event having a time stamp derived from time stamp data parsed from the received machine data;
analyzing, for each computational cluster node of the plurality of computational cluster nodes, a metric characterizing an aspect of computational performance of the computational cluster node, wherein the metric is analyzed based on values included in a set of time stamped events;
computing an event pattern using analyzed metrics;
monitoring whether the event pattern is indicative of a previously determined or known problem for operation of the computing cluster using a heuristic analysis; and
generating a notification when the event pattern is indicative of the previously determined or known problem for operation of the computing cluster.
Dependent claims: 23, 24
Specification