System and method for graph based monitoring and management of distributed systems

  • US 10,353,800 B2
  • Filed: 10/18/2017
  • Issued: 07/16/2019
  • Est. Priority Date: 10/18/2017
  • Status: Active Grant
First Claim

1. A method comprising:

  • generating, by one or more processors, for each instance of a plurality of instances of a moving window, first metrics indicative of at least one of central processing unit (CPU) utilization, memory utilization, or disk utilization by each of a plurality of servers of a distributed streaming system and second metrics indicative of at least one of throughput or latency of each of the plurality of servers of the distributed streaming system;

    generating, by the one or more processors, a topology graph including a plurality of vertices representing the plurality of servers and a plurality of edges representing data flow among the plurality of servers;

    generating, by the one or more processors, at least one first metrics graph corresponding to the first metrics for each server of the plurality of servers and for each instance of the plurality of instances of the moving window based in part on the topology graph, wherein each vertex of the first metrics graph represents one of the servers of the plurality of servers and each edge between each pair of vertices of the first metrics graph is indicative of the first metrics of a first server represented by a first vertex of the pair of vertices of the first metrics graph being within a predetermined threshold of the first metrics of a second server represented by a second vertex of the pair of vertices of the first metrics graph;

    generating, by the one or more processors, at least one second metrics graph corresponding to the second metrics for each server of the plurality of servers and for each instance of the plurality of instances of the moving window based in part on the topology graph wherein each vertex of the second metrics graph represents one of the servers of the plurality of servers and each edge between each pair of vertices of the second metrics graph is indicative of the second metrics of a first server represented by a first vertex of the pair of vertices of the second metrics graph being within a predetermined threshold of the second metrics of a second server represented by a second vertex of the pair of vertices of the second metrics graph;

    identifying, by the one or more processors, one or more differences between at least one of the first metrics graph at a first instance of the plurality of instances of the moving window and the first metrics graph at a second instance of the plurality of instances of the moving window or the second metrics graph at the first instance and the second metrics graph at the second instance; and

    displaying, by the one or more processors, the one or more differences as indicative of a malfunction of the distributed streaming system.
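
The graph construction and comparison steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each server reports a single scalar metric per window instance, that "based in part on the topology graph" means only server pairs joined by a data-flow edge are considered, and that a difference is an edge present in one window's metrics graph but not the other's. The names metrics_graph and graph_differences, the server names, and the sample values are all hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def metrics_graph(
    topology: Iterable[Tuple[str, str]],
    metrics: Dict[str, float],
    threshold: float,
) -> Set[Edge]:
    """Build the edge set of a metrics graph for one window instance.

    Assumption: an edge is kept only if the two servers are connected in the
    topology graph AND their metric values differ by at most `threshold`.
    """
    edges: Set[Edge] = set()
    for a, b in topology:
        if abs(metrics[a] - metrics[b]) <= threshold:
            edges.add(frozenset((a, b)))
    return edges


def graph_differences(first: Set[Edge], second: Set[Edge]) -> Set[Edge]:
    """Return the edges that appear in exactly one of the two metrics graphs."""
    return first ^ second


# Hypothetical three-server pipeline: a -> b -> c.
topology = [("a", "b"), ("b", "c")]

# Hypothetical CPU utilization per server for two instances of the moving window.
window_1 = {"a": 0.50, "b": 0.55, "c": 0.60}
window_2 = {"a": 0.50, "b": 0.55, "c": 0.95}

g1 = metrics_graph(topology, window_1, threshold=0.1)
g2 = metrics_graph(topology, window_2, threshold=0.1)

diff = graph_differences(g1, g2)
print(sorted(tuple(sorted(edge)) for edge in diff))  # prints [('b', 'c')]
```

In this sketch the edge between b and c disappears in the second window because the utilization of server c has drifted away from that of its neighbor, and that changed edge is what would be surfaced as a possible malfunction. The same two functions would be applied separately to the throughput or latency values to obtain the second metrics graph.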
