MDL-BASED CLUSTERING FOR APPLICATION DEPENDENCY MAPPING
First Claim
1. A method comprising:
capturing network data using at least a first plurality of sensors executing on a plurality of endpoints of a network and a second plurality of sensors executing on a plurality of networking devices connected to the plurality of endpoints;
determining a graph using the network data, the graph including at least a plurality of nodes corresponding to the plurality of endpoints and a plurality of edges corresponding to a plurality of observed flows included in the network data;
identifying a first clustering corresponding to a minimum description length (MDL) score that is computed based at least in part on observed edges of the graph and unobserved edges of the graph; and
determining an optimum number of clusters for the plurality of endpoints based on a number of clusters for the first clustering.
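The graph-determination step of the claim can be illustrated with a minimal sketch. It assumes flow records have already been collected from the sensors and reduced to source, destination, and destination port; the `Flow` record, its field names, and the helper functions are hypothetical and do not come from the patent text.

```python
from collections import namedtuple

# Hypothetical flow record; the field names are assumptions for illustration.
Flow = namedtuple("Flow", ["src", "dst", "dst_port"])

def build_graph(flows):
    """Return (nodes, edges): endpoints as nodes, observed flows as directed edges."""
    nodes = set()
    edges = set()
    for f in flows:
        nodes.update((f.src, f.dst))
        edges.add((f.src, f.dst))  # a set collapses repeated reports of one flow
    return nodes, edges

def unobserved_edges(nodes, edges):
    """Ordered endpoint pairs between which no flow was observed."""
    return {(u, v) for u in nodes for v in nodes if u != v and (u, v) not in edges}

flows = [
    Flow("10.0.0.1", "10.0.0.3", 443),
    Flow("10.0.0.2", "10.0.0.3", 443),
    Flow("10.0.0.1", "10.0.0.3", 443),  # same flow reported again, e.g. by a second sensor
]
nodes, edges = build_graph(flows)
missing = unobserved_edges(nodes, edges)
```

Keeping the unobserved pairs explicit matters because the claimed MDL score depends on both observed and unobserved edges of the graph.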
Abstract
Application dependency mapping (ADM) can be automated in a network. The network can determine an optimum number of clusters for the network using the minimum description length (MDL) principle. The network can capture network and associated data using a sensor network that provides multiple perspectives and generate a graph therefrom. The nodes of the graph can include sources, destinations, and destination ports identified in the captured data, and the edges of the graph can include observed flows from the sources to the destinations at the destination ports. Each clustering can be evaluated according to an MDL score. The optimum number of clusters for the network may correspond to the number of clusters of the clustering associated with the minimum MDL score.
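The selection rule described in the abstract can be sketched as follows. The abstract does not give the scoring formula, so the two-part code used here is an assumption: the model cost is a cluster label per node plus an edge count per block of cluster pairs, and the data cost is the binary entropy of each block's edge density. The function names, the toy client/server graph, and the hand-written candidate clusterings are all hypothetical.

```python
import math
from itertools import product

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 for a fully empty or full block."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mdl_score(nodes, edges, assignment):
    """Assumed two-part description length, in bits, of the graph under a clustering.

    assignment maps each node to a cluster label; edges are directed, no self-loops.
    """
    clusters = {}
    for node in nodes:
        clusters.setdefault(assignment[node], []).append(node)
    # Model cost: one cluster label per node.
    bits = len(nodes) * math.log2(len(clusters))
    for a, b in product(clusters, repeat=2):
        size_a, size_b = len(clusters[a]), len(clusters[b])
        possible = size_a * (size_b - 1) if a == b else size_a * size_b
        observed = sum(1 for u, v in edges
                       if assignment[u] == a and assignment[v] == b)
        bits += math.log2(possible + 1)  # model cost: edge count of this block
        if possible:
            # Data cost: covers observed and unobserved edges of the block.
            bits += possible * binary_entropy(observed / possible)
    return bits

# Toy graph: every client talks to every server, nothing else.
clients, servers = ["c1", "c2", "c3"], ["s1", "s2"]
nodes = clients + servers
edges = {(c, s) for c in clients for s in servers}

candidates = [
    {n: 0 for n in nodes},                        # one cluster
    {**{c: "client" for c in clients},
     **{s: "server" for s in servers}},           # two clusters
    {n: n for n in nodes},                        # every node alone
]
scores = [(mdl_score(nodes, edges, a), len(set(a.values()))) for a in candidates]
best_score, optimum_k = min(scores)
```

In this toy case the two-cluster candidate has the lowest score, so `optimum_k` is 2: a single cluster pays for a mixed-density block, while singleton clusters pay heavily in model cost.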
20 Claims
1. A method comprising:
capturing network data using at least a first plurality of sensors executing on a plurality of endpoints of a network and a second plurality of sensors executing on a plurality of networking devices connected to the plurality of endpoints;
determining a graph using the network data, the graph including at least a plurality of nodes corresponding to the plurality of endpoints and a plurality of edges corresponding to a plurality of observed flows included in the network data;
identifying a first clustering corresponding to a minimum description length (MDL) score that is computed based at least in part on observed edges of the graph and unobserved edges of the graph; and
determining an optimum number of clusters for the plurality of endpoints based on a number of clusters for the first clustering.
13. A system comprising:
a processor; and
memory including instructions that, upon being executed by the processor, cause the system to:
capture network data using at least a first respective sensor executing on each of a plurality of endpoints of a network and a second respective sensor executing on each of a plurality of networking devices connected to the plurality of endpoints;
determine a graph using the network data, the graph including at least a plurality of nodes, corresponding to the plurality of endpoints and a plurality of endpoint ports, and a plurality of edges corresponding to a plurality of observed flows included in the network data; and
determine an optimum number of clusters for the plurality of endpoints based on a number of clusters of a first clustering that corresponds to a minimum description length (MDL) score.
17. A non-transitory computer-readable medium having computer readable instructions that, upon being executed by a processor, cause the processor to:
capture network data using at least a first plurality of sensors executing on a plurality of endpoints of a network and a second plurality of sensors executing on a plurality of networking devices connected to the plurality of endpoints;
determine a graph using the network data, the graph including at least a plurality of nodes, corresponding to the plurality of endpoints and a plurality of endpoint ports, and a plurality of edges corresponding to a plurality of observed flows included in the network data;
identify a first clustering corresponding to a minimum description length (MDL) score that is computed based at least in part on observed edges of the graph and unobserved edges of the graph; and
determine an optimum number of clusters for the plurality of endpoints based on a number of clusters for the first clustering.
Specification