Detecting network anomalies using node scoring
First Claim
1. A computer-implemented method comprising:
- receiving training data, the training data having a data structure including a first directed graph, the first directed graph having first nodes and first directed edges between the first nodes, each of the first nodes representing a respective first computer in a network in a first time period, each of the first directed edges representing a respective first request from a first computer to access another first computer that occurred in the first time period;
determining, using a link analysis algorithm, a respective first node score for each of the first nodes in the first directed graph, each of the respective first node scores indicating a respective popularity of each of the first nodes during the first time period;
receiving test data, the test data having a data structure including a second directed graph, the second directed graph having second nodes and second directed edges, each of the second nodes representing a respective second computer in the network in a second time period, each of the second edges representing a respective second request from a second computer to access another second computer that occurred in the second time period;
generating a reduced second directed graph including;
identifying edges that appear respectively between nodes occurring in both the first directed graph and the second directed graph, andremoving, from the second directed graph, the identified edges that also appear in the first directed graph;
determining, using the link analysis algorithm, a respective second node score for each of the second nodes in the reduced second directed graph, each of the respective second node scores indicating a respective popularity of each of the second nodes during the second time period;
for each particular node that is in the first directed graph and the reduced second directed graph, determining a respective difference between the first node score of the particular node and the second node score of the particular node; and
designating a particular computer corresponding to a particular node as an anomalous computer upon determining that the difference corresponding to the particular node exceeds a threshold value,wherein the method is performed by one or more computers.
1 Assignment
0 Petitions
Accused Products
Abstract
Systems, methods, and computer program products for detecting network anomalies using node scoring. A network analysis system designates each server computer in a distributed computing network as a node in a graph. The network analysis system constructs a first graph of the distributed computing network for a training period. The system then determines a respective first node score of each node. The system constructs a second graph of the distributed computing network for a test time period. The system then reduces the second graph by removing those edges from the second graph that appear in both the first graph and the second graph. The system determines a respective second node score of each node in the reduced second graph. The system computes differences between the first node scores and the second node scores. The system designates nodes associated with the highest differences as anomalous nodes.
-
Citations
14 Claims
-
1. A computer-implemented method comprising:
-
receiving training data, the training data having a data structure including a first directed graph, the first directed graph having first nodes and first directed edges between the first nodes, each of the first nodes representing a respective first computer in a network in a first time period, each of the first directed edges representing a respective first request from a first computer to access another first computer that occurred in the first time period; determining, using a link analysis algorithm, a respective first node score for each of the first nodes in the first directed graph, each of the respective first node scores indicating a respective popularity of each of the first nodes during the first time period; receiving test data, the test data having a data structure including a second directed graph, the second directed graph having second nodes and second directed edges, each of the second nodes representing a respective second computer in the network in a second time period, each of the second edges representing a respective second request from a second computer to access another second computer that occurred in the second time period; generating a reduced second directed graph including; identifying edges that appear respectively between nodes occurring in both the first directed graph and the second directed graph, and removing, from the second directed graph, the identified edges that also appear in the first directed graph; determining, using the link analysis algorithm, a respective second node score for each of the second nodes in the reduced second directed graph, each of the respective second node scores indicating a respective popularity of each of the second nodes during the second time period; for each particular node that is in the first directed graph and the reduced second directed graph, determining a respective difference between the first node score of the particular node and the second node score of the particular node; and designating a particular computer corresponding to a particular node as an anomalous computer upon determining that the difference corresponding to the particular node exceeds a threshold value, wherein the method is performed by one or more computers. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
-
receiving training data, the training data having a data structure including a first directed graph, the first directed graph having first nodes and first directed edges between the first nodes, each of the first nodes representing a respective first computer in a network in a first time period, each of the first directed edges representing a respective first request from a first computer to access another first computer that occurred in the first time period; determining, using a link analysis algorithm, a respective first node score for each of the first nodes in the first directed graph, each of the respective first node scores indicating a respective popularity of each of the first nodes during the first time period; receiving test data, the test data having a data structure including a second directed graph, the second directed graph having second nodes and second directed edges, each of the second nodes representing a respective second computer in the network in a second time period, each of the second edges representing a respective second request from a second computer to access another second computer that occurred in the second time period; generating a reduced second directed graph including; identifying edges that appear respectively between nodes occurring in both the first directed graph and the second directed graph, and removing, from the second directed graph, the identified edges that also appear in the first directed graph; determining, using the link analysis algorithm, a respective second node score for each of the second nodes in the reduced second directed graph, each of the respective second node scores indicating a respective popularity of each of the second nodes during the second time period; for each particular node that is in the first directed graph and the reduced second directed graph, determining a respective difference between the first node score of the particular node and the second node score of the particular node; and designating a particular computer corresponding to a particular node as an anomalous computer upon determining that the difference corresponding to the particular node exceeds a threshold value. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A non-transitory computer storage medium encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
-
receiving training data, the training data having a data structure including a first directed graph, the first directed graph having first nodes and first directed edges between the first nodes, each of the first nodes representing a respective first computer in a network in a first time period, each of the first directed edges representing a respective first request from a first computer to access another first computer that occurred in the first time period; determining, using a link analysis algorithm, a respective first node score for each of the first nodes in the first directed graph, each of the respective first node scores indicating a respective popularity of each of the first nodes during the first time period; receiving test data, the test data having a data structure including a second directed graph, the second directed graph having second nodes and second directed edges, each of the second nodes representing a respective second computer in the network in a second time period, each of the second edges representing a respective second request from a second computer to access another second computer that occurred in the second time period; generating a reduced second directed graph including; identifying edges that appear respectively between nodes occurring in both the first directed graph and the second directed graph, and removing, from the second directed graph, the identified edges that also appear in the first directed graph; determining, using the link analysis algorithm, a respective second node score for each of the second nodes in the reduced second directed graph, each of the respective second node scores indicating a respective popularity of each of the second nodes during the second time period; for each particular node that is in the first directed graph and the reduced second directed graph, determining a respective difference between the first node score of the particular node and the second node score of the particular node; and designating a particular computer corresponding to a particular node as an anomalous computer upon determining that the difference corresponding to the particular node exceeds a threshold value. - View Dependent Claims (12, 13, 14)
-
Specification