Anomaly detection in groups of network addresses
First Claim
1. A method for identifying anomalies in a group of network addresses, comprising:
inputting, with a data processor, a plurality of network addresses;
parsing said plurality of network addresses, with said data processor, into at least one tree data structure, each tree data structure comprising a plurality of nodes wherein successive nodes in said tree data structure represent successive portions of said network addresses;
during said parsing, assigning a respective ripeness score to each of said nodes, said respective ripeness score indicating a number of occurrences of each of said nodes in said plurality of network addresses;
building a model of normal behavior from tree data structure nodes assigned respective ripeness scores within a specified range of ripeness scores and excluding from said tree data structure nodes with assigned respective ripeness score outside said specified range; and
for an input network address;
traversing said model of network behavior along said input network address;
identifying whether said input network address is anomalous based on a deviation of said network address from said traversed model, said deviation being zero when said traversing said model of network behavior along said input network address leads to a leaf node;
when an anomalous network address is identified, calculating an abnormality score indicating said deviation of said anomalous network address from said model and reclassifying said anomalous network address as normal when said abnormality score is below a specified level; and
when said tree data structure comprises less than specified number of leaves and at least some of said leaves have respective ripeness scores greater than a specified ripeness score, recalculating said abnormality score for said identified anomalous network address.
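The parsing, model-building and traversal steps recited above can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes URL-style addresses split on "/", uses a plain occurrence count as the ripeness score, and measures deviation as the number of address portions the model cannot follow. The names Node, portions, parse, build_model and deviation are hypothetical, and treating a traversal that stops at an internal node as a deviation of 1 is an assumption the claim does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int = 0  # ripeness score: occurrences of this address portion
    children: dict = field(default_factory=dict)


def portions(address: str) -> list[str]:
    """Split an address into successive portions (host, then path segments)."""
    _, _, rest = address.rpartition("://")  # drop a scheme such as "http" if present
    return [p for p in rest.split("/") if p]


def parse(addresses) -> Node:
    """Parse addresses into a tree, counting occurrences of every node."""
    root = Node()
    for address in addresses:
        node = root
        for portion in portions(address):
            node = node.children.setdefault(portion, Node())
            node.count += 1
    return root


def build_model(node: Node, low: int, high: int) -> Node:
    """Copy the tree, excluding nodes whose count lies outside [low, high]."""
    model = Node(node.count)
    for portion, child in node.children.items():
        if low <= child.count <= high:
            model.children[portion] = build_model(child, low, high)
    return model


def deviation(model: Node, address: str) -> int:
    """Return 0 when the traversal reaches a leaf, else the unmatched portion count."""
    node = model
    parts = portions(address)
    for i, portion in enumerate(parts):
        if portion not in node.children:
            return len(parts) - i
        node = node.children[portion]
    # Assumption: stopping at an internal node counts as a deviation of 1.
    return 0 if not node.children else 1


training = (
    ["http://shop.example/cart/view"] * 3
    + ["http://shop.example/cart/add"] * 3
    + ["http://shop.example/tmp/x1"]
)
model = build_model(parse(training), low=2, high=100)
print(deviation(model, "http://shop.example/cart/view"))  # 0: traversal ends at a leaf
print(deviation(model, "http://shop.example/tmp/x1"))     # 2: "tmp" was excluded from the model
```

In this sketch the rarely seen "tmp" branch is dropped from the model, so an address under it deviates from the first portion the model does not contain.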
Abstract
A method for identifying anomalies in a group of network addresses includes building a model of the group of network addresses and identifying a network address as anomalous based on the deviation of the network address from the model. The model is built from a group of network addresses. The network addresses are input and parsed into one or more address trees. A ripeness score is maintained for each of the nodes in the address trees, based, at least in part, on the number of occurrences of the network address portion represented by the node. Nodes having respective ripeness scores within a specified range are classified as ripe nodes, and may be indicative of normal behavior, and nodes having respective ripeness scores outside the specified range of ripeness scores are classified as unripe, and may be indicative of anomalous behavior.
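The ripe/unripe classification described in the abstract can be illustrated with a flat table of address prefixes, where each prefix stands for one node of an address tree. This is only a sketch under stated assumptions: the ripeness score is taken to be the raw occurrence count (the abstract says it is based "at least in part" on that count), and the function names prefix_counts and classify, as well as the sample addresses and range bounds, are hypothetical.

```python
from collections import Counter


def prefix_counts(addresses):
    """Count how often each address prefix (one tree node per prefix) occurs."""
    counts = Counter()
    for address in addresses:
        parts = [p for p in address.split("/") if p]
        for depth in range(1, len(parts) + 1):
            counts[tuple(parts[:depth])] += 1
    return counts


def classify(counts, low, high):
    """Split nodes into ripe (count within [low, high]) and unripe (outside it)."""
    ripe = {prefix for prefix, n in counts.items() if low <= n <= high}
    unripe = set(counts) - ripe
    return ripe, unripe


counts = prefix_counts(["10.0.0.1/login", "10.0.0.1/login", "10.0.0.1/admin"])
ripe, unripe = classify(counts, low=2, high=10)
print(sorted(ripe))    # [('10.0.0.1',), ('10.0.0.1', 'login')]
print(sorted(unripe))  # [('10.0.0.1', 'admin')]
```

Because the range has an upper bound as well as a lower one, a node seen too often can also fall outside it and be treated as unripe.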
23 Claims
17. A system for identifying anomalies in a group of network addresses, comprising:
a non-transient computer-readable storage medium storing code instructions; and
a processor coupled to said storage medium and adapted to execute the stored code, the code comprising;
instructions for inputting a plurality of network addresses;
instructions for parsing a plurality of network addresses into at least one data structure, each tree data structure comprising a plurality of nodes wherein successive nodes in said tree data structure represent successive portions of the network address;
instructions for, during said parsing, assigning a respective ripeness score to each of said nodes, said respective ripeness score indicating a number of occurrences of each of said nodes in said plurality of network addresses;
instructions for building a model of normal behavior from tree data structure nodes assigned respective ripeness scores within a specified range of ripeness scores and excluding from said tree data structure nodes with assigned respective ripeness score outside said specified range;
instructions for traversing said model of network behavior along an input network address;
instructions for identifying whether said input network address is anomalous based on a deviation of said network address from said traversed model, said deviation being zero when said traversing said model of network behavior along said input network address leads to a leaf node;
instructions for, when an anomalous network address is identified, calculating an abnormality score indicating said deviation of said anomalous network address from said model, and reclassifying said anomalous network address as normal when said abnormality score is below a specified level; and
instructions for, when said tree data structure comprises less than specified number of leaves and at least some of said leaves have respective ripeness scores greater than a specified ripeness score, recalculating said abnormality score for said identified anomalous network address.
- View Dependent Claims (18, 19, 20, 21)
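The scoring, reclassification and recalculation steps in the last two claim elements can be sketched as below. The claims do not say how the abnormality score is computed or how it is recalculated, so everything numeric here is an assumption made for illustration: the score is taken as the fraction of address portions the model could not follow, the recalculation multiplies it by a hypothetical boost factor, and the recalculation is applied before the comparison with the specified level. The function and parameter names are hypothetical as well.

```python
def abnormality_score(deviation: int, total_portions: int) -> float:
    """Hypothetical score: fraction of the address the model could not follow."""
    return deviation / total_portions if total_portions else 0.0


def needs_recalculation(leaf_ripeness, max_leaves: int, ripeness_floor: int) -> bool:
    """Claimed trigger: fewer than max_leaves leaves, at least some of them very ripe."""
    return len(leaf_ripeness) < max_leaves and any(
        r > ripeness_floor for r in leaf_ripeness
    )


def verdict(deviation, total_portions, leaf_ripeness, *,
            level, max_leaves, ripeness_floor, boost=2.0):
    """Return (label, score) for one input address."""
    if deviation == 0:
        return "normal", 0.0  # traversal reached a leaf, so there is no deviation
    score = abnormality_score(deviation, total_portions)
    if needs_recalculation(leaf_ripeness, max_leaves, ripeness_floor):
        # Assumption: a small, well-established tree makes a deviation more
        # significant, so the score is scaled up and capped at 1.0.
        score = min(1.0, score * boost)
    # Reclassify as normal when the score is below the specified level.
    return ("normal" if score < level else "anomalous"), score


print(verdict(1, 4, [50, 60], level=0.4, max_leaves=5, ripeness_floor=10))
# ('anomalous', 0.5)
print(verdict(1, 4, [1, 2, 3, 4, 5, 6], level=0.4, max_leaves=5, ripeness_floor=10))
# ('normal', 0.25)
```

The same one-portion deviation gives different verdicts in the two calls because only the first tree is both small and made of frequently seen leaves, which is the condition the claim uses to trigger recalculation.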
22. A computer program product for identifying anomalies in a group of network addresses, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a data processor to cause the processor to:
input, with a data processor, a plurality of network addresses;
parse said plurality of network addresses into at least one tree data structure, each tree data structure comprising a plurality of nodes wherein successive nodes in said tree data structure represent successive portions of said network addresses;
assign a respective ripeness score to each of said nodes during said parsing, said respective ripeness score indicating a number of occurrences of each of said nodes in said plurality of network addresses;
build a model of normal behavior from tree data structure nodes assigned respective ripeness scores within a specified range of ripeness scores and excluding from said tree data structure nodes with assigned respective ripeness scores outside said specified range;
for an input network address;
traverse said model of network behavior along said input network address; and
identify, with said data processor, whether said input network address is anomalous based on a deviation of said network address from said traversed model, said deviation being zero when said traversing said model of network behavior along said input network address leads to a leaf node;
calculate, when an anomalous network address is identified, an abnormality score indicating a deviation of said anomalous network address from said model and reclassify said anomalous network address as normal when said abnormality score is below a specified level; and
when said tree data structure comprises less than specified number of leaves and at least some of said leaves have respective ripeness scores greater than a specified ripeness score, recalculate said abnormality score for said identified anomalous network address.
- View Dependent Claims (23)
Specification