Graph based detection of anomalous activity
First Claim
1. A computer-implemented method, comprising:
accessing event data describing a plurality of events associated with a plurality of host devices, individual ones of the plurality of events including an interaction between a first entity and a second entity, the first and second entities including two of:
a host device of the plurality of host devices;
a process that is executable on at least one of the host devices;
or a service that is provided by the plurality of host devices, the service comprising a plurality of executable processes;
generating a graph based at least partially on the event data, the graph relating the plurality of events, wherein the graph includes:
a plurality of vertices, individual ones of the plurality of vertices associated with one of the first or second entities included in the plurality of events; and
a plurality of edges, individual ones of the plurality of edges indicating an event by connecting two of the plurality of vertices corresponding to the first and second entities included in the event, individual ones of the plurality of edges including:
a type attribute indicating a type of the event; and
a timestamp attribute indicating a time when the event occurred;
determining a rarity metric associated with one or more of the plurality of edges, the rarity metric of an edge being based on a number of edges that include a same type attribute as the edge and that connect a same two vertices as the edge;
determining a risk metric of one or more of the plurality of edges, the risk metric of the edge indicating a degree of security risk associated with the event corresponding to the edge;
determining a start vertex associated with an earliest timestamp attribute within a time period;
traversing the graph, beginning from the start vertex, along at least a portion of the plurality of edges, wherein each successive traversed edge includes a timestamp attribute greater than that of a previously traversed edge to identify a subset of the plurality of edges for which the rarity metric satisfies a rarity threshold and the risk metric satisfies a risk threshold;
determining the subset of the plurality of edges for which the rarity metric satisfies the rarity threshold and the risk metric satisfies the risk threshold to indicate anomalous activity;
storing data indicative of the subset of the plurality of edges as anomalous activity data for determining a pattern of anomalous activity within the plurality of host devices; and
performing one or more interdiction operations based on the anomalous activity data, the one or more interdiction operations including suspending a communication with at least one host device indicated by the anomalous activity data.
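As an illustration only, the steps recited in claim 1 might be sketched in Python as follows. The `Edge` class, the `RISK_BY_TYPE` table, the threshold values, the reading of "rare" as a low edge count, and the choice of the earliest edge's source as the start vertex are all assumptions made for this sketch; the claim does not prescribe them, and the interdiction step is omitted.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str          # first entity (host device, process, or service)
    dst: str          # second entity
    type: str         # type attribute of the event
    timestamp: float  # time when the event occurred


# Hypothetical per-type risk scores; the claim gives no concrete values.
RISK_BY_TYPE = {"login": 0.2, "file_read": 0.4, "privilege_change": 0.9}


def rarity_counts(edges):
    """Count edges that share the same type and the same two vertices."""
    return Counter((e.src, e.dst, e.type) for e in edges)


def find_anomalous(edges, max_count=1, min_risk=0.5):
    """Traverse edges in increasing timestamp order and flag rare, risky ones.

    Assumes every edge already falls inside the time period of interest.
    """
    if not edges:
        return []
    counts = rarity_counts(edges)
    outgoing = defaultdict(list)
    for e in edges:
        outgoing[e.src].append(e)

    # Assumed start vertex: the source of the edge with the earliest timestamp.
    start = min(edges, key=lambda e: e.timestamp).src
    flagged, seen = [], set()
    stack = [(start, float("-inf"))]  # (vertex, timestamp of last traversed edge)
    while stack:
        vertex, last_ts = stack.pop()
        for e in outgoing[vertex]:
            # Only follow edges that are strictly later than the previous one.
            if e.timestamp <= last_ts or e in seen:
                continue
            seen.add(e)
            rare = counts[(e.src, e.dst, e.type)] <= max_count
            risky = RISK_BY_TYPE.get(e.type, 0.0) >= min_risk
            if rare and risky:
                flagged.append(e)
            stack.append((e.dst, e.timestamp))
    return sorted(flagged, key=lambda e: e.timestamp)
```

In this sketch an edge that is rare and risky but would require stepping backwards in time is never reached, so it is not flagged.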
Abstract
Techniques are described for graph-based analysis of event data in a computing environment. Event data is collected from host devices, the event data describing events in which devices, processes, or services are accessed in the environment. The event data is arranged into a graph that includes vertices corresponding to devices, processes, or services, and edges that connect pairs of vertices. Each edge may identify an event by connecting two vertices corresponding to two devices, processes, or services included in the event. A rarity metric is determined for each edge, indicating a rarity of events of a particular type connecting two vertices. A risk metric may also be determined for each edge, indicating a security risk associated with the event type or the target of the event. The graph may be traversed according to the risk and rarity metrics, to identify patterns of anomalous activity in the event data.
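The abstract says the risk metric may reflect the event type or the target of the event. One way this could look, purely as a sketch: the score tables below are hypothetical, and taking the maximum of the two scores is an assumption, not something the abstract specifies.

```python
# Hypothetical scores on a 0..1 scale; the abstract gives no concrete values.
TYPE_RISK = {"read": 0.1, "write": 0.4, "credential_access": 0.9}
TARGET_RISK = {"build-server": 0.3, "payments-db": 0.8}


def risk_metric(event_type: str, target: str, default: float = 0.0) -> float:
    """Return the higher of the event-type risk and the target risk.

    Combining the two with max() is an assumption made for this sketch.
    """
    return max(TYPE_RISK.get(event_type, default), TARGET_RISK.get(target, default))
```

Under these assumed tables, a low-risk read against a sensitive target still scores high because the target dominates.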
20 Claims
1. A computer-implemented method, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5
6. A system, comprising:
at least one computing device configured to implement one or more services, wherein the one or more services are configured to:
generate a graph corresponding to a plurality of events, individual ones of the plurality of events including an interaction between entities present in a plurality of host devices, the entities including at least two of: a host device, a process, or a service, the graph comprising:
a plurality of vertices, individual ones of the plurality of vertices associated with one of the entities included in the plurality of events; and
a plurality of edges, individual ones of the plurality of edges indicating an event by connecting two of the plurality of vertices corresponding to the entities included in the event, individual ones of the plurality of edges including a timestamp attribute indicating a time when the event occurred;
determine a rarity metric of one or more of the plurality of edges, the rarity metric of an edge being based at least partly on a number of edges that connect a same two vertices as the edge;
traverse the graph along at least a portion of the plurality of edges in an order in which each successive edge traversed includes the timestamp attribute greater than that of a previously traversed edge to identify a subset of the plurality of edges for which the rarity metric satisfies a rarity threshold;
determine the subset of the plurality of edges for which the rarity metric satisfies the rarity threshold to indicate anomalous activity;
store data indicative of the subset of the plurality of edges as anomalous activity data for determining a pattern of anomalous activity within the plurality of host devices; and
suspend a communication with at least one host device indicated by the anomalous activity data.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14
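Claim 6 bases rarity on how many edges connect the same two vertices and ends by suspending communication with an indicated host. A minimal sketch of those two pieces, assuming edges are `(src, dst, timestamp)` tuples, that direction matters when comparing vertex pairs, and that rarity is the reciprocal of the pair count; the function names are hypothetical and the time-ordered traversal is left out.

```python
from collections import Counter


def pair_rarity(edges):
    """Map each (src, dst) pair to 1 / (number of edges joining that pair)."""
    counts = Counter((src, dst) for src, dst, _ts in edges)
    return {pair: 1.0 / n for pair, n in counts.items()}


def hosts_to_suspend(edges, hosts, rarity_threshold=1.0):
    """Return host devices that sit on an edge whose rarity meets the threshold.

    `hosts` is the set of vertex names that are host devices, since vertices
    may also be processes or services.
    """
    rarity = pair_rarity(edges)
    flagged = set()
    for src, dst, _ts in edges:
        if rarity[(src, dst)] >= rarity_threshold:
            flagged.update(v for v in (src, dst) if v in hosts)
    return flagged
```

The returned set would then be handed to whatever mechanism actually suspends communication, which is outside the scope of this sketch.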
15. One or more non-transitory computer-readable media storing instructions which, when executed by at least one processor, instruct the at least one processor to perform actions comprising:
generating a graph corresponding to a plurality of events, individual ones of the plurality of events including interactions between entities present in a plurality of host devices, the entities including at least two of: a host device, a process, or a service, the graph including:
a plurality of vertices, individual ones of the plurality of vertices associated with one of the entities included in the plurality of events; and
a plurality of edges, individual ones of the plurality of edges indicating an event by connecting two of the plurality of vertices corresponding to the entities included in the event, individual ones of the plurality of edges including a timestamp attribute indicating a time when the event occurred;
determining a risk metric of one or more of the plurality of edges, the risk metric of an edge indicating a degree of security risk associated with the event corresponding to the edge;
traversing the graph along at least a portion of the plurality of edges in an order in which each successive edge traversed includes a timestamp attribute greater than that of a previously traversed edge to identify a subset of the plurality of edges for which the risk metric satisfies a risk threshold;
determining the subset of the plurality of edges for which the risk metric satisfies the risk threshold to indicate anomalous activity;
storing data indicative of the subset of the plurality of edges as anomalous activity data for determining a pattern of anomalous activity within the plurality of host devices; and
performing one or more interdiction operations based on the anomalous activity data, the one or more interdiction operations including suspending communications with at least one host device indicated by the anomalous activity data.
Dependent claims: 16, 17, 18, 19, 20
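The storing step of claim 15 could, for example, serialize the edges that meet the risk threshold so that later analysis can look for patterns. The sketch below assumes each edge is a dictionary with `source`, `target`, `timestamp`, and `risk` keys and that JSON is the storage format; neither is stated in the claim, and the function name is hypothetical.

```python
import json


def anomalous_activity_json(edges, risk_threshold):
    """Serialize edges whose risk meets the threshold, oldest first, as JSON."""
    flagged = [e for e in edges if e["risk"] >= risk_threshold]
    flagged.sort(key=lambda e: e["timestamp"])  # keep the time order of events
    return json.dumps({"risk_threshold": risk_threshold, "edges": flagged}, sort_keys=True)
```

Keeping the threshold alongside the edges records why each edge was included, which may help when thresholds change over time.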
Specification