Profile-aware filtering of network traffic
First Claim
1. A computer-implemented method for sampling flows observed within traffic traversing a communication link, said method comprising:
identifying a set of flows observed traversing said communication link, wherein said set of flows has a plurality of dimensions;
creating a plurality of clusters of flows by grouping together flows that share at least one common dimension;
assigning, to at least a portion of said plurality of clusters of flows, a probability value relating to the volume of flows in a cluster;
selecting a probability threshold and an uncertainty threshold, wherein said probability threshold indicates a probability where clusters above the probability threshold are deemed to be significant, and wherein said uncertainty threshold indicates a target level of uncertainty;
removing from said plurality of clusters one or more clusters that are assigned one or more probability values above said probability threshold, wherein the removed clusters are deemed to be significant clusters;
computing a relative uncertainty value for probability values assigned to the remaining clusters in said plurality of clusters, wherein said relative uncertainty value indicates uniformity or variability in said probability values assigned to said remaining clusters in said plurality of clusters;
until said relative uncertainty value exceeds said uncertainty threshold, iteratively decreasing said probability threshold and removing from said remaining clusters in said plurality of clusters one or more clusters that are assigned a probability value above said probability threshold, wherein the removed clusters are deemed to be significant clusters; and
utilizing said significant clusters to identify one or more clusters exhibiting a rare behavior or one or more clusters exhibiting an anomalous behavior.
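The iterative extraction recited in claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the claim does not define how relative uncertainty is computed, so the sketch assumes normalized Shannon entropy over the remaining cluster sizes; it clusters on a single dimension; and the function names, the default thresholds (0.02 and 0.9), and the halving step `decay` are all hypothetical choices.

```python
import math
from collections import Counter


def relative_uncertainty(cluster_sizes):
    """Normalized Shannon entropy of cluster sizes, in [0, 1].

    Assumption: the claim only says the value reflects uniformity or
    variability; normalized entropy is one measure with that property.
    """
    total = sum(cluster_sizes)
    if total == 0 or len(cluster_sizes) < 2:
        return 1.0  # assumption: nothing meaningful left to separate
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in cluster_sizes if c > 0
    )
    return entropy / math.log2(len(cluster_sizes))


def extract_significant_clusters(flows, dimension, prob_threshold=0.02,
                                 uncertainty_threshold=0.9, decay=0.5):
    """Split clusters on one dimension into (significant, remaining)."""
    # Group flows that share a value in the chosen dimension.
    sizes = Counter(flow[dimension] for flow in flows)
    total = sum(sizes.values())
    remaining = dict(sizes)
    significant = {}
    while True:
        # Remove clusters whose probability exceeds the current threshold.
        for value, count in list(remaining.items()):
            if count / total > prob_threshold:
                significant[value] = remaining.pop(value)
        # Stop once the remaining clusters look close to uniform.
        if not remaining:
            break
        if relative_uncertainty(list(remaining.values())) > uncertainty_threshold:
            break
        # Otherwise lower the threshold (hypothetical schedule) and repeat.
        prob_threshold *= decay
    return significant, remaining
```

Here each flow is assumed to be a dictionary keyed by dimension name (for example a hypothetical `"src_ip"` key). The returned `significant` mapping holds the clusters that would then be examined for rare or anomalous behavior.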
Abstract
A system and a method for profiling traffic on a computer network. Flows are observed traversing a communication link. Relative uncertainty values are computed for the dimensions of these flows. These relative uncertainty values are used to identify dominant feature values in the various flow dimensions. Flows having these dominant feature values are filtered.
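The abstract does not say how a relative uncertainty value is computed. One common formulation, given here only as an assumption for illustration, is Shannon entropy normalized by its maximum. The symbols $m$ (number of observed flows), $m_i$ (number of flows taking the $i$-th value in a dimension) and $N$ (number of distinct observed values) are introduced for this sketch and do not appear in the source.

```latex
p(x_i) = \frac{m_i}{m}, \qquad
H(X) = -\sum_{i=1}^{N} p(x_i) \log p(x_i), \qquad
RU(X) = \frac{H(X)}{H_{\max}(X)} = \frac{H(X)}{\log N}
```

Under this definition, $RU(X)$ close to 1 means the feature values in that dimension are nearly uniformly distributed, while $RU(X)$ close to 0 means a few dominant values account for most flows.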
11 Claims
7. A computer-implemented method for storing flows observed traversing a network link, the method comprising:
utilizing a sampling ratio to select a plurality of flows observed traversing a link on said computer network, wherein said plurality of flows have a plurality of dimensions, wherein the selected flows are stored in a flow table;
computing a relative uncertainty value for probability values assigned to clusters of flows stored in said flow table, wherein said relative uncertainty value indicates uniformity or variability in said probability values assigned to said clusters;
until said uncertainty value exceeds an uncertainty threshold:
(1) removing, from said clusters of flows, clusters whose assigned probability values are above a probability threshold, wherein the removed clusters are deemed to be significant clusters,
(2) re-computing said relative uncertainty value,
(3) comparing said relative uncertainty value to said uncertainty threshold, and
(4) iteratively decreasing said probability threshold when said relative uncertainty value is less than said uncertainty threshold; and
utilizing said significant clusters to identify one or more clusters exhibiting a rare behavior or one or more clusters exhibiting an anomalous behavior.
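The first step of claim 7, selecting flows with a sampling ratio and storing them in a flow table, might look something like the sketch below. It is a minimal illustration under assumptions: the claim does not say how the ratio is applied, so independent random selection per flow is assumed; the flow table is modeled as a plain list of dictionaries; and `sample_into_flow_table`, `cluster_probabilities`, and the `seed` parameter are hypothetical names.

```python
import random
from collections import Counter


def sample_into_flow_table(flows, sampling_ratio, seed=None):
    """Keep each observed flow with probability `sampling_ratio`.

    Assumption: per-flow random selection; the flow table is a list.
    """
    rng = random.Random(seed)
    return [flow for flow in flows if rng.random() < sampling_ratio]


def cluster_probabilities(flow_table, dimension):
    """Probability of each cluster, taken as its share of stored flows."""
    sizes = Counter(flow[dimension] for flow in flow_table)
    total = sum(sizes.values())
    return {value: count / total for value, count in sizes.items()}
```

The resulting probabilities are the inputs over which the relative uncertainty value would be computed and re-computed in steps (1) through (4).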
Specification