Sequence-based anomaly detection using a distance matrix
First Claim
1. An intrusion detection method, comprising the steps of:
- (a) defining a distance matrix that specifies known separations between pairs of events;
(b) comparing the separation of a new event with each of the previous N-1 events using said distance matrix; and
(c) determining based on said comparison whether an anomaly has occurred;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not.
5 Assignments
0 Petitions
Accused Products
Abstract
A real-time sequence-based anomaly detection system is disclosed. In a preferred embodiment, the intrusion detection system is incorporated as part of a software wrapper. Event abstraction in the software wrapper enables the intrusion detection system to apply generically across various computing platforms. Real-time anomaly detection is enabled through the definition of a distance matrix that defines allowable separation distances between pairs of system calls. The distance matrix indirectly specifies known sequences of system calls and can be used to determine whether a sequence of system calls in an event window represents an anomaly. Anomalies that are detected are further analyzed through levenshtein distance calculations that also rely on the contents of the distance matrix.
135 Citations
16 Claims
-
1. An intrusion detection method, comprising the steps of:
-
(a) defining a distance matrix that specifies known separations between pairs of events;
(b) comparing the separation of a new event with each of the previous N-1 events using said distance matrix; and
(c) determining based on said comparison whether an anomaly has occurred;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not. - View Dependent Claims (2)
-
-
3. A computer program product for enabling a processor in a computer system to implement an intrusion detection system, said computer program product comprising:
-
a computer usable medium having computer readable program code means embodied in said medium for causing a program to execute on the computer system, said computer readable program code means comprising;
means for enabling the computer system to define a distance matrix that specifies known separations between pairs of events;
means for enabling the computer system to compare the separation of a new event with each of the previous N−
1 events using said distance matrix; and
means for enabling the computer system to determine based on said comparison whether an anomaly has occurred;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not. - View Dependent Claims (4, 5)
-
-
6. A computer data signal embodied in a transmission medium comprising:
-
computer-readable program code for causing a computer to define a distance matrix that specifies known separations between pairs of events;
computer-readable program code for causing a computer to compare the separation of a new event with each of the previous N-1 events using said distance matrix; and
computer-readable program code for causing a computer to determine based on said comparison whether an anomaly has occurred;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not. - View Dependent Claims (7)
-
-
8. A method for determining the magnitude of an anomaly detected by an intrusion detection system, comprising the steps of:
-
(a) defining a distance matrix that specifies known separations between pairs of events; and
(b) determining a levenshtein distance between an event sequence in an event window and at least one event sequence that is indirectly defined using said distance matrix;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not. - View Dependent Claims (9, 10)
(i) selecting a first event as a test event; and
(ii) determining the values of a levenshtein matrix d(i) using the recursive equation d(i,j)=min (d(i−
1,j)+m, d(i,j−
1)+n, d(i−
1,j−
1)+r(s(i), t(j))), where r(a,b)=0 if a=b, and r(a,b)=p otherwise, wherein said value of r(s(i), t(j)) is determined by accessing said distance matrix using said test event and an event in said event window.
-
-
11. A computer program product for enabling a processor in a computer system to determine the magnitude of an anomaly detected by an intrusion detection system, said computer program product comprising:
-
a computer usable medium having computer readable program code means embodied in said medium for causing a program to execute on the computer system, said computer readable program code means comprising;
first means for enabling the computer system to define a distance matrix that specifies known separations between pairs of events; and
second means for enabling the computer system to determine a levenshtein distance between an event sequence in an event window and at least one event sequence that is indirectly defined using said distance matrix;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not. - View Dependent Claims (12, 13)
means for enabling the computer system to select a first event as a test event; and
means for enabling the computer system to determine the values of a levenshtein matrix d(i) using the recursive equation d(i,j)=min (d(i−
1,j)+m, d(i,j−
1)+n, d(i−
1, j−
1)+r(s(i), t(j))), where r(a,b)=0 if a=b, and r(a,b)=p otherwise, wherein said value of r(s(i), t(j)) is determined by accessing said distance matrix using said test event and an event in said event window.
-
-
14. A computer data signal embodied in a transmission medium comprising:
-
a first computer readable program code for enabling the computer system to define a distance matrix that specifies known separations between pairs of events; and
a second computer readable program code for enabling the computer system to determine a levenshtein distance between an event sequence in an event window and at least one event sequence that is indirectly defined using said distance matrix;
wherein the distance matrix involves a levenshtein distance calculation that counts the differences between two computer-related strings, where the differences are counted when one string has a character whereas the other string does not. - View Dependent Claims (15, 16)
computer readable program code for enabling the computer system to select a first event as a test event; and
computer readable program code for enabling the computer system to determine the values of levenshtein matrix d(i) using the recursive equation d(i,j)=min (d(i−
1,j)+m, d(i,j−
1)+n, d(i−
1,j−
1)+r(s(i), t(j))), where r(a,b)=0 if a=b, and r(a,b)=p otherwise, wherein said value of r(s(i), t(j)) is determined by accessing said distance matrix using said test event and an event in said event window.
-
Specification