Anomaly detection in dynamically evolving data and systems

US 9,843,596 B1
Filed: 07/03/2015
Issued: 12/12/2017
Est. Priority Date: 11/02/2007
Status: Active Grant

Abstract

Detection of abnormalities in multi-dimensional data is performed by processing the multi-dimensional data to obtain a reduced dimension embedding matrix, using the reduced dimension embedding matrix to form a lower dimension (of at least 2D) embedded space, applying an out-of-sample extension procedure in the embedded space to compute coordinates of a newly arrived data point and using the computed coordinates of the newly arrived data point and Euclidean distances to determine whether the newly arrived data point is normal or abnormal.
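The out-of-sample extension step in the abstract can be illustrated with a Nyström-style formula, which is one common way to extend a diffusion-type embedding to a new point. This is a minimal sketch under stated assumptions, not the patented procedure: the Gaussian kernel, the scale `eps`, the power iteration, and every function name below are hypothetical, and for brevity only the first nontrivial eigenvector is computed, whereas the abstract calls for an embedded space of at least two dimensions (the same extension formula would be applied to each coordinate).

```python
import math

def gaussian_weights(x, points, eps):
    """Gaussian kernel weights k(x, x_j) = exp(-||x - x_j||^2 / eps) (assumed kernel)."""
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / eps)
            for y in points]

def first_nontrivial_eigenpair(points, eps, iters=500):
    """Power iteration for the second eigenpair of the Markov matrix P = D^-1 K."""
    K = [gaussian_weights(x, points, eps) for x in points]
    deg = [sum(row) for row in K]
    P = [[k / d for k in row] for row, d in zip(K, deg)]
    total = sum(deg)
    pi = [d / total for d in deg]  # stationary distribution of P

    def dot_pi(u, v):
        # Inner product weighted by pi, in which P is self-adjoint.
        return sum(p * a * b for p, a, b in zip(pi, u, v))

    v = [float(i) for i in range(len(points))]  # arbitrary starting vector
    lam = 0.0
    for _ in range(iters):
        mean = sum(p * a for p, a in zip(pi, v))
        v = [a - mean for a in v]  # remove the constant eigenvector (eigenvalue 1)
        norm = math.sqrt(dot_pi(v, v))
        v = [a / norm for a in v]
        w = [sum(p_ij * a for p_ij, a in zip(row, v)) for row in P]
        lam = dot_pi(v, w)  # Rayleigh quotient estimate of the eigenvalue
        v = w
    norm = math.sqrt(dot_pi(v, v))
    return lam, [a / norm for a in v]

def extend(x, points, eps, lam, psi):
    """Nystrom-style coordinate of a new point x for the eigenpair (lam, psi)."""
    k = gaussian_weights(x, points, eps)
    s = sum(k)  # note: underflows to 0 for points very far from all training data
    return sum(kj * pj for kj, pj in zip(k, psi)) / (s * lam)

def nearest_embedded_distance(coord, psi):
    """Euclidean distance (1-D here) from a new coordinate to the closest embedded point."""
    return min(abs(coord - p) for p in psi)

# Hypothetical training data: two well-separated groups of 1-D points.
train = [(0.0,), (0.1,), (0.2,), (3.0,), (3.1,), (3.2,)]
lam, psi = first_nontrivial_eigenpair(train, eps=1.0)
new_coord = extend((0.15,), train, 1.0, lam, psi)
print(new_coord, nearest_embedded_distance(new_coord, psi))
```

A useful sanity check on this formula is that extending a point already in the training set reproduces its embedded coordinate, because the extension then reduces to one row of the eigenvalue equation. How the resulting distances are turned into a normal/abnormal decision is not specified in the abstract, so the sketch stops at computing them.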

18 Claims

1. A method, comprising:
- a) obtaining a data set of network traffic comprising N-dimensional data points from a traffic analyzer, wherein N>3 and wherein the traffic analyzer is configured to generate a statistics matrix comprising the N-dimensional data points;
  
  b) by a computer,
  
  i. processing the statistics matrix into a Markov kernel matrix,
  
  ii. processing the Markov kernel matrix to obtain processed data points with a dimension r lower than N, wherein the processing includes finding r discriminating eigenvectors by providing i=1, . . . , r eigenvalues and respective associated eigenvectors, generating for each i=2, . . . , r a respective i^th cluster based on the i^th eigenvector and generating other respective clusters based on eigenvectors 1, . . . , i−1, i+1, . . . , r, computing a distance between each respective i^th cluster based on the i^th eigenvector and each of the other respective clusters based on eigenvectors 1, . . . , i−1, i−1, . . . , r, and finding r eigenvalues and associated respective eigenvectors that provide the highest distance, the associated respective eigenvectors that provide the highest distance being the discriminating eigenvectors, wherein the r discriminating eigenvectors thus found form an embedded space in which processed data points with a reduced dimension r form a normal cluster,
  
  iii. detecting an abnormal data point in the processed data points with a dimension r lower than N without relying on a signature of a threat and without use of a threshold, and
  
  iv. blocking the abnormal data point.
  - 2. The method of claim 1, further comprising receiving and processing a newly arrived N-dimensional data point into a newly-arrived data point with reduced dimension r, and wherein the detecting an abnormal data point in the processed data points with reduced dimension r includes determining that the newly-arrived data point with reduced dimension r resides outside the normal cluster.
  - 3. The method of claim 2, wherein the determining that the newly-arrived data point with reduced dimension r resides outside the normal cluster includes applying an out-of-sample extension (OOSE) procedure to the newly arrived N-dimensional data point with reduced dimension r to compute its coordinates in the embedded space and determining whether the newly arrived N-dimensional data point resides outside the normal cluster based on its computed coordinates.
  - 4. The method of claim 3, wherein the determining is based on a histogram of density values of embedded data points computed from the embedding matrix, wherein the histogram of density values is divided into bins of a given size and wherein the abnormal data point is included in a bin with the smallest size.
  - 5. The method of claim 1, wherein the detecting an abnormal data point in the processed data points with a dimension r further includes detecting the abnormal data point without tuning a method parameter.
  - 6. The method of claim 1, further comprising generating an alert on the abnormal data point.
  - 7. The method of claim 6, further comprising visualizing the abnormal data point.
  - 8. The method of claim 7, wherein the visualizing is two-dimensional or three-dimensional.
  - 9. The method of claim 1, further comprising generating an alert on the abnormal data point.
  - 10. The method of claim 9, further comprising visualizing the abnormal data point.
  - 11. The method of claim 10, wherein the visualizing is two-dimensional or three-dimensional.
  - 12. The method of claim 2, further comprising generating an alert on the abnormal data point.
  - 13. The method of claim 12, further comprising visualizing the abnormal data point.
  - 14. The method of claim 13, wherein the visualizing is two-dimensional or three-dimensional.
  - 15. The method of claim 1, wherein the traffic analyzer includes a communications network traffic analyzer.
  - 16. The method of claim 1, wherein the traffic analyzer includes a financial network traffic analyzer.
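
Claim 4 describes a decision based on a histogram of density values of the embedded data points. The following is a rough sketch of one way such a histogram could be built, not the claimed method itself. The neighbor-count density, the `radius` and `n_bins` parameters, and both function names are hypothetical, and reading "a bin with the smallest size" as the bin covering the lowest density values is an assumption made only for illustration.

```python
import math

def density_values(embedded, radius):
    """For each embedded point, count the other points within `radius` (assumed density)."""
    return [
        sum(1 for j, q in enumerate(embedded)
            if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(embedded)
    ]

def lowest_density_bin(densities, n_bins):
    """Indices of points whose density falls in the first (lowest) histogram bin."""
    lo, hi = min(densities), max(densities)
    if hi == lo:
        return []  # every point is equally dense, so none stands out
    width = (hi - lo) / n_bins
    # Clamp the maximum density into the last bin.
    bins = [min(int((d - lo) / width), n_bins - 1) for d in densities]
    return [i for i, b in enumerate(bins) if b == 0]

# Hypothetical 2-D embedded points: a tight cluster plus one isolated point.
embedded = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (5.0, 5.0)]
densities = density_values(embedded, radius=0.5)
suspects = lowest_density_bin(densities, n_bins=4)
print(densities, suspects)
```

Here the isolated point has no neighbors within the radius, so it is the only index placed in the lowest bin.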

17. A system, comprising:
- a) a traffic analyzer for providing a data set of network traffic comprising N-dimensional data points, wherein N>3 and wherein the traffic analyzer is configured to generate a statistics matrix comprising the N-dimensional data points; and
  
  b) a computer program stored on a non-transitory computer readable medium, the computer program dedicated to;
  
  i. process the statistics matrix into a Markov kernel matrix,
  
  ii. process the Markov kernel matrix to obtain processed data points with a reduced dimension r lower than N by forming an embedded space defined by r discriminating eigenvectors, wherein r≦N and wherein the discriminating eigenvectors are selected by providing i=1, . . . , r eigenvalues and respective associated eigenvectors, and, for each i=2, . . . , r, by generating a respective i^th cluster based on the i^th eigenvector and other clusters based on eigenvectors 1, . . . , i−1, i+1, . . . , M, by computing a distance between each i^th cluster and each of the other clusters, and by finding the r eigenvalues and their respective eigenvectors that provide the highest distance to define a normal cluster of processed data points with reduced dimension M,
  
  iii. detect an abnormal data point in the processed data points with reduced dimension r without relying on a signature of a threat and without use of a threshold, and
  
  iv. block the abnormal data point.
  - 18. The system of claim 17, wherein the computer program is further dedicated to process a newly arrived N-dimensional data point into a newly-arrived data point with reduced dimension r, and to detect an abnormal data point in the processed data points with reduced dimension r by determining that the newly-arrived data point with reduced dimension r resides outside the normal cluster.

Current Assignee
ThetaRay Ltd.
Original Assignee
ThetaRay Ltd.
Inventors
Averbuch, Amir; Coifman, Ronald R.; David, Gil
Primary Examiner(s)
Truong, Thanhnga B
Assistant Examiner(s)
Lane, Gregory

Application Number

US14/791,269
Time in Patent Office

893 Days
Field of Search

726/23, 370/252
CPC Class Codes

G06F 21/55   Detecting local intrusion o...

G06F 21/552   involving long-term monitor...

G06F 21/554   involving event detection a...

H04L 63/1416   Event detection, e.g. attac...

H04L 63/145   the attack involving the pr...
