Method and system for learning representations for log data in cybersecurity
Abstract
Disclosed is a data analysis and cybersecurity method, which forms a time-based series of behavioral features and analyzes that series for attack detection, derivation of new features, and/or evaluation of features. Analyzing the time-based series of behavioral features may comprise using a Feed-Forward Neural Network (FFNN) method, a Convolutional Neural Network (CNN) method, a Recurrent Neural Network (RNN) method, a Long Short-Term Memory (LSTM) method, a Principal Component Analysis (PCA) method, a Random Forest pipeline method, and/or an autoencoder method. In one embodiment, the behavioral features of the time-based series comprise human engineered features and/or machine learned features, wherein the method may be used to learn new features from historic features.
17 Claims
1. A cybersecurity method comprising:
forming a time based series of behavioral features comprising human engineered features by extracting at least one behavioral feature from a first set of log data retrieved over a first time segment, and extracting at least one behavioral feature from a second set of log data retrieved over a second time segment;
analyzing the time based series of behavioral features, wherein said analyzing the time based series of behavioral features comprises using a neural network based system, a dimensionality reduction system, random forest system, or combinations thereof, deriving machine learned features from said time based series of behavioral features through said analyzing the time based series of behavioral features; and
detecting an attack or threat to an enterprise or e-commerce system through said analyzing the time based series of behavioral features, wherein said detecting an attack or threat comprises determining behavioral patterns indicative of said attack or threat based on the combination of said human engineered features and said machine learned features, wherein the time based series of behavioral features is formatted into a time-based matrix, wherein each behavioral feature is associated with an entity and a time segment.
Dependent claims: 2, 3, 4, 5, 6, 7
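As an illustration only, the time-based matrix recited in claim 1, in which each behavioral feature is associated with an entity and a time segment, might be sketched as follows. The log record layout, the 60-second segment length, and the two feature names are assumptions made for this example and are not taken from the claims.

```python
from collections import defaultdict

# Hypothetical log records: (timestamp_seconds, entity, event_type).
LOGS = [
    (5, "10.0.0.1", "login_fail"),
    (20, "10.0.0.1", "login_fail"),
    (42, "10.0.0.2", "login_ok"),
    (65, "10.0.0.1", "login_ok"),
    (70, "10.0.0.2", "login_fail"),
]

SEGMENT_SECONDS = 60                      # assumed time segment length
FEATURES = ("n_events", "n_login_fail")   # assumed human engineered features


def build_time_matrix(logs, segment_seconds=SEGMENT_SECONDS):
    """Return {entity: {segment_index: [feature values]}}.

    Each feature value is addressed by an entity and a time segment.
    """
    matrix = defaultdict(lambda: defaultdict(lambda: [0] * len(FEATURES)))
    for timestamp, entity, event in logs:
        segment = timestamp // segment_seconds   # which time segment the record falls in
        row = matrix[entity][segment]
        row[0] += 1                              # n_events
        if event == "login_fail":
            row[1] += 1                          # n_login_fail
    # Convert nested defaultdicts to plain dicts for readability.
    return {entity: dict(segments) for entity, segments in matrix.items()}


matrix = build_time_matrix(LOGS)
for entity, segments in sorted(matrix.items()):
    for segment, values in sorted(segments.items()):
        print(entity, segment, dict(zip(FEATURES, values)))
```

A nested dictionary is used here only to keep the sketch dependency-free; a dense entity by segment by feature array would serve the same purpose.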
8. An apparatus for learning representations of log data for cyber security, the apparatus comprising:
one or more processors;
a system memory coupled to the one or more processors;
one or more non-transitory memory units coupled to the one or more processors; and
features extraction codes, features formatting codes, and data analysis codes stored on the one or more non transitory memory units, that when executed by the one or more processors, are configured to perform a method, comprising;
forming a time based series of behavioral features for multiple entities by extracting behavioral features from log data retrieved over a first time segment, and extracting behavioral features from log data retrieved over a second time segment, wherein said time based series of behavioral features comprises human engineered features associated with said multiple entities; and
analyzing the time based series of behavioral features, wherein said analyzing the time based series of behavioral features comprises using a neural network based system, a dimensionality reduction system, random forest system, or combinations thereof, deriving machine learned features from said time based series of behavioral features through said analyzing the time based series of behavioral features; and
detecting an attack or potential threat to the enterprise or e-commerce system through said analyzing the time based series of behavioral features, wherein said detecting an attack or potential threat comprises determining behavioral patterns indicative of said attack or potential threat based on the combination of said human engineered features and said machine learned features, wherein the features extraction codes are configured to extract the behavioral features by executing an activity tracking module, an activity aggregation module, or a combination thereof, wherein the time based series of behavioral features is formatted into a time based features matrix by formatting and storing the at least one or more features into the time based features matrix, wherein each feature is associated an entity and time segment.
Dependent claims: 9, 10, 11, 12, 13
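Claim 8 names an activity tracking module and an activity aggregation module without defining them in this excerpt. The sketch below shows one possible reading, stated as an assumption: tracking keeps per-entity state across successive log lines (here, session duration between login and logout), while aggregation computes per-entity totals over a segment. The log line layout and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical log lines: (timestamp_seconds, entity, action, bytes_sent).
LOG_LINES = [
    (3, "alice", "login", 0),
    (10, "alice", "download", 500),
    (15, "bob", "login", 0),
    (40, "alice", "logout", 0),
    (50, "bob", "download", 1200),
]


def track_activity(lines):
    """Activity tracking (assumed meaning): follow each entity across log lines
    and record the seconds elapsed between its login and logout."""
    login_at = {}
    session_seconds = defaultdict(int)
    for ts, entity, action, _ in lines:
        if action == "login":
            login_at[entity] = ts
        elif action == "logout" and entity in login_at:
            session_seconds[entity] += ts - login_at.pop(entity)
    return dict(session_seconds)


def aggregate_activity(lines):
    """Activity aggregation (assumed meaning): per-entity totals over the segment."""
    totals = defaultdict(lambda: {"n_lines": 0, "bytes_sent": 0})
    for _, entity, _, nbytes in lines:
        totals[entity]["n_lines"] += 1
        totals[entity]["bytes_sent"] += nbytes
    return {entity: dict(t) for entity, t in totals.items()}


tracked = track_activity(LOG_LINES)
aggregated = aggregate_activity(LOG_LINES)
print(tracked)
print(aggregated)
```

In this reading, the two outputs would both be written into the time based features matrix for the segment covered by `LOG_LINES`.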
14. A cybersecurity method comprising:
extracting at least one behavioral feature from a first set of log data retrieved over a first time segment, and extracting at least one behavioral feature from a second set of log data retrieved over a second time segment;
computing, for multiple entities and over multiple time segments, one or more features from the log lines by activity tracking, activity aggregation, or a combination thereof;
storing the one or more features in a time based series of behavioral features matrix, wherein for each of said entities, a set of features is stored on a per time-segment basis;
analyzing the time-based series of behavioral features matrix using a neural network based system, a dimensionality reduction system, random forest system, or combinations thereof;
deriving machine learned features from said time based series of behavioral features matrix via said analyzing;
detecting a malicious entity by determining behavioral patterns indicative of a malicious status related to said malicious entity based on the combination of the derived machine learned features and said one or more features computed from said log lines.
Dependent claims: 15, 16, 17
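The derive-and-combine steps of claim 14 can be illustrated with a small dimensionality reduction example. This is a minimal sketch under stated assumptions, not the patented implementation: it uses the first principal component (found by power iteration) as the dimensionality reduction system, treats each row's reconstruction error as the derived machine learned feature, and ranks rows by that error as a stand-in for detection. The data values and function names are hypothetical.

```python
import math

# Hypothetical rows of a time-based feature matrix: one row per (entity, time segment);
# the columns are human engineered features.
ROWS = {
    ("a", 0): [1.0, 2.0],
    ("a", 1): [2.0, 4.1],
    ("b", 0): [3.0, 5.9],
    ("b", 1): [4.0, 8.0],
    ("c", 0): [2.5, 9.0],  # deviates from the trend followed by the other rows
}


def first_component(data, iterations=200):
    """Mean and leading eigenvector of the covariance matrix, via power iteration."""
    n, d = len(data), len(data[0])
    mean = [sum(r[j] for r in data) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in data]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def reconstruction_error(row, mean, v):
    """Distance between a row and its projection onto the first component."""
    centered = [x - m for x, m in zip(row, mean)]
    score = sum(c * vi for c, vi in zip(centered, v))
    residual = [c - score * vi for c, vi in zip(centered, v)]
    return math.sqrt(sum(r * r for r in residual))


mean, v = first_component(list(ROWS.values()))

# Combine the human engineered features with the derived feature (assumed combination: concatenation).
combined = {key: row + [reconstruction_error(row, mean, v)] for key, row in ROWS.items()}

# Assumed detection rule: the row with the largest derived feature is the most suspicious.
most_anomalous = max(combined, key=lambda k: combined[k][-1])
print(most_anomalous)
```

In practice the combined feature vectors would more likely be passed to a downstream classifier or scoring model than ranked on a single column; the ranking here only keeps the example short.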
Specification