Class discovery for automated discovery, attribution, analysis, and risk assessment of security threats

US 8,418,249 B1
Filed: 11/10/2011
Issued: 04/09/2013
Est. Priority Date: 11/10/2011
Status: Active Grant

Abstract

A method for profiling network traffic of a network. The method includes obtaining a signature library comprising a plurality of signatures corresponding to a plurality of behavioral models, generating, based on a first pre-determined criterion, a group behavioral model associated with the signature library, wherein the group behavioral model represents a common behavior of a plurality of historical flows identified from the network traffic, wherein each of the plurality of signatures correlates to a subset of the plurality of historical flows, selecting a flow in the network traffic for including in a target flow set, wherein the flow matches the group behavioral model without matching any of the plurality of behavioral models, analyzing the target flow set to generate a new signature, and adding the new signature to the signature library. Further, each behavioral model is generated from a kernel constructed using boosting of decision tree learning methods.
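The selection step in the abstract (keep a flow when it matches the group behavioral model but none of the per-signature behavioral models) can be illustrated with a minimal sketch. The `Flow` and `Model` types, the function name `select_target_flows`, and the toy threshold and port rules are hypothetical and chosen only for illustration; the patent does not prescribe any of them.

```python
from typing import Callable, Dict, Iterable, List

Flow = Dict[str, float]         # hypothetical flow record: feature name -> value
Model = Callable[[Flow], bool]  # hypothetical model: True when the flow matches


def select_target_flows(flows: Iterable[Flow],
                        group_model: Model,
                        signature_models: List[Model]) -> List[Flow]:
    """Collect flows that match the group model but no per-signature model."""
    target: List[Flow] = []
    for flow in flows:
        if group_model(flow) and not any(m(flow) for m in signature_models):
            target.append(flow)
    return target


# Toy data: the group model flags large flows; one known signature covers port 6667.
flows = [
    {"bytes": 100, "port": 80},
    {"bytes": 900, "port": 6667},
    {"bytes": 950, "port": 4444},
]
group_model = lambda f: f["bytes"] > 500
signature_models = [lambda f: f["port"] == 6667]

target_flow_set = select_target_flows(flows, group_model, signature_models)
```

In this toy run only the flow on port 4444 lands in the target flow set, which is the set the abstract says is then analyzed to generate a new signature.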

20 Claims

1. A method for profiling network traffic of a network, comprising:
- obtaining a training dataset having n entries each comprising a plurality of feature values and a ground truth class label, wherein the plurality of feature values correspond to a plurality of features of a historical flow in the network traffic, wherein the historical flow is tagged with the ground truth class label based on data characteristics associated with a corresponding application executing in the network;
  
  constructing a ground truth kernel in a n×n matrix format by self multiplication of a ground truth class label vector, wherein the ground truth class label vector comprises n ground truth class labels each from one of the n entries in the training dataset;
  
  generating n initial boosting weights each corresponding to one of the n entries in the training dataset, wherein each of the n initial boosting weights represents estimated importance of a corresponding one of the n entries;
  
  generating, by a processor of a computer system, a first decision tree from the training dataset based on a decision tree learning algorithm using the n initial boosting weights, wherein the first decision tree maps each entry of the training dataset to a corresponding one in n first predicted class labels based on the plurality of feature values in the each entry, wherein a first predicted class label vector comprises the n first predicted class labels mapped by the first decision tree to the n entries in the training dataset;
  
  adjusting the n initial boosting weights to generate n adjusted boosting weights by comparing corresponding matrix elements between the ground truth kernel and a first kernel constructed by self multiplication of the first predicted class label vector, wherein a first matrix element mismatch increases the importance of the corresponding one of the n entries where the first matrix element mismatch occurs;
  
  generating, by the processor, a second decision tree from the training dataset based on the decision tree learning algorithm using the n adjusted boosting weights, wherein the second decision tree maps the each entry of the training dataset to a second predicted class label based on the plurality of feature values in the each entry, wherein a second predicted class label vector comprises n second predicted class labels mapped by the second decision tree to the n entries in the training dataset;
  
  generating, by the processor, a behavioral model based at least on the first predicted class label vector and the second predicted class label vector; and
  
  determining a class label for a new flow in the network traffic based on whether the new flow matches the behavioral model.
  - 2. The method of claim 1, further comprising:
    - generating, based on a first measure of mismatch between the ground truth kernel and the first kernel, a first boosting weight representing estimated importance of the first predicted class label vector;
      
      generating, based on a second measure of mismatch between the first kernel and a second kernel constructed by self multiplication of the second predicted class label vector, a second boosting weight representing estimated importance of the second predicted class label vector; and
      
      generating a cumulative kernel by summing the first kernel and the second kernel based on the first boosting weight and the second boosting weight;
      
      wherein the behavioral model is generated from the cumulative kernel.
  - 3. The method of claim 1, further comprising:
    - adjusting the n adjusted boosting weights to generate n further adjusted boosting weights by comparing corresponding matrix elements between the first kernel and a second kernel constructed by self multiplication of the second predicted class label vector, wherein a second matrix element mismatch increases the importance of the corresponding one of the n entries where the second matrix element mismatch occurs; and
      
      generating, by the processor, a third decision tree from the training dataset based on the decision tree learning algorithm using the n further adjusted boosting weights, wherein the third decision tree maps the each entry of the training dataset to a third predicted class label based on the plurality of feature values in the each entry, wherein a third predicted class label vector comprises n third predicted class labels mapped by the third decision tree to the n entries in the training dataset,
      
      wherein generating the behavioral model is further based at least on the third predicted class label vector.
  - 4. The method of claim 1, wherein the behavioral model comprises a support vector machine (SVM).
  - 5. The method of claim 1, further comprising:
    - obtaining a signature library comprising a plurality of signatures corresponding to a plurality of behavioral models comprising the behavioral model;
      
      generating, based on a pre-determined criterion, a group behavioral model associated with the signature library, wherein the group behavioral model represents a common behavior of a plurality of historical flows identified from the network traffic, wherein each of the plurality of signatures correlates to a subset of the plurality of historical flows;
      
      selecting a flow in the network traffic for including in a target flow set, wherein the flow matches the group behavioral model without matching any of the plurality of behavioral models;
      
      analyzing the target flow set to generate a new signature; and
      
      adding the new signature to the signature library.
  - 6. The method of claim 5, further comprising:
    - updating the group behavioral model and the target flow set in response to adding the new signature to the signature library.
  - 7. The method of claim 5, wherein each of the plurality of signatures, and a corresponding behavioral model thereof, are associated with a malicious activity, and wherein the group behavioral model comprises a threat model.
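The training loop recited in claim 1 (build a ground truth kernel, fit a tree with the initial boosting weights, compare kernels element by element, raise the weights of mismatched entries, fit a second tree) can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: a depth-1 decision stump stands in for the decision tree learning algorithm, class labels are assumed to be +1/-1, and the exponential reweighting rule in `reweight` is an assumption, since the claim only says that a mismatch increases an entry's importance. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, sign)


def outer(v: Sequence[int]) -> List[List[int]]:
    """Kernel by self multiplication of a label vector: K[i][j] = v[i] * v[j]."""
    return [[a * b for b in v] for a in v]


def fit_stump(X: List[List[float]], y: List[int], w: List[float]) -> Stump:
    """Weighted depth-1 tree (a stand-in for a full decision tree learner)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for s in (1, -1):
                # Weighted error of the rule "predict s if x[f] <= t else -s".
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (s if row[f] <= t else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, s)
    return best[1], best[2], best[3]


def predict(stump: Stump, X: List[List[float]]) -> List[int]:
    f, t, s = stump
    return [s if row[f] <= t else -s for row in X]


def reweight(w: List[float], K_true, K_pred) -> List[float]:
    """Assumed rule: scale w[i] by exp(fraction of mismatched elements in row i)."""
    n = len(w)
    raw = [wi * math.exp(sum(K_true[i][j] != K_pred[i][j] for j in range(n)) / n)
           for i, wi in enumerate(w)]
    total = sum(raw)
    return [r / total for r in raw]


# Toy training dataset: one feature per entry, ground truth labels in {+1, -1}.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [1, 1, -1, 1]
n = len(y)

K_true = outer(y)                     # ground truth kernel (n x n)
w0 = [1.0 / n] * n                    # n initial boosting weights
h1 = predict(fit_stump(X, y, w0), X)  # first predicted class label vector
w1 = reweight(w0, K_true, outer(h1))  # n adjusted boosting weights
h2 = predict(fit_stump(X, y, w1), X)  # second predicted class label vector
```

In this toy run the first stump mislabels only the last entry, so the row of the first kernel belonging to that entry disagrees with the ground truth kernel in more positions than any other row, and that entry receives the largest adjusted weight before the second stump is fitted.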

8. A system for profiling network traffic of a network, comprising:
- a processor;
  
  memory storing instructions, when executed by the processor comprising functionality for:
  
  obtaining a training dataset having n entries each comprising a plurality of feature values and a ground truth class label, wherein the plurality of feature values correspond to a plurality of features of a historical flow in the network traffic, wherein the historical flow is tagged with the ground truth class label based on data characteristics associated with a corresponding application executing in the network;
  
  constructing a ground truth kernel in a n×n matrix format by self multiplication of a ground truth class label vector, wherein the ground truth class label vector comprises n ground truth class labels each from one of the n entries in the training dataset;
  
  generating n initial boosting weights each corresponding to one of the n entries in the training dataset, wherein each of the n initial boosting weights represents estimated importance of a corresponding one of the n entries;
  
  generating a first decision tree from the training dataset based on a decision tree learning algorithm using the n initial boosting weights, wherein the first decision tree maps each entry of the training dataset to a corresponding one in n first predicted class labels based on the plurality of feature values in the each entry, wherein a first predicted class label vector comprises the n first predicted class labels mapped by the first decision tree to the n entries in the training dataset;
  
  adjusting the n initial boosting weights to generate n adjusted boosting weights by comparing corresponding matrix elements between the ground truth kernel and a first kernel constructed by self multiplication of the first predicted class label vector, wherein a first matrix element mismatch increases the importance of the corresponding one of the n entries where the first matrix element mismatch occurs;
  
  generating a second decision tree from the training dataset based on the decision tree learning algorithm using the n adjusted boosting weights, wherein the second decision tree maps the each entry of the training dataset to a second predicted class label based on the plurality of feature values in the each entry, wherein a second predicted class label vector comprises n second predicted class labels mapped by the second decision tree to the n entries in the training dataset;
  
  generating a behavioral model based at least on the first predicted class label vector and the second predicted class label vector; and
  
  determining a class label for a new flow in the network traffic based on whether the new flow matches the behavioral model.
  - 9. The system of claim 8, the instruction when executed by the processor further comprising functionality for:
    - generating, based on a first measure of mismatch between the ground truth kernel and the first kernel, a first boosting weight representing estimated importance of the first predicted class label vector;
      
      generating, based on a second measure of mismatch between the first kernel and a second kernel constructed by self multiplication of the second predicted class label vector, a second boosting weight representing estimated importance of the second predicted class label vector; and
      
      generating a cumulative kernel by summing the first kernel and the second kernel based on the first boosting weight and the second boosting weight;
      
      wherein the behavioral model is generated from the cumulative kernel.
  - 10. The system of claim 8, the instruction when executed by the processor further comprising functionality for:
    - adjusting the n adjusted boosting weights to generate n further adjusted boosting weights by comparing corresponding matrix elements between the first kernel and a second kernel constructed by self multiplication of the second predicted class label vector, wherein a second matrix element mismatch increases the importance of the corresponding one of the n entries where the second matrix element mismatch occurs; and
      
      generating, by the processor, a third decision tree from the training dataset based on the decision tree learning algorithm using the n further adjusted boosting weights, wherein the third decision tree maps the each entry of the training dataset to a third predicted class label based on the plurality of feature values in the each entry, wherein a third predicted class label vector comprises n third predicted class labels mapped by the third decision tree to the n entries in the training dataset,
      
      wherein generating the behavioral model is further based at least on the third predicted class label vector.
  - 11. The system of claim 8, wherein the behavioral model comprises a support vector machine (SVM).
  - 12. The system of claim 8, a signature library comprising a plurality of signatures corresponding to a plurality of behavioral models comprising the behavioral model;
    - a statistical model generator executing on the processor and configured to generate, based on a pre-determined criterion, a group behavioral model associated with the signature library, wherein the group behavioral model represents a common behavior of a plurality of historical flows identified from the network traffic, wherein each of the plurality of signatures correlates to a subset of the plurality of historical flows;
      
      a statistical classifier executing on the processor and configured to select a flow in the network traffic for including in a target flow set, wherein the flow matches the group behavioral model without matching any of the plurality of behavioral models; and
      
      a signature generator executing on the processor and configured to:
      
      analyze the target flow set to generate a new signature; and
      
      add the new signature to the signature library.
  - 13. The system of claim 12, wherein the statistical model generator is further configured to update the group behavioral model in response to adding the new signature to the signature library, and wherein the statistical classifier is further configured to update the target flow set in response to adding the new signature to the signature library.
  - 14. The system of claim 12, wherein each first data characteristics of the plurality of signatures is associated with a malicious activity generated by the corresponding application, and wherein the group behavioral model comprises a threat model.

15. A non-transitory computer readable medium, embodying instructions to profile network traffic of a network, the instructions when executed by the computer comprising functionality for:
- obtaining a training dataset having n entries each comprising a plurality of feature values and a ground truth class label, wherein the plurality of feature values correspond to a plurality of features of a historical flow in the network traffic, wherein the historical flow is tagged with the ground truth class label based on data characteristics associated with a corresponding application executing in the network;
  
  constructing a ground truth kernel in a n×n matrix format by self multiplication of a ground truth class label vector, wherein the ground truth class label vector comprises n ground truth class labels each from one of the n entries in the training dataset;
  
  generating n initial boosting weights each corresponding to one of the n entries in the training dataset, wherein each of the n initial boosting weights represents estimated importance of a corresponding one of the n entries;
  
  generating a first decision tree from the training dataset based on a decision tree learning algorithm using the n initial boosting weights, wherein the first decision tree maps each entry of the training dataset to a corresponding one in n first predicted class labels based on the plurality of feature values in the each entry, wherein a first predicted class label vector comprises the n first predicted class labels mapped by the first decision tree to the n entries in the training dataset;
  
  adjusting the n initial boosting weights to generate n adjusted boosting weights by comparing corresponding matrix elements between the ground truth kernel and a first kernel constructed by self multiplication of the first predicted class label vector, wherein a first matrix element mismatch increases the importance of the corresponding one of the n entries where the first matrix element mismatch occurs;
  
  generating a second decision tree from the training dataset based on the decision tree learning algorithm using the n adjusted boosting weights, wherein the second decision tree maps the each entry of the training dataset to a second predicted class label based on the plurality of feature values in the each entry, wherein a second predicted class label vector comprises n second predicted class labels mapped by the second decision tree to the n entries in the training dataset;
  
  generating a behavioral model based at least on the first predicted class label vector and the second predicted class label vector; and
  
  determining a class label for a new flow in the network traffic based on whether the new flow matches the behavioral model.
  - 16. The non-transitory computer readable medium of claim 15, the instructions when executed by the computer further comprising functionality for:
    - generating, based on a first measure of mismatch between the ground truth kernel and the first kernel, a first boosting weight representing estimated importance of the first predicted class label vector;
      
      generating, based on a second measure of mismatch between the first kernel and a second kernel constructed by self multiplication of the second predicted class label vector, a second boosting weight representing estimated importance of the second predicted class label vector; and
      
      generating a cumulative kernel by summing the first kernel and the second kernel based on the first boosting weight and the second boosting weight;
      
      wherein the behavioral model is generated from the cumulative kernel.
  - 17. The non-transitory computer readable medium of claim 15, the instructions when executed by the computer further comprising functionality for:
    - adjusting the n adjusted boosting weights to generate n further adjusted boosting weights by comparing corresponding matrix elements between the first kernel and a second kernel constructed by self multiplication of the second predicted class label vector, wherein a second matrix element mismatch increases the importance of the corresponding one of the n entries where the second matrix element mismatch occurs; and
      
      generating a third decision tree from the training dataset based on the decision tree learning algorithm using the n further adjusted boosting weights, wherein the third decision tree maps the each entry of the training dataset to a third predicted class label based on the plurality of feature values in the each entry, wherein a third predicted class label vector comprises n third predicted class labels mapped by the third decision tree to the n entries in the training dataset,
      
      wherein generating the behavioral model is further based at least on the third predicted class label vector.
  - 18. The non-transitory computer readable medium of claim 15, wherein the behavioral model comprises a support vector machine (SVM).
  - 19. The non-transitory computer readable medium of claim 15, the instructions when executed by the computer further comprising functionality for:
    - obtaining a signature library comprising a plurality of signatures corresponding to a plurality of behavioral models comprising the behavioral model;
      
      generating, based on a pre-determined criterion, a group behavioral model associated with the signature library, wherein the group behavioral model represents a common behavior of a plurality of historical flows identified from the network traffic, wherein each of the plurality of signatures correlates to a subset of the plurality of historical flows;
      
      selecting a flow in the network traffic for including in a target flow set, wherein the flow matches the group behavioral model without matching any of the plurality of behavioral models;
      
      analyzing the target flow set to generate a new signature;
      
      adding the new signature to the signature library; and
      
      updating the group behavioral model and the target flow set in response to adding the new signature to the signature library.
  - 20. The non-transitory computer readable medium of claim 19, the instructions when executed by the computer further comprising functionality for:
    - wherein each of the plurality of signatures, and a corresponding behavioral model thereof, are associated with a malicious activity, and wherein the group behavioral model comprises a threat model.
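The cumulative kernel of claims 2, 9, and 16 (a sum of the per-tree kernels weighted by per-tree boosting weights derived from kernel mismatch) can be sketched in the same spirit. The claims do not give a formula for turning a mismatch measure into a boosting weight, so the AdaBoost-style expression in `model_weight`, the use of the mismatched-element fraction as the mismatch measure, and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def outer(v: Sequence[int]) -> Matrix:
    """Kernel by self multiplication of a label vector: K[i][j] = v[i] * v[j]."""
    return [[float(a * b) for b in v] for a in v]


def mismatch(A: Matrix, B: Matrix) -> float:
    """Assumed mismatch measure: fraction of matrix elements that differ."""
    n = len(A)
    return sum(A[i][j] != B[i][j] for i in range(n) for j in range(n)) / (n * n)


def model_weight(eps: float, floor: float = 1e-6) -> float:
    """Assumed AdaBoost-style weight: larger when the mismatch is smaller."""
    eps = min(max(eps, floor), 1.0 - floor)  # keep the logarithm finite
    return 0.5 * math.log((1.0 - eps) / eps)


def cumulative_kernel(kernels: List[Matrix], weights: List[float]) -> Matrix:
    """Element-wise weighted sum of the per-tree kernels."""
    n = len(kernels[0])
    return [[sum(a * K[i][j] for a, K in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]


# Toy label vectors: ground truth and two hypothetical tree outputs.
y = [1, 1, -1, 1]
h1 = [1, 1, -1, -1]
h2 = [1, 1, -1, 1]

K_true, K1, K2 = outer(y), outer(h1), outer(h2)
a1 = model_weight(mismatch(K_true, K1))  # first boosting weight (truth vs. first kernel)
a2 = model_weight(mismatch(K1, K2))      # second boosting weight (first vs. second kernel)
K_cum = cumulative_kernel([K1, K2], [a1, a2])
```

A matrix such as `K_cum` is the kind of object a kernel method could consume directly, which is consistent with claims 4, 11, and 18 reciting a support vector machine as the behavioral model; how the patent actually feeds the kernel to the SVM is not stated in the claims.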

Current Assignee: The Boeing Co.
Original Assignee: Narus, Inc. (Gen Digital Inc.)
Inventors: Nucci, Antonio; Comar, Prakash Mandayam; Liu, Lei; Saha, Sabyasachi
Primary Examiner(s): Colin, Carl
Assistant Examiner(s): Zaidi, Syed A
Application Number: US13/293,986
Time in Patent Office: 516 Days
Field of Search: 726/23
US Class Current: 726/23
CPC Class Codes:
G06F 21/552   involving long-term monitor...
G06F 21/564   by virus signature recognition
G06F 21/577   Assessing vulnerabilities a...
H04L 63/1416   Event detection, e.g. attac...
