Malware detection via reputation system

US 8,719,939 B2
Filed: 01/26/2010
Issued: 05/06/2014
Est. Priority Date: 12/31/2009
Status: Active Grant

Abstract

A computer network device receives a digital file and extracts a plurality of high level features from the file. The plurality of high level features are evaluated using a classifier to determine whether the file is benign or malicious. The file is forwarded to a requesting computer if the file is determined to be benign, and blocked if the file is determined to be malicious.
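
The flow in the abstract (extract features, classify, then forward or block) can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the three features, the use of Shannon byte entropy as a stand-in for "randomness", the thresholds, and the names `byte_entropy`, `extract_features`, `classify`, and `filter_file` are all hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extract_features(data: bytes) -> dict:
    """Hypothetical high-level features; a real system would likely use more."""
    return {
        "size": len(data),
        "entropy": byte_entropy(data),
        # Assumption: treat the Windows executable magic "MZ" as a starting code string.
        "starts_with_mz": data[:2] == b"MZ",
    }


def classify(features: dict) -> str:
    """Toy hand-written decision tree with made-up thresholds."""
    if features["starts_with_mz"] and features["entropy"] > 7.0:
        return "malicious"  # assumption: a high-entropy executable looks packed
    return "benign"


def filter_file(data: bytes):
    """Return the file if judged benign, or None to signal that it is blocked."""
    verdict = classify(extract_features(data))
    return data if verdict == "benign" else None
```

A caller at a gateway would forward whatever `filter_file` returns and drop the request when it returns `None`.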

25 Claims

1. A method of filtering digital electronic content, comprising:
- accessing a digital file;
  
  extracting a plurality of high level features from the digital file;
  
  evaluating the plurality of high level features using a classifier on a first computer system to make an initial determination of whether the digital file is benign or malicious, the classifier on the first computer system using a first classification model;
  
  sending a hash of the digital file over a network to a reputation server computerized system for the reputation server to make a secondary determination of whether the digital file is benign or malicious, the secondary determination using a second classification model, wherein the reputation server tracks one or more characteristics of the hash of the digital file, the one or more characteristics comprising query volume per hash, time since first appearance of the hash, number of clients querying the hash, and distribution of clients querying the hash; and
  
  receiving at the first computer system from the reputation server an indication of the secondary determination, wherein the secondary determination is made after the initial determination, wherein the first classification model has a higher false positive rate than the second classification model.
- - 2. The method of filtering digital electronic content of claim 1, wherein the reputation server secondary determination as to whether the digital file is benign or malicious comprises at least one of determining the digital file is malicious if the hash matches a known malicious file, and determining the digital file is malicious if the hash does not match a known benign file.
  - 3. The method of filtering digital electronic content of claim 1, wherein the classifier comprises one or more decision trees.
  - 4. The method of filtering digital electronic content of claim 1, wherein the plurality of high level features comprise at least one of file size, file randomness, starting code string, and file geometry.
  - 5. The method of filtering digital electronic content of claim 1, further comprising evaluating the digital file using behavioral data extracted from run-time properties of the digital file to determine whether the digital file is benign or malicious.
  - 6. The method of filtering digital electronic content of claim 1, wherein evaluating comprises determining at least one of libraries or resources used by the digital file.
  - 7. The method of filtering digital electronic content of claim 1, wherein at least one of the extracting and evaluating is implemented in one or more of a client computer, a gateway device, a backend server, and a real-time in-the-cloud classification system.
  - 8. The method of filtering digital electronic content of claim 1, further comprising forwarding the digital file to a requesting computer if the digital file is determined to be benign, and blocking delivery of the digital file if the digital file is determined to be malicious.
  - 9. The method of filtering digital electronic content of claim 1, wherein evaluating comprises forwarding high level features of the digital file to the reputation server for the secondary determination and blocking only files determined malicious by the reputation server.
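
The two-stage check in claims 1, 2, and 9 might look roughly like the sketch below. Several things here are assumptions made for illustration: SHA-256 as the hash (the claims only say "a hash"), an in-process `ReputationServer` class standing in for the networked reputation server, a trivial `initial_determination` placeholder for the local classifier, and the choice to consult the server only when the local classifier flags a file (the claims do not say when the hash is sent, only that the secondary determination follows the initial one).

```python
import hashlib


class ReputationServer:
    """In-process stand-in for the remote reputation server (hypothetical)."""

    def __init__(self, known_malicious=(), known_benign=(), strict=False):
        self.known_malicious = set(known_malicious)
        self.known_benign = set(known_benign)
        # strict=True also applies the second alternative in claim 2:
        # a hash that matches no known benign file is treated as malicious.
        self.strict = strict

    def lookup(self, file_hash: str) -> str:
        if file_hash in self.known_malicious:
            return "malicious"
        if self.strict and file_hash not in self.known_benign:
            return "malicious"
        return "benign"


def initial_determination(data: bytes) -> str:
    """Placeholder local classifier; deliberately over-eager (high false positives)."""
    return "malicious" if data[:2] == b"MZ" else "benign"


def filter_with_reputation(data: bytes, server: ReputationServer) -> str:
    """Return "forward" or "block", blocking only on the server's verdict."""
    if initial_determination(data) == "benign":
        return "forward"
    # Only the hash leaves the first computer system, not the file itself.
    file_hash = hashlib.sha256(data).hexdigest()
    secondary = server.lookup(file_hash)
    return "block" if secondary == "malicious" else "forward"
```

Because the local model is allowed a higher false positive rate, a file it flags is blocked only when the second, more conservative lookup agrees.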

10. A computer network device, comprising:
- a network connection operable to access a digital file;
  
  an extraction module operable to extract a plurality of high level features from the digital file; and
  
  an evaluation module operable to evaluate the plurality of high level features using a classifier to make an initial determination of whether the digital file is benign or malicious, the classifier using a first classification model;
  
  a transmission function operable to send a hash of the digital file over the network connection to a reputation server computerized system for the reputation server to make a secondary determination of whether the digital file is benign or malicious, the secondary determination using a second classification model, wherein the reputation server tracks one or more characteristics of the hash of the digital file, the one or more characteristics comprising query volume per hash, time since first appearance of the hash, number of clients querying the hash, and distribution of clients querying the hash; and
  
  a reception function operable to receive from the reputation server an indication of the secondary determination, wherein the secondary determination is made after the initial determination, wherein the first classification model has a higher false positive rate than the second classification model.
- - 11. The computer network device of claim 10, wherein the classifier comprises a decision tree.
  - 12. The computer network device of claim 10, wherein the plurality of high level features comprise at least one of file size, file randomness, starting code string, and file geometry.
  - 13. The computer network device of claim 10, wherein the evaluation module further operable to evaluate the binary file using behavioral data extracted from run-time properties of the digital file to determine whether the digital file is benign or malicious.
  - 14. The computer network device of claim 10, wherein the evaluation module is further operable to determine at least one of libraries or resources used by the digital file.
  - 15. The computer network device of claim 10, wherein the device comprises one or more of a client computer, a gateway device, a backend server, and a real-time cloud classification system.
  - 16. The computer network device of claim 10, wherein the evaluation module is further operable to forward the digital file to a requesting computer if the digital file is determined to be benign, and to block file delivery if the digital file is determined to be malicious.
  - 17. The computer network device of claim 10, wherein the evaluation module is further operable to:
    - block only digital files determined malicious by the reputation server.

18. A non-transitory machine-readable medium with instructions stored thereon, the instructions when executed operable to cause a computerized system to:
- access a digital file;
  
  extract a plurality of high level features from the digital file; and
  
  evaluate the plurality of high level features using a classifier on a first computer system to make an initial determination of whether the digital file is benign or malicious, the classifier on the first computer system using a first classification model;
  
  send a hash of the digital file over a network to a reputation server computerized system for the reputation server to make a secondary determination of whether the digital file is benign or malicious, the secondary determination using a second classification model, wherein the reputation server tracks one or more characteristics of the hash of the digital file, the one or more characteristics comprising query volume per hash, time since first appearance of the hash, number of clients querying the hash, and distribution of clients querying the hash; and
  
  receive at the first computer system from the reputation server an indication of the secondary determination, wherein the secondary determination is made after the initial determination, wherein the first classification model has a higher false positive rate than the second classification model.
- - 19. The non-transitory machine-readable medium of claim 18, wherein the classifier comprises a decision tree.
  - 20. The non-transitory machine-readable medium of claim 18, wherein the plurality of high level features comprise at least one of file size, file randomness, starting code string, and file geometry.
  - 21. The non-transitory machine-readable medium of claim 18, wherein the instructions when executed further operable to evaluate the digital file using behavioral data extracted from runtime properties of the digital file to determine whether the digital file is benign or malicious.
  - 22. The non-transitory machine-readable medium of claim 18, wherein evaluating comprises determining at least one of libraries or resources used by the digital file.
  - 23. The non-transitory machine-readable medium of claim 18, wherein at least one of the extracting and evaluating is implemented in one or more of a client computer, a gateway device, a backend server, and a real-time cloud classification system.
  - 24. The non-transitory machine-readable medium of claim 18, wherein the instructions when executed further operable to forward the digital file to a requesting computer if the digital file is determined to be benign, and blocking file delivery if the digital file is determined to be malicious.
  - 25. The non-transitory machine-readable medium of claim 18, wherein evaluating the plurality of high level features comprises blocking only those digital files determined malicious by the reputation server.
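
The independent claims recite four per-hash characteristics tracked by the reputation server: query volume, time since first appearance, number of querying clients, and distribution of querying clients. A minimal bookkeeping sketch follows. The class name `HashTracker`, its methods, and the in-memory storage are hypothetical, and "distribution" is interpreted here as each client's share of the queries, which is only one possible reading of the claim language.

```python
import time
from collections import Counter, defaultdict


class HashTracker:
    """Hypothetical in-memory record of queries received per file hash."""

    def __init__(self):
        self.first_seen = {}                 # hash -> timestamp of first query
        self.query_count = Counter()         # hash -> total number of queries
        self.clients = defaultdict(Counter)  # hash -> client id -> query count

    def record_query(self, file_hash, client_id, now=None):
        now = time.time() if now is None else now
        self.first_seen.setdefault(file_hash, now)
        self.query_count[file_hash] += 1
        self.clients[file_hash][client_id] += 1

    def characteristics(self, file_hash, now=None):
        """Summarize the four tracked characteristics for one hash."""
        now = time.time() if now is None else now
        first = self.first_seen.get(file_hash)
        per_client = self.clients.get(file_hash, Counter())
        total = sum(per_client.values())
        return {
            "query_volume": self.query_count[file_hash],
            "age_seconds": None if first is None else now - first,
            "num_clients": len(per_client),
            # Assumption: distribution = fraction of queries from each client.
            "client_distribution": {c: n / total for c, n in per_client.items()},
        }
```

A second classification model on the server could take the dictionary returned by `characteristics` as its input; how the patent's model actually weighs these values is not stated in the claims.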

Current Assignee: McAfee, LLC
Original Assignee: McAfee, Inc. (McAfee, LLC)
Inventors: Krasser, Sven; Tang, Yuchun; He, Yuanchen; Zhong, Zhenyu
Primary Examiner(s): Zee, Edward
Application Number: US12/693,765
Publication Number: US 20110162070A1
Time in Patent Office: 1,561 Days
Field of Search: 726/24-25, 726/22
US Class Current: 726/24
CPC Class Codes:
G06F 21/56 Computer malware detection ...
G06F 21/564 by virus signature recognition
