Apparatus method and medium for detecting payload anomaly using N-gram distribution of normal data
Abstract
A method, apparatus and medium are provided for detecting anomalous payloads transmitted through a network. The system receives payloads within the network and determines a length for data contained in each payload. A statistical distribution is generated for data contained in each payload received within the network, and compared to a selected model distribution representative of normal payloads transmitted through the network. The model payload can be selected such that it has a predetermined length range that encompasses the length for data contained in the received payload. Anomalous payloads are then identified based on differences detected between the statistical distribution of received payloads and the model distribution. The system can also provide for automatic training and incremental updating of models.
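The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `LengthBinModel` and `is_anomalous`, the use of a 1-gram (single byte) frequency distribution, the Manhattan distance, and the choice to flag payloads with no matching length range are all hypothetical choices made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


def byte_distribution(payload: bytes) -> list[float]:
    """Relative frequency of each of the 256 byte values (1-gram distribution)."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)
    total = len(payload)
    return [counts.get(value, 0) / total for value in range(256)]


@dataclass
class LengthBinModel:
    """Hypothetical model of normal payloads with length in [min_len, max_len]."""
    min_len: int
    max_len: int
    mean: list[float] = field(default_factory=lambda: [0.0] * 256)
    samples: int = 0

    def covers(self, length: int) -> bool:
        return self.min_len <= length <= self.max_len

    def update(self, payload: bytes) -> None:
        """Incrementally fold one normal payload into the running mean."""
        dist = byte_distribution(payload)
        self.samples += 1
        self.mean = [m + (x - m) / self.samples for m, x in zip(self.mean, dist)]

    def distance(self, payload: bytes) -> float:
        """Manhattan distance to the model mean (an assumed metric)."""
        dist = byte_distribution(payload)
        return sum(abs(x - m) for x, m in zip(dist, self.mean))


def is_anomalous(payload: bytes, models: list[LengthBinModel], threshold: float) -> bool:
    """Select the model whose length range covers the payload, then compare."""
    for model in models:
        if model.covers(len(payload)):
            return model.distance(payload) > threshold
    # Assumption: a payload with no matching length range is flagged.
    return True
```

Keeping one model per length range means a short request is compared only against other short requests, and the running-mean update lets a model absorb new normal traffic without retraining from scratch.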
19 Claims
1. A method for verifying a file type, the method comprising:
receiving, using a hardware processor, a file identified as corresponding to a first file type from a first source;
generating a byte value statistical distribution of the data included in the file received from the first source;
selecting a model byte value statistical distribution representative of the first file type from model byte value statistical distributions representative of a plurality of file types;
determining a distance metric between the byte value statistical distribution of the data included in the file and the selected model byte value statistical distribution; and
verifying that a file type of the received file is the first file type based on a comparison of the distance metric to a distance metric threshold indicating that the file type of received file is the first file type.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system for verifying a file type, the system comprising:
a hardware processor that is programmed to:
receive a file identified as corresponding to a first file type from a first source;
generate a byte value statistical distribution of the data included in the file received from the first source;
select a model byte value statistical distribution representative of the first file type from model byte value statistical distributions representative of a plurality of file types;
determine a distance metric between the byte value statistical distribution of the data included in the file and the selected model byte value statistical distribution; and
verify that a file type of the received file is the first file type based on a comparison of the distance metric to a distance metric threshold indicating that the file type of received file is the first file type.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A non-transitory computer-readable medium containing instructions that, when executed by a processor, cause the processor to perform a method for verifying a file type, the method comprising:
receiving a file identified as corresponding to a first file type from a first source;
generating a byte value statistical distribution of the data included in the file received from the first source;
selecting a model byte value statistical distribution representative of the first file type from model byte value statistical distributions representative of a plurality of file types;
determining a distance metric between the byte value statistical distribution of the data included in the file and the selected model byte value statistical distribution; and
verifying that a file type of the received file is the first file type based on a comparison of the distance metric to a distance metric threshold indicating that the file type of received file is the first file type.
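The steps recited in the independent claims (generate a byte value distribution, select the model for the claimed type, compute a distance, compare to a threshold) might be sketched as below. This is only an illustrative reading under assumptions: `build_model`, `verify_file_type`, the averaging used to form a model, and the Manhattan distance are hypothetical choices, and the claims do not fix any particular metric or threshold value.

```python
def byte_distribution(data: bytes) -> list[float]:
    """Normalized histogram over the 256 possible byte values."""
    counts = [0] * 256
    for value in data:
        counts[value] += 1
    total = len(data)
    return [c / total for c in counts] if total else [0.0] * 256


def build_model(samples: list[bytes]) -> list[float]:
    """Average the distributions of known files of one type (assumed training step)."""
    if not samples:
        raise ValueError("at least one sample file is required")
    dists = [byte_distribution(s) for s in samples]
    return [sum(column) / len(dists) for column in zip(*dists)]


def verify_file_type(
    data: bytes,
    claimed_type: str,
    models: dict[str, list[float]],
    threshold: float,
) -> bool:
    """Return True if the file's distribution is close enough to the claimed type's model."""
    model = models[claimed_type]          # select the model for the claimed type
    observed = byte_distribution(data)    # distribution of the received file
    distance = sum(abs(o - m) for o, m in zip(observed, model))
    return distance <= threshold          # compare against the threshold
```

For example, a file labeled as text whose bytes are spread evenly over all 256 values would sit far from a model built from text samples and fail verification, while a genuine text file would fall within the threshold.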
Specification