Internet traffic classification via time-frequency analysis
Abstract
Concepts and technologies disclosed herein are directed to internet traffic classification via time-frequency analysis. According to one aspect of the concepts and technologies disclosed herein, a security classification scheme can be implemented to distinguish potentially malicious activities from normal internet traffic. The security classification scheme can exploit the distinctive characteristics of different types of traffic in both the frequency domain and the time domain to identify four different cases. Because different types of traffic are separated, the security classification scheme can lower the false alarm rate and improve network security. The security classification scheme can utilize a recursive discrete Fourier transform (“DFT”) implementation to enhance computational efficiency. The security classification scheme can be deployed for real-time network traffic monitoring due to an efficient streaming design and can be effectively used to detect and predict when and where suspicious activities occur within a monitored network.
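The abstract credits a recursive DFT with the scheme's streaming efficiency. The sketch below shows one common recursive formulation, the sliding DFT, in which each new sample updates every frequency bin in O(N) rather than recomputing the whole transform. Whether this matches the formulation in the specification is an assumption; the function names `direct_dft` and `slide_dft` and the sample values are hypothetical.

```python
import cmath


def direct_dft(window):
    """Direct O(N^2) DFT of one window; used to initialize and to cross-check."""
    n = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(window))
        for k in range(n)
    ]


def slide_dft(spectrum, oldest, newest):
    """Advance the window by one sample and update all bins in O(N).

    Dropping the oldest sample and appending the newest gives
    X'[k] = (X[k] - oldest + newest) * exp(+2*pi*i*k / N).
    """
    n = len(spectrum)
    delta = newest - oldest
    return [
        (spectrum[k] + delta) * cmath.exp(2j * cmath.pi * k / n)
        for k in range(n)
    ]


# Hypothetical per-interval traffic volumes, window length N = 8.
samples = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
N = 8
spectrum = direct_dft(samples[:N])
for i in range(N, len(samples)):
    spectrum = slide_dft(spectrum, samples[i - N], samples[i])

# The recursively updated spectrum should agree with a direct DFT of the last window.
expected = direct_dft(samples[-N:])
```

The recursion trades a small amount of accumulated floating-point error for the per-sample speedup, so a long-running monitor might periodically re-initialize from a direct transform.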
18 Claims
1. An Internet traffic classification system comprising:
- a processor; and
- a memory comprising instructions that, when executed by the processor, cause the processor to perform operations comprising
  - receiving an internet traffic sequence comprising non-malicious data packets and malicious data packets,
  - extracting, from the internet traffic sequence, a plurality of consecutive samples to be used for classification of the internet traffic sequence,
  - converting the plurality of consecutive samples of the internet traffic sequence from a time domain to a frequency domain via a recursive discrete Fourier transform,
  - determining whether a largest power spectrum in the plurality of consecutive samples of the internet traffic sequence is greater than a threshold portion of a total power spectra of the plurality of consecutive samples of the internet traffic sequence,
  - when the largest power spectrum in the plurality of consecutive samples of the internet traffic sequence is greater than the threshold portion of the total power spectra,
    - determining that the plurality of consecutive samples of the internet traffic sequence comprises a consumer traffic component, and
    - removing, from the plurality of consecutive samples of the internet traffic sequence, any samples of the plurality of consecutive samples corresponding to the consumer traffic component,
  - calculating a mean and a variance of a remaining portion of the internet traffic sequence, wherein the remaining portion of the internet traffic sequence comprising the plurality of consecutive samples without any samples corresponding to the consumer traffic component,
  - setting, based upon the mean and the variance of the remaining portion of the internet traffic sequence, a threshold for detection of machine-to-machine traffic,
  - recording a series of time indices for samples in the remaining portion of the internet traffic sequence that are greater than the threshold for detection of machine-to-machine traffic,
  - computing time differences between adjacent time indices within the series of time indices,
  - creating a histogram using the time differences,
  - counting the histogram, and
  - when most occurrences in the histogram are in association with a specific time difference, determining that the remaining portion of the internet traffic sequence comprises a machine-to-machine-traffic component.
Dependent claims: 2, 3, 4, 5, 6, 7
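The frequency-domain test in claim 1 compares the largest power-spectrum value against a portion of the total power. A minimal sketch of that comparison follows, under several assumptions that the claim does not state: the portion defaults to 0.5, the DC bin is skipped (traffic volumes are non-negative, so bin 0 would otherwise dominate), and only the non-mirrored half of the spectrum is inspected. The names `power_spectrum` and `has_dominant_component` are hypothetical, and a direct DFT is used here for brevity instead of a recursive one. The removal step is not shown because the claim does not say how the matching samples are identified.

```python
import cmath
import math


def power_spectrum(samples):
    """Squared magnitude of each DFT bin (direct transform, for illustration only)."""
    n = len(samples)
    power = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples)
        )
        power.append(abs(coeff) ** 2)
    return power


def has_dominant_component(samples, portion=0.5, skip_dc=True):
    """True when one bin holds more than `portion` of the inspected power.

    `portion` and `skip_dc` are assumed parameters, not values from the claims.
    """
    power = power_spectrum(samples)
    n = len(power)
    first = 1 if skip_dc else 0
    # Real-valued input has a mirrored spectrum, so inspect bins up to n // 2 only.
    half = power[first : n // 2 + 1]
    total = sum(half)
    if total == 0:
        return False
    return max(half) > portion * total


# A strongly periodic sequence (period of 8 samples) concentrates power in one bin.
periodic = [10 + 5 * math.sin(2 * math.pi * t / 8) for t in range(32)]
# A single impulse spreads power evenly across all bins.
impulse = [1.0] + [0.0] * 31

periodic_flag = has_dominant_component(periodic)
impulse_flag = has_dominant_component(impulse)
```

Under these assumptions the periodic sequence passes the dominance test and the impulse does not, which is the kind of separation the claim relies on before the time-domain steps run.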
8. A computer-readable storage medium comprising computer-executable instructions that, when executed by a processor, cause the processor to perform operations comprising:
- receiving an Internet traffic sequence comprising non-malicious data packets and malicious data packets,
- extracting, from the Internet traffic sequence, a plurality of consecutive samples to be used for classification of the Internet traffic sequence,
- converting the plurality of consecutive samples of the Internet traffic sequence from a time domain to a frequency domain via a recursive discrete Fourier transform,
- determining whether a largest power spectrum in the plurality of consecutive samples of the internet traffic sequence is greater than a threshold portion of a total power spectra of the plurality of consecutive samples of the Internet traffic sequence,
- when the largest power spectrum in the plurality of consecutive samples of the Internet traffic sequence is greater than the threshold portion of the total power spectra,
  - determining that the plurality of consecutive samples of the Internet traffic sequence comprises a consumer traffic component, and
  - removing, from the plurality of consecutive samples of the Internet traffic sequence, any samples of the plurality of consecutive samples corresponding to the consumer traffic component,
- calculating a mean and a variance of a remaining portion of the Internet traffic sequence, wherein the remaining portion of the Internet traffic sequence comprising the plurality of consecutive samples without any samples corresponding to the consumer traffic component,
- setting, based upon the mean and the variance of the remaining portion of the Internet traffic sequence, a threshold for detection of machine-to-machine traffic,
- recording a series of time indices for samples in the remaining portion of the Internet traffic sequence that are greater than the threshold for detection of machine-to-machine traffic,
- computing time differences between adjacent time indices within the series of time indices,
- creating a histogram using the time differences,
- counting the histogram, and
- when most occurrences in the histogram are in association with a specific time difference, determining that the remaining portion of the internet traffic sequence comprises a machine-to-machine-traffic component.
Dependent claims: 9, 10, 11, 12, 13
14. A method comprising:
- receiving, by an internet traffic classification system comprising a processor, an internet traffic sequence comprising non-malicious data packets and malicious data packets;
- extracting, by the internet traffic classification system, from the internet traffic sequence, a plurality of consecutive samples to be used for classification of the internet traffic sequence;
- converting, by the internet traffic classification system, the plurality of consecutive samples of the internet traffic sequence from a time domain to a frequency domain via a recursive discrete Fourier transform;
- determining, by the internet traffic classification system, whether a largest power spectrum in the plurality of consecutive samples of the internet traffic sequence is greater than a threshold portion of a total power spectra of the plurality of consecutive samples of the internet traffic sequence;
- when the largest power spectrum in the plurality of consecutive samples of the internet traffic sequence is greater than the threshold portion of the total power spectra,
  - determining, by the internet traffic classification system, that the plurality of consecutive samples of the internet traffic sequence comprises a consumer traffic component, and
  - removing, by the internet traffic classification system, from the plurality of consecutive samples of the internet traffic sequence, any samples of the plurality of consecutive samples corresponding to the consumer traffic component;
- calculating, by the internet traffic classification system, a mean and a variance of a remaining portion of the internet traffic sequence, wherein the remaining portion of the internet traffic sequence comprising the plurality of consecutive samples without any samples corresponding to the consumer traffic component;
- setting, by the internet traffic classification system, based upon the mean and the variance of the remaining portion of the internet traffic sequence, a threshold for detection of machine-to-machine traffic;
- recording, by the internet traffic classification system, a series of time indices for samples in the remaining portion of the internet traffic sequence that are greater than the threshold for detection of machine-to-machine traffic;
- computing, by the internet traffic classification system, time differences between adjacent time indices within the series of time indices;
- creating, by the internet traffic classification system, a histogram using the time differences;
- counting, by the internet traffic classification system, the histogram; and
- when most occurrences in the histogram are in association with a specific time difference, determining, by the internet traffic classification system, that the remaining portion of the internet traffic sequence comprises a machine-to-machine-traffic component.
Dependent claims: 15, 16, 17, 18
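The time-domain steps shared by the three independent claims (mean and variance, a detection threshold, time indices of exceedances, adjacent differences, and a histogram of those differences) can be sketched as follows. The claims say only that the threshold is set "based upon" the mean and variance and that "most" occurrences fall on one time difference. This sketch assumes a threshold of mean plus `k` standard deviations and reads "most" as more than half of all differences. The function name `detect_periodic_bursts`, the parameters `k` and `majority`, and the sample data are hypothetical.

```python
from collections import Counter
from statistics import fmean, pvariance


def detect_periodic_bursts(samples, k=2.0, majority=0.5):
    """Return (is_periodic, dominant_gap) for a sequence of traffic samples.

    `k` and `majority` are assumed tuning parameters, not values from the claims.
    """
    mean = fmean(samples)
    variance = pvariance(samples, mu=mean)
    # Assumed threshold form: mean plus k standard deviations.
    threshold = mean + k * variance ** 0.5

    # Time indices of samples that exceed the threshold.
    indices = [t for t, x in enumerate(samples) if x > threshold]
    # Differences between adjacent time indices.
    gaps = [later - earlier for earlier, later in zip(indices, indices[1:])]
    if not gaps:
        return False, None

    # Histogram of the time differences and its most frequent entry.
    histogram = Counter(gaps)
    gap, count = histogram.most_common(1)[0]
    return count > majority * len(gaps), gap


# Hypothetical data: a burst every 10 samples on a flat baseline.
regular = [1.0] * 60
for t in range(5, 60, 10):
    regular[t] = 20.0

# Hypothetical data: bursts at irregular positions.
irregular = [1.0] * 60
for t in (3, 7, 18, 20, 41):
    irregular[t] = 20.0

regular_result = detect_periodic_bursts(regular)
irregular_result = detect_periodic_bursts(irregular)
```

With the regular data every adjacent difference equals 10, so one histogram bin holds all occurrences. With the irregular data each difference appears once, so no bin holds a majority.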
Specification