Apparatus method and medium for tracing the origin of network transmissions using n-gram distribution of data
First Claim
1. A method of tracing the location of an origin computer system that initially transmits a suspect data payload across a computer network to an end target computer system, the computer network having a plurality of computer systems, where each of the computer systems maintains connection records of transmitted data it receives, the transmitted data and connection records including a previous computer system address, a data payload, and a next computer system address, the method comprising:
- creating, at each of the computer systems, a connection record for each transmission received from another computer system through the computer network;
- generating and storing a plurality of statistical distributions, wherein each of the plurality of statistical distributions is a byte frequency distribution of data contained in the data payload in the connection record for each transmission received from another computer system through the computer network;
- selecting a model byte value statistical distribution from a plurality of model byte frequency statistical distributions based on the length of the data contained in the data payload, wherein the model byte frequency statistical distribution is representative of normal payloads transmitted through the computer network;
- identifying the suspect data payload at the end target computer system based at least in part on differences detected between the statistical distribution associated with the suspect data payload and the selected model byte value statistical distribution and generating a suspect byte frequency distribution of data contained in the suspect data payload;
- setting the end target computer system as the suspect computer system;
- comparing the suspect byte frequency distribution of the data contained in the suspect data payload to the plurality of statistical distributions associated with the connection records;
- upon finding at least one of the plurality of statistical distributions that is similar to the suspect byte frequency distribution of the data contained in the suspect data payload, determining the previous computer system address associated with the at least one of the plurality of statistical distributions;
- setting the computer system associated with the previous computer system address as the suspect computer system; and
- repeating the comparing, the determining, and the setting until the origin computer system is determined.
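The generating, selecting, and identifying steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim only speaks of "differences" between distributions, so the Manhattan distance, the length bins used as model keys, the `threshold` value, and all function names here are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Distribution = List[float]


def byte_frequency(payload: bytes) -> Distribution:
    """Relative frequency of each of the 256 possible byte values."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)
    return [counts[b] / len(payload) for b in range(256)]


def distance(p: Distribution, q: Distribution) -> float:
    """Manhattan distance (an assumed metric; the claim names none)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def select_model(models: Dict[Tuple[int, int], Distribution],
                 length: int) -> Distribution:
    """Pick the model whose (low, high) length bin covers the payload length."""
    for (low, high), model in models.items():
        if low <= length < high:
            return model
    raise KeyError(f"no model covers payload length {length}")


def identify_suspect(payload: bytes,
                     models: Dict[Tuple[int, int], Distribution],
                     threshold: float) -> Tuple[bool, Distribution]:
    """Return (is_suspect, byte frequency distribution of the payload)."""
    observed = byte_frequency(payload)
    model = select_model(models, len(payload))
    return distance(observed, model) > threshold, observed


# Hypothetical models of "normal" traffic, keyed by payload-length bin.
normal_short = b"GET /index.html HTTP/1.1"
models = {
    (0, 64): byte_frequency(normal_short),
    (64, 1501): byte_frequency(b"<html><body>hello</body></html>" * 4),
}

flagged, suspect_dist = identify_suspect(bytes(range(32)), models, threshold=0.5)
print(flagged)  # True: byte values 0..31 never occur in the short-payload model
```

Keying the models by length bin mirrors the claim's requirement that the model be selected based on the length of the payload data; how the bins are chosen and how the models are trained is left open here.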
Abstract
A method, apparatus, and medium are provided for tracing the origin of network transmissions. Connection records are maintained at each computer system for storing source and destination addresses. The connection records also maintain a statistical distribution of the data in the payload being transmitted. The statistical distribution of a suspect payload can be compared to those stored in the connection records in order to identify the sender. The location of the sender can subsequently be determined from the source address stored in the matching connection record. The process can be repeated multiple times until the location of the original sender has been traced.
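The hop-by-hop tracing described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the `ConnectionRecord` layout, the host names, the Manhattan-distance similarity test, and the `tolerance` value are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


def byte_frequency(payload: bytes) -> List[float]:
    """Relative frequency of each of the 256 possible byte values."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)
    return [counts[b] / len(payload) for b in range(256)]


@dataclass
class ConnectionRecord:
    prev_addr: str             # system the transmission was received from
    next_addr: str             # system the transmission was forwarded to
    distribution: List[float]  # byte frequency distribution of the payload


def similar(p: List[float], q: List[float], tolerance: float) -> bool:
    """Assumed similarity test: Manhattan distance within a tolerance."""
    return sum(abs(a - b) for a, b in zip(p, q)) <= tolerance


def trace_origin(records_by_host: Dict[str, List[ConnectionRecord]],
                 end_target: str,
                 suspect_dist: List[float],
                 tolerance: float = 0.1) -> List[str]:
    """Walk back from the end target; return the path, origin last."""
    path = [end_target]
    suspect = end_target
    while True:
        # Look for a record at the current suspect that resembles the payload.
        match = next(
            (r for r in records_by_host.get(suspect, [])
             if similar(r.distribution, suspect_dist, tolerance)
             and r.prev_addr not in path),  # guard against routing loops
            None,
        )
        if match is None:
            return path  # no earlier sender found: treat this host as origin
        suspect = match.prev_addr
        path.append(suspect)


# Hypothetical network: origin -> relay-a -> relay-b -> target.
worm = b"\x90" * 40 + b"\xcc\xeb\xfe"
benign = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
records = {
    "target": [ConnectionRecord("relay-b", "target", byte_frequency(worm)),
               ConnectionRecord("client", "target", byte_frequency(benign))],
    "relay-b": [ConnectionRecord("relay-a", "target", byte_frequency(worm))],
    "relay-a": [ConnectionRecord("origin", "relay-b", byte_frequency(worm))],
    "origin": [],
}
trace = trace_origin(records, "target", byte_frequency(worm))
print(trace)  # ['target', 'relay-b', 'relay-a', 'origin']
```

Because only the distribution is compared, each system needs to keep a 256-entry vector per record rather than the payload itself; the sketch stops when the current suspect holds no matching inbound record.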
61 Citations
17 Claims
1. Independent method claim, recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.
15. A system for tracing the location of an origin computer system that initially transmits a suspect data payload across a computer network to an end target computer system, comprising:
- at least one computer system connected to said computer network for providing network service to one or more users, each said at least one computer systems maintaining connection records of transmitted data it receives, whereby the connection records include a previous computer system address, a data payload, and a next computer system address;
- said at least one computer system being configured to:
- create a connection record for each transmission received from another computer system through said computer network;
- generate and store a plurality of statistical distributions, wherein each of the plurality of statistical distributions is a byte frequency distribution of data contained in the data payload in the connection record for each transmission received from another computer system through the computer network;
- select a model byte value statistical distribution from a plurality of model byte frequency statistical distributions based on the length of the data contained in the data payload, wherein the model byte frequency statistical distribution is representative of normal payloads transmitted through the computer network;
- identify the suspect data payload at the end target computer system based at least in part on differences detected between the statistical distribution associated with the suspect data payload and the selected model byte value statistical distribution and generate a suspect byte frequency distribution of data contained in the suspect data payload;
- set the end target computer system as the suspect computer system;
- compare the suspect byte frequency distribution of the data contained in the suspect data payload to the plurality of statistical distributions associated with the connection records;
- upon finding at least one of the plurality of statistical distributions that is similar to the suspect byte frequency distribution of the data contained in the suspect data payload, determine the previous computer system address associated with the at least one of the plurality of statistical distributions;
- set the computer system associated with the previous computer system address as the suspect computer system; and
- repeat the comparison, the determination, and the setting until the origin computer system is determined.
16. A system for tracing the location of an origin computer system that initially transmits a suspect data payload across a computer network to an end target computer system, comprising:
- at least one computer system connected to said computer network for providing network service to one or more users, each said at least one computer systems maintaining connection records of transmitted data it receives, whereby the connection records include a previous computer system address, a data payload, and a next computer system address;
- means for creating, at each of the computer systems, a connection record for each transmission received from another computer system through the computer network;
- means for generating and storing a plurality of statistical distributions, wherein each of the plurality of statistical distributions is a byte frequency distribution of data contained in the data payload in the connection record for each transmission received from another computer system through the computer network;
- means for selecting a model byte value statistical distribution from a plurality of model byte frequency statistical distributions based on the length of the data contained in the data payload, wherein the model byte frequency statistical distribution is representative of normal payloads transmitted through the computer network;
- means for identifying the suspect data payload at the end target computer system based at least in part on differences detected between the statistical distribution associated with the suspect data payload and the selected model byte value statistical distribution and generating a suspect byte frequency distribution of data contained in the suspect data payload;
- means for setting the end target computer system as the suspect computer system;
- means for comparing the suspect byte frequency distribution of the data contained in the suspect data payload to the plurality of statistical distributions associated with the connection records;
- upon finding at least one of the plurality of statistical distributions that is similar to the suspect byte frequency distribution of the data contained in the suspect data payload, means for determining the previous computer system address associated with the at least one of the plurality of statistical distributions;
- means for setting the computer system associated with the previous computer system address as the suspect computer system; and
- means for repeating the comparison, the determination, and the setting until the origin computer system is determined.
17. A non-transitory computer readable medium carrying instructions executable by a computer for tracing the location of an origin computer system that initially transmits a suspect data payload across a computer network to an end target computer system, said instructions causing said computer to perform the acts of:
- creating, at each of a plurality of computer systems within said computer network, a connection record for each transmission received from another computer system through the computer network, wherein each connection record includes a previous computer system address, a data payload, and a next computer system address;
- generating and storing a plurality of statistical distributions, wherein each of the plurality of statistical distributions is a byte frequency distribution of data contained in the data payload in the connection record for each transmission received from another computer system through the computer network;
- selecting a model byte value statistical distribution from a plurality of model byte frequency statistical distributions based on the length of the data contained in the data payload, wherein the model byte frequency statistical distribution is representative of normal payloads transmitted through the computer network;
- identifying the suspect data payload at the end target computer system based at least in part on differences detected between the statistical distribution associated with the suspect data payload and the selected model byte value statistical distribution and generating a suspect byte frequency distribution of data contained in the suspect data payload;
- setting the end target computer system as the suspect computer system;
- comparing the suspect byte frequency distribution of the data contained in the suspect data payload to the plurality of statistical distributions associated with the connection records;
- upon finding at least one of the plurality of statistical distributions that is similar to the suspect byte frequency distribution of the data contained in the suspect data payload, determining the previous computer system address associated with the at least one of the plurality of statistical distributions;
- setting the computer system associated with the previous computer system address as the suspect computer system; and
- repeating (5)-(7) until the origin computer system is determined.
Specification