System and method for using file hashes to track data leakage and document propagation in a network
Abstract
The system and method described herein may use file hashes to track data leakage and document propagation in a network. For example, file systems associated with known reference systems and various user devices may be compared to classify the user devices into various groups based on differences between the respective file systems, identify files unique to the various groups, and detect potential data leakage or document propagation if user devices classified in certain groups include any files that are unique to other groups. Additionally, various algorithms may track locations, movements, changes, and other events that relate to normal or typical activity in the network, which may be used to generate statistics that can be compared to subsequent activities that occur in the network to detect potentially anomalous activity that may represent potential data leakage or document propagation.
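The statistical comparison mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are used, so the choice below (mean and standard deviation of daily file-movement counts, with a z-score threshold) is purely an assumption for illustration, as are the names `build_baseline` and `is_anomalous` and the sample numbers.

```python
from statistics import mean, stdev

def build_baseline(daily_counts):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return mean(daily_counts), stdev(daily_counts)

def is_anomalous(count, baseline, threshold=3.0):
    """Flag a count more than `threshold` standard deviations above the mean.

    The z-score rule and the default threshold are assumptions, not
    something the abstract specifies.
    """
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the baseline: any increase counts as unusual.
        return count > mu
    return (count - mu) / sigma > threshold

# Hypothetical history: files moved per day during normal operation.
history = [12, 15, 11, 14, 13, 12, 16]
baseline = build_baseline(history)

typical_day = is_anomalous(14, baseline)   # close to the historical mean
burst_day = is_anomalous(90, baseline)     # far above anything in the history
print(typical_day, burst_day)
```

A real deployment would presumably track several event types per device or per group rather than a single count; the sketch only shows the baseline-then-compare pattern.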
20 Claims
1. A system for using file hashes to track data leakage and document propagation in a network, comprising:
one or more physical processors programmed to execute computer program instructions which, when executed, cause the physical processors to:
obtain a set of hashes that are associated with files of a user device of a set of user devices, and a reference set of hashes that are associated with files of a reference system, wherein the reference system is limited to files authorized to be on all devices of the set of user devices;
determine an additional subset of hashes included in the set of hashes and not included in the reference set of hashes based on a comparison between the set of hashes and the reference set of hashes;
classify the user device into a group based on the additional subset of hashes comprising a hash that is the same as a hash associated with a file of at least another user device classified into the group;
predict that the file associated with the same hash is exclusive for the group to which the user device is classified;
scan one or more other user devices not classified into the group to determine what files are on the other user devices;
generate an alert indicating unauthorized file access, wherein the alert is generated responsive to the scan indicating that the other user devices contain the file predicted to be exclusive for the group to which the user device is classified; and
deliver the alert to a user.
11. A method for using file hashes to track data leakage and document propagation in a network, the method being implemented on a computer system that includes one or more physical processors executing computer program instructions which, when executed, perform the method, the method comprising:
obtaining, by the physical processors, a set of hashes that are associated with files of a user device of a set of user devices, and a reference set of hashes that are associated with files of a reference system, wherein the reference system is limited to files authorized to be on all devices of the set of user devices;
determining, by the physical processors, an additional subset of hashes included in the set of hashes and not included in the reference set of hashes based on a comparison between the set of hashes and the reference set of hashes;
classifying, by the physical processors, the user device into a group based on the additional subset of hashes comprising a hash that is the same as a hash associated with a file of at least another user device classified into the group;
predicting, by the physical processors, that the file associated with the same hash is exclusive for the group to which the user device is classified;
scanning, by the physical processors, one or more other user devices not classified into the group to determine what files are on the other user devices;
generating, by the physical processors, an alert indicating unauthorized file access responsive to the scan indicating that the other user devices contain the file predicted to be exclusive for the group to which the user device is classified; and
delivering, by the physical processors, the alert to a user.
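The steps recited in the independent claims can be sketched with ordinary set operations. This is an illustrative simplification, not the claimed implementation: it assumes groups are derived from a baseline snapshot by treating every non-reference hash found on at least two devices as defining its own group, and that a later scan is checked against those groups. The function names, device names, and placeholder hash strings are all hypothetical.

```python
from collections import defaultdict

def extra_hashes(device_hashes, reference_hashes):
    """Hashes present on the device but absent from the reference system."""
    return set(device_hashes) - set(reference_hashes)

def predict_exclusive(baseline, reference_hashes):
    """Map each shared non-reference hash to the group of devices holding it.

    Assumption: a hash seen on at least two devices in the baseline
    defines a group, and the file is predicted exclusive to that group.
    """
    holders = defaultdict(set)
    for device, hashes in baseline.items():
        for h in extra_hashes(hashes, reference_hashes):
            holders[h].add(device)
    return {h: group for h, group in holders.items() if len(group) >= 2}

def scan_for_leaks(current, exclusive):
    """Return one alert per exclusive file found on a device outside its group."""
    alerts = []
    for device, hashes in sorted(current.items()):
        for h in sorted(set(hashes) & set(exclusive)):
            if device not in exclusive[h]:
                alerts.append(
                    f"ALERT: {device} holds {h}, "
                    f"predicted exclusive to {sorted(exclusive[h])}"
                )
    return alerts

# Placeholder strings stand in for real file hashes.
reference = {"h_os", "h_office"}
baseline = {
    "eng-1": {"h_os", "h_office", "h_design"},
    "eng-2": {"h_os", "h_design"},
    "sales-1": {"h_os", "h_office"},
}
exclusive = predict_exclusive(baseline, reference)

# Later scan: the design file has appeared on a device outside the group.
current = dict(baseline)
current["sales-1"] = {"h_os", "h_office", "h_design"}
alerts = scan_for_leaks(current, exclusive)
for alert in alerts:
    print(alert)  # stand-in for delivering the alert to a user
```

Separating the baseline snapshot from the later scan matters in this sketch: if groups were recomputed from the current scan, any device holding the file would simply join the group and no alert would ever fire.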
Specification