Systems and methods for file clustering, multi-drive forensic analysis and data protection
1 Assignment
0 Petitions
Abstract
A system and method for file clustering, multi-drive forensic analysis and protection of sensitive data. Multiple memory devices can store files. A module can extract characteristics from the stored files, identify similarities between the files based on the extracted characteristics and generate file clusters based on the identified similarities. A visual representation of the file clusters, which can be generated to show the identified similarities among the files, can be displayed by a user interface module.
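The drive-level view described here (drives as nodes, with connections reflecting file similarity) can be illustrated with a minimal sketch. It rests on assumptions not stated in the source: files are matched only by an exact SHA-256 digest of their content, similarity between two drives is the count of shared digests, and line width scales linearly with that count. The names `content_digest`, `drive_edges`, and `max_width` are hypothetical.

```python
import hashlib
from itertools import combinations


def content_digest(data: bytes) -> str:
    """Exact content hash; a stand-in for whatever content-based hash is used."""
    return hashlib.sha256(data).hexdigest()


def drive_edges(drives: dict[str, list[bytes]], max_width: float = 10.0) -> dict[tuple[str, str], float]:
    """Return {(drive_a, drive_b): line_width} for every pair of drives sharing files.

    Assumption: similarity between two drives is the number of identical
    file contents they have in common, scaled so the strongest link
    gets `max_width`.
    """
    digests = {name: {content_digest(f) for f in files} for name, files in drives.items()}
    shared = {}
    for a, b in combinations(sorted(digests), 2):
        count = len(digests[a] & digests[b])
        if count:  # drives with nothing in common get no connecting line
            shared[(a, b)] = count
    strongest = max(shared.values(), default=0)
    return {pair: max_width * count / strongest for pair, count in shared.items()}


if __name__ == "__main__":
    drives = {
        "A": [b"report", b"photo", b"memo"],
        "B": [b"report", b"photo"],
        "C": [b"memo"],
    }
    for (a, b), width in drive_edges(drives).items():
        print(f"{a} -- {b}: width {width:.1f}")
```

The resulting widths could be passed to any graph-drawing tool; the drawing step is left out because the source does not name one.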
29 Citations
23 Claims
1. A multi-drive forensic data analysis system comprising:
- a plurality of memory devices having files stored thereon;
- at least one module configured to receive the files stored on the plurality of memory devices and extract characteristics of the files stored on the plurality of memory devices;
- a clustering module configured to:
  - receive the extracted characteristics;
  - identify similarities between the files stored on the plurality of memory devices, based on the extracted characteristics, using a two-stage algorithm wherein at least one stage of the two-stage algorithm includes content-based hashing;
  - generate file clusters based on the identified similarities among the files stored on the plurality of memory devices; and
  - generate a visual representation of the memory devices and connections therebetween based on the identified similarities among the files stored on the plurality of memory devices, the visual representation comprising:
    - nodes that correspond to the memory devices; and
    - lines connecting the nodes, each of the lines having a thickness representing the identified similarities; and
- a user interface module for displaying the visual representation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
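The claim above recites a two-stage algorithm with content-based hashing in at least one stage, but does not say what the stages are. The sketch below shows one possible reading, offered as an assumption rather than the patented method: stage 1 groups exact duplicates by a full-content SHA-256 digest, and stage 2 merges near-duplicates by comparing hashed 8-byte windows (shingles) with Jaccard similarity. The function names, the window size `k`, and the `threshold` value are all hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingle_hashes(data: bytes, k: int = 8) -> set[str]:
    """Hash every k-byte window of the content (whole content if shorter than k)."""
    starts = range(max(1, len(data) - k + 1))
    return {hashlib.sha1(data[i:i + k]).hexdigest() for i in starts}


def jaccard(a: set, b: set) -> float:
    """Share of fingerprint entries the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_files(files: dict[str, bytes], threshold: float = 0.5) -> list[list[str]]:
    """Cluster file paths; `files` maps a path (including its drive) to content."""
    # Stage 1 (assumed): exact duplicates share one full-content digest.
    by_digest = defaultdict(list)
    for path, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(path)

    # Stage 2 (assumed): compare one representative per digest by shingles.
    prints = {d: shingle_hashes(files[paths[0]]) for d, paths in by_digest.items()}
    parent = {d: d for d in by_digest}

    def find(d: str) -> str:
        # Union-find lookup with path halving.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in combinations(by_digest, 2):
        if jaccard(prints[a], prints[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for digest, paths in by_digest.items():
        clusters[find(digest)].extend(paths)
    return sorted(sorted(c) for c in clusters.values())
```

Running the cheap exact-match stage first means the pairwise comparison in stage 2 only touches distinct contents, which matters when the same file is copied across many drives.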
18. A multi-drive forensic data analysis system comprising:
- a plurality of memory devices having files stored thereon;
- a memory storing a reference set of files including one or more sensitivity designations based on one or more characteristics of the reference set of files;
- a processor configured to:
  - identify and cluster the plurality of files stored on the plurality of memory devices based on the one or more characteristics between the plurality of files and one or more files from the reference set of files using a two-stage algorithm wherein at least one stage of the two-stage algorithm includes content-based hashing;
  - tag the plurality of files stored on the plurality of memory devices with the same sensitivity designation as one or more of the files from the reference set; and
  - generate a visual representation of the memory devices and connections therebetween based on the sensitivity designations of the files and which one of the plurality of memory devices the files are stored on, the visual representation comprising:
    - nodes that correspond to the memory devices; and
    - lines connecting the nodes, each of the lines having a thickness representing the identified similarities, and
- a user interface module for displaying the visual representation.
Dependent claims: 19, 20, 21, 22, 23
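The tagging step in the claim above, where files on the drives inherit the sensitivity designation of matching reference files, can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: a match means the Jaccard similarity of hashed 8-byte windows meets a threshold, and each file takes the label of its best-matching reference file. `fingerprint`, `tag_files`, and the example labels are hypothetical.

```python
import hashlib
from typing import Optional


def fingerprint(data: bytes, k: int = 8) -> set[str]:
    """Content-based fingerprint: hashes of every k-byte window of the data."""
    starts = range(max(1, len(data) - k + 1))
    return {hashlib.sha1(data[i:i + k]).hexdigest() for i in starts}


def tag_files(
    files: dict[str, bytes],
    reference: list[tuple[bytes, str]],
    threshold: float = 0.5,
) -> dict[str, Optional[str]]:
    """Map each file path to the designation of its best reference match.

    `reference` holds (content, sensitivity designation) pairs. A file whose
    best similarity falls below `threshold` is left untagged (None).
    """
    ref_prints = [(fingerprint(data), label) for data, label in reference]
    tags: dict[str, Optional[str]] = {}
    for path, data in files.items():
        fp = fingerprint(data)
        best_label, best_score = None, threshold
        for ref_fp, label in ref_prints:
            score = len(fp & ref_fp) / len(fp | ref_fp)
            if score >= best_score:
                best_label, best_score = label, score
        tags[path] = best_label
    return tags


if __name__ == "__main__":
    reference = [(b"Q3 payroll: alice 5000, bob 6000, carol 7000", "confidential")]
    files = {
        "usb1/payroll_copy.txt": b"Q3 payroll: alice 5000, bob 6000, carol 7500",
        "usb2/notes.txt": b"remember to buy milk and eggs on the way home",
    }
    print(tag_files(files, reference))
```

Grouping the resulting tags by drive would give the per-drive sensitivity counts that the visual representation in this claim is based on.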
Specification