Identifying and characterizing electronic files using a two-stage calculation
First Claim
1. A method comprising:
scanning a file directory to identify one or more suspect files stored in a file database;
calculating, by a first computing device, a first value corresponding to a file in the file directory, wherein the first value is calculated using a function on a first portion of the first file that is smaller than an entire size of the file;
the first computing device making an initial determination, based on the first value matching a particular value in a suspect files database, as to whether the file is a suspect file;
if the initial determination determines that the file is a suspect file;
the first computing device calculating a second value corresponding to the file, wherein the second value is calculated using the function on a second portion of the file that is larger than the first portion but smaller than the entire size of the file;
in response to the first computing device determining that the second value matches another particular value stored in the suspect files database;
the first computing device storing information identifying the file as a suspect file on a computer readable storage medium;
the first computing device generating a report that is based, at least in part, upon the identification of the file as a suspect file; and
the first computing device at least assisting in taking at least one remedial measure related to the file; and
if the initial determination determines that the file is not a suspect file;
the first computing device identifying the file as a suspect file based on a review of the file and storing the first value in the suspect files database.
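The two-stage check recited in claim 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`prefix_value`, `check_file`), the portion sizes, the use of SHA-256 as "the function", the in-memory set standing in for the suspect files database, and the `review` callback standing in for the "review of the file" are all assumptions introduced here. The sketch operates on bytes already in memory. A real scanner would presumably read only the needed portion from disk.

```python
import hashlib
from typing import Callable, List, Set

FIRST_PORTION = 8    # bytes; hypothetical size of the first portion
SECOND_PORTION = 32  # bytes; hypothetical, larger than FIRST_PORTION


def prefix_value(data: bytes, length: int) -> str:
    """Apply one function (SHA-256 here, an assumption) to the first `length` bytes."""
    return hashlib.sha256(data[:length]).hexdigest()


def check_file(
    name: str,
    data: bytes,
    suspect_db: Set[str],
    report: List[str],
    review: Callable[[bytes], bool],
) -> bool:
    """Return True if the file ends up identified as suspect."""
    # Both portions must be smaller than the entire file; this sketch
    # simply skips files that are too small (the claim does not cover them).
    if len(data) <= SECOND_PORTION:
        return False

    first = prefix_value(data, FIRST_PORTION)
    if first in suspect_db:
        # Initial determination: suspect. Confirm with the larger portion.
        second = prefix_value(data, SECOND_PORTION)
        if second in suspect_db:
            report.append(name)  # stand-in for storing, reporting, remediation
            return True
        # The claim is silent on a second-stage mismatch; treated as not suspect.
        return False

    # Initial determination: not suspect. A separate review may still flag
    # the file, in which case its first value is stored for future scans.
    if review(data):
        suspect_db.add(first)
        return True
    return False
```

The point of the design is that most files are dismissed after hashing only a small prefix, and the larger second portion is hashed only for files that pass the first stage.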
Abstract
A computer system includes a server having a memory connected thereto. The server is adapted to be connected to a network to permit remote storage and retrieval of data files from the memory. A file identification application is operative with the server to identify errant files stored in the memory. The file identification application provides the functions of: (1) selecting a file stored in said memory; (2) generating a unique checksum corresponding to the stored file; (3) comparing said unique checksum to each of a plurality of previously generated checksums, wherein the plurality of previously generated checksums correspond to known errant files; and (4) marking the file for deletion from the memory if the unique checksum matches one of the plurality of previously generated checksums.
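The four functions listed in the abstract (select, checksum, compare, mark) can be illustrated with a short sketch. It assumes a local directory in place of the server memory, SHA-256 as the checksum, and a plain set of hex digests as the collection of previously generated checksums. The names `file_checksum` and `mark_errant_files` are hypothetical and do not come from the patent.

```python
import hashlib
from pathlib import Path
from typing import List, Set


def file_checksum(path: Path) -> str:
    """Checksum of the whole file, read in chunks (SHA-256 is an assumption)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mark_errant_files(directory: Path, known_errant: Set[str]) -> List[Path]:
    """Return the files whose checksum matches a known errant checksum."""
    marked = []
    for path in sorted(directory.iterdir()):      # (1) select each stored file
        if not path.is_file():
            continue
        checksum = file_checksum(path)            # (2) generate its checksum
        if checksum in known_errant:              # (3) compare to known checksums
            marked.append(path)                   # (4) mark for deletion
    return marked
```

The sketch only returns the marked paths. Actual deletion is left to the caller so that a match can be reviewed first.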
21 Claims
10. A non-transitory computer-readable medium having stored thereon instructions that are executable by a processor to cause a first computer system to perform operations comprising:
scanning a file directory to identify one or more suspect files stored in a file database;
calculating a first checksum of a file in the file directory based on a first portion of the file that is smaller than an entire size of the file;
making an initial determination, based on the calculated first checksum, as to whether the first portion of the file has data corresponding to one or more particular characteristics;
if the initial determination determines that the first portion of the file has data corresponding to one or more particular characteristics;
calculating a second checksum on the file, wherein the second checksum is calculated on a second portion of the file that is larger than the first portion but smaller than the entire size of the file;
in response to determining that the second calculated checksum matches a checksum stored in the suspect files database;
storing information indicating that the file is a suspect file;
generating a report that is based, at least in part, upon the identification of the file as a suspect file; and
assisting in taking at least one remedial measure related to the first file; and
if the initial determination determines that the first portion of the file does not have data corresponding to one or more particular characteristics;
identifying the file as a suspect file based on a review of the file and storing the calculated first checksum in the suspect files database.
Dependent Claims (11, 12, 13, 14, 15)
16. A computer system, comprising:
a processor; and
a computer-readable storage medium having stored thereon instructions that are executable by the processor to cause the computer system to perform operations comprising;
scanning a file directory to identify at least one suspect content stored in a content database;
calculating a first value corresponding to a first content in the file directory, wherein the first value is calculated based on a first portion of the first content that is smaller than an entire size of the first content;
making an initial determination, based on the first value matching a particular value in a suspect content database corresponding to known content, as to whether the first content may have one or more particular characteristics;
if the initial determination determines that the first content may have the one or more particular characteristics;
calculating a second value corresponding to the first content, wherein the second value is calculated based on a second portion of the first content that is larger than the first portion but smaller than the entire size of the first content;
in response to determining the second value matches an additional value in the suspect content database;
generating a report that is based, at least in part, upon the one or more particular characteristics; and
based on the second value matching the additional value corresponding to the known content, performing a corrective action on the first content; and
if the initial determination determines that the first content does not have the one or more particular characteristics;
identifying the first content as having the one or more particular characteristics based on a review of the first content.
Dependent Claims (17, 18, 19, 20, 21)
Specification