Apparatus, methods and computer programs for controlling performance of operations within a data processing system or network
Abstract
Provided are methods, apparatus and computer programs for identifying matching resources (data files and executable files) within a data processing network, by comparison of hash values computed for each of a set of resources. A match between a newly computed hash value and a previously computed hash value for a resource indicates that the resource has not changed since the previous computation. A match between hash values for different resources indicates that they are identical. The result of the comparison can be used to determine whether a virus scan is currently required for a resource, on the basis that a resource which is unchanged since it was classified virus-free remains virus-free and a resource which is identical to a virus-scanned resource does not require duplication of the virus scan. The methods, apparatus and computer programs enable more efficient use of antivirus scanning or management of a backup copy process.
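The two kinds of match described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for the hash function (the abstract does not name one), resources are modeled as in-memory byte strings, and the names `digest` and `plan_scans` are hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a hex digest of a resource's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def plan_scans(resources: dict[str, bytes], clean_hashes: set[str]) -> dict[str, str]:
    """Decide, per resource, whether a virus scan is needed.

    resources    -- hypothetical mapping of resource name to current content
    clean_hashes -- digests recorded when content was classified virus-free
    """
    plan: dict[str, str] = {}
    first_seen: dict[str, str] = {}  # digest -> first resource seen with that content
    for name in sorted(resources):
        h = digest(resources[name])
        if h in clean_hashes:
            # Match with a previously stored hash: the resource is unchanged.
            plan[name] = "skip"
        elif h in first_seen:
            # Match between different resources: they are identical,
            # so one scan result can serve both.
            plan[name] = f"reuse result of {first_seen[h]}"
        else:
            first_seen[h] = name
            plan[name] = "scan"
    return plan
```

With four resources of which one is unchanged and two are identical to each other, only two scans would be planned.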
38 Claims
1. A method for controlling performance of an operation in relation to a set of resources within a data processing network, comprising the steps of:
computing a set of hash values representing a set of resources for which an operation has been performed;
storing the set of hash values;
in response to a requirement for performance of the operation, computing a new set of hash values representing the set of resources;
comparing the new hash values with the stored hash values for the set of resources to identify matches between new hash values and stored hash values;
determining that performance of the operation is not currently required for resources for which a match is identified between the respective new hash value and a stored hash value; and
performing the operation for resources for which no match is identified between the new hash value and any stored hash value.
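The steps of claim 1 can be sketched for a generic operation as follows. This is a minimal illustration, not the patented implementation: SHA-256, the in-memory set of stored hashes, and the name `control_operation` are all assumptions, and recording the new hash immediately after the operation succeeds is one possible reading of the storing step.

```python
import hashlib
from typing import Callable


def control_operation(
    resources: dict[str, bytes],
    stored_hashes: set[str],
    operation: Callable[[str, bytes], None],
) -> list[str]:
    """Perform `operation` only for resources whose new hash matches no stored hash.

    Returns the names of the resources the operation was performed on.
    """
    performed: list[str] = []
    for name, content in resources.items():
        new_hash = hashlib.sha256(content).hexdigest()  # hash choice is an assumption
        if new_hash in stored_hashes:
            continue  # match identified: operation not currently required
        operation(name, content)
        stored_hashes.add(new_hash)  # assumption: store the hash once the operation is done
        performed.append(name)
    return performed
```

On a second pass over the same resources, only those whose content has changed in the meantime would be passed to `operation`.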
15. A method for controlling scanning for computer viruses within a data processing network, comprising the steps of:
computing a set of hash values representing a set of resources which have been determined to be virus-free;
storing the set of hash values;
in response to a requirement for a virus check, computing a new set of hash values representing the set of resources;
comparing the new hash values with the stored hash values for the set of resources to identify matches between new hash values and stored hash values;
determining that no virus scan is currently required for resources for which a match is identified between the new hash value and a stored hash value; and
performing a virus scan for each resource for which no match is identified between the new hash value and any stored hash value.
24. A method for controlling scanning for computer viruses within a data processing network, comprising the steps of:
for a set of resources determined to be virus free, storing a set of hash values derived from the set of resources;
in response to a subsequent requirement for a virus check, comparing a new set of hash values derived from the set of resources with the stored set of hash values to identify matches between new and stored hash values;
identifying the resources for which the new hash values match stored hash values, and determining that a virus scan is not currently required for the identified resources for which the new hash values match the stored hash values; and
initiating a virus scan for resources for which the new hash values do not match stored hash values.
30. A data processing apparatus comprising:
a data processing unit;
a data storage unit; and
a repository manager configured to store, in at least one repository within the data storage unit, a set of hash values derived from a set of resources determined to be virus free; and
a virus scan coordinator for comparing a new set of hash values derived from the set of resources with the stored set of hash values to identify matches between the new hash values and stored hash values, for identifying resources for which the respective new hash values match stored hash values, for initiating a virus scan for resources for which the respective new hash values do not match stored hash values and for controlling the repository manager to store in the repository an indication that a virus scan is not currently required for at least some of the identified resources.
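The division of responsibility between the repository manager and the virus scan coordinator in claim 30 might be modeled as below. This is a sketch under assumptions: the class and method names are hypothetical, the repository is a pair of in-memory collections, SHA-256 is an assumed hash function, and `scanner` is a stand-in callable that returns True when content is clean.

```python
import hashlib
from typing import Callable


class RepositoryManager:
    """Holds hashes of virus-free resources and per-resource scan indications."""

    def __init__(self) -> None:
        self.clean_hashes: set[str] = set()
        self.scan_not_required: dict[str, bool] = {}

    def store_hash(self, hash_value: str) -> None:
        self.clean_hashes.add(hash_value)

    def store_indication(self, name: str, not_required: bool) -> None:
        self.scan_not_required[name] = not_required


class VirusScanCoordinator:
    """Compares new hashes with stored ones and initiates scans on mismatch."""

    def __init__(self, repo: RepositoryManager, scanner: Callable[[bytes], bool]) -> None:
        self.repo = repo
        self.scanner = scanner  # hypothetical scanner: True means "clean"

    def check(self, resources: dict[str, bytes]) -> None:
        for name, content in resources.items():
            new_hash = hashlib.sha256(content).hexdigest()
            if new_hash in self.repo.clean_hashes:
                # Match: record that no scan is currently required.
                self.repo.store_indication(name, True)
                continue
            clean = self.scanner(content)  # no match: initiate a scan
            if clean:
                self.repo.store_hash(new_hash)
            # Assumption: a resource that fails the scan stays flagged for scanning.
            self.repo.store_indication(name, clean)
```

Keeping storage behind `RepositoryManager` means the coordinator never touches the repository layout directly, which mirrors the claim's wording that the coordinator controls the repository manager.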
31. A data processing apparatus comprising:
a data processing unit;
a data storage unit; and
a repository manager configured to store a set of hash values in at least one repository within the data storage unit, wherein the hash values have been derived from a set of resources for which the operation has been performed; and
a coordinator for coordinating performance of an operation, for comparing a new set of hash values derived from the set of resources with the stored set of hash values to identify matches between the new hash values and stored hash values, for identifying resources for which the respective new hash values match stored hash values, and for controlling the repository manager to record in the repository an indication that at least some of the identified resources do not currently require performance of the operation.
32. A distributed data processing system comprising:
a first data processing apparatus comprising a data processing unit;
a data storage unit;
a repository manager configured to store a set of hash values derived from a set of resources in at least one repository within the data storage unit; and
a virus scan coordinator for comparing a new set of hash values derived from the set of resources with the stored set of hash values to identify matches between the new hash values and stored hash values, for identifying resources for which the respective new hash values match stored hash values, and for controlling the repository manager to store in the repository an indication that at least some of the identified resources do not currently require a virus scan;
and at least a second data processing apparatus comprising a data processing unit;
a data storage unit for storing at least one resource of the set of resources; and
a hash value generator, the hash value generator being configured to compute a set of hash values for said at least one resource and to send the set of hash values to the coordinator.
33. A distributed data processing network comprising:
at least a first data processing system comprising a data processing unit;
a data storage unit;
a repository manager configured to store a set of hash values derived from a set of resources in at least one repository within the data storage unit; and
a coordinator for coordinating performance of an operation, for comparing a new set of hash values derived from the set of resources with the stored set of hash values to identify matches between the new hash values and stored hash values, for identifying resources for which the respective new hash values match stored hash values, and for controlling the repository manager to record in the repository an indication that at least some of the identified resources do not currently require performance of the operation; and
at least a second data processing system comprising a data processing unit;
a data storage unit for storing at least one resource of the set of resources; and
a hash value generator, the hash value generator being configured to compute a set of hash values for said at least one resource and to send the set of hash values to the coordinator.
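The distributed arrangement of claim 33 can be illustrated with a hash value generator that runs where the resources are stored and sends only hash values to the coordinator. The sketch below simulates the transfer with a JSON string instead of a real network connection; the message layout, the names `generate_hash_report`, `Coordinator`, and `record_performed`, and the use of SHA-256 are all assumptions.

```python
import hashlib
import json


def generate_hash_report(system_id: str, local_resources: dict[str, bytes]) -> str:
    """Runs on the second system: hash local resources and serialize the hashes only."""
    hashes = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in local_resources.items()
    }
    return json.dumps({"system": system_id, "hashes": hashes})


class Coordinator:
    """Runs on the first system: compares received hashes with the repository."""

    def __init__(self) -> None:
        self.repository: set[str] = set()  # hashes for which the operation was performed
        self.not_required: dict[tuple[str, str], bool] = {}

    def record_performed(self, hash_value: str) -> None:
        self.repository.add(hash_value)

    def receive(self, message: str) -> list[str]:
        """Record an indication per resource; return names still needing the operation."""
        report = json.loads(message)
        pending: list[str] = []
        for name, hash_value in sorted(report["hashes"].items()):
            match = hash_value in self.repository
            self.not_required[(report["system"], name)] = match
            if not match:
                pending.append(name)
        return pending
```

Because only hash values cross between systems, a resource already processed on one system can be recognized on another without transferring its content.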
36. A computer program product, comprising program code recorded on a recording medium, for controlling the performance of operations on a data processing system on which the program code executes, wherein the program code comprises:
a repository manager configured to store a set of hash values in at least one repository, for a set of resources determined to be virus free; and
a virus scan coordinator for comparing a new set of hash values derived from the set of resources with the stored set of hash values to identify matches between the new hash values and stored hash values, for identifying resources for which the respective new hash values match stored hash values, and for controlling the repository manager to store in the repository an indication that at least some of the identified resources do not currently require a virus scan.
37. A method for controlling scanning for computer viruses within a data processing network, comprising the steps of:
installing a set of virus-free decoy resources on a data processing system;
storing a set of hash values derived from the set of decoy resources;
in response to a subsequent requirement for a virus check, comparing a new set of hash values derived from the set of decoy resources with the stored set of hash values to identify matches between new and stored hash values;
classifying decoy resources for which the new hash values match stored hash values as virus-free; and
classifying decoy resources for which the new hash values do not match any stored hash values as contaminated.
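The decoy-based check of claim 37 can be sketched as follows. This is an illustration under assumptions: decoys are modeled as named byte strings, SHA-256 is an assumed hash function, and `install_decoys` and `classify_decoys` are hypothetical names.

```python
import hashlib


def install_decoys(decoys: dict[str, bytes]) -> set[str]:
    """Return the hashes to store for a set of known virus-free decoy resources."""
    return {hashlib.sha256(content).hexdigest() for content in decoys.values()}


def classify_decoys(decoys: dict[str, bytes], stored_hashes: set[str]) -> dict[str, str]:
    """Classify each decoy by comparing its new hash with the stored hashes."""
    result: dict[str, str] = {}
    for name, content in decoys.items():
        new_hash = hashlib.sha256(content).hexdigest()
        # Any change to a decoy changes its hash, so a mismatch is treated
        # as contamination without running a scan.
        result[name] = "virus-free" if new_hash in stored_hashes else "contaminated"
    return result
```

A decoy whose content has been altered since installation would no longer match any stored hash and would be classified as contaminated.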
Specification