Methods and systems for eliminating data redundancies
Abstract
Methods, systems, and articles of manufacture consistent with the present invention eliminate data redundancies. A first data block identifier is obtained for a first data block, the first data block identifier being calculated based on data of the first data block. It is determined whether a second data block identifier matching the first data block identifier exists, the second data block identifier being calculated based on data of a second data block. When it is determined that the second data block identifier matching the first data block identifier exists, the first data block identifier is indicated as being redundant.
54 Claims
1. A method for eliminating data redundancies in a data processing system, the method comprising the steps of:
obtaining a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block;
determining whether a second data block identifier matching the first data block identifier exists, the second data block identifier being calculated based on data of a second data block; and
when it is determined that the second data block identifier matching the first data block identifier exists, indicating that the first data block identifier is redundant.
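The steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how the identifier is calculated; the use of SHA-256 here, and the names `block_identifier` and `is_redundant`, are assumptions made only for illustration. Note that a matching hash makes equal data extremely likely but does not strictly prove it.

```python
import hashlib
from typing import Set


def block_identifier(data: bytes) -> str:
    """Calculate an identifier from the block's data.

    SHA-256 is an assumption; the claim only requires that the
    identifier be calculated based on the data of the block.
    """
    return hashlib.sha256(data).hexdigest()


def is_redundant(first_block: bytes, known_identifiers: Set[str]) -> bool:
    """Return True when a matching identifier already exists."""
    first_id = block_identifier(first_block)
    return first_id in known_identifiers


# Identifiers previously calculated from other ("second") data blocks.
known = {block_identifier(b"hello world")}

dup = is_redundant(b"hello world", known)   # same data as a known block
new = is_redundant(b"other data", known)    # data not seen before
```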
13. A method in a data processing system having data blocks with associated identifiers, the method comprising the steps of:
receiving a request for a reference to a memory location that stores data, the request comprising the data;
creating a new identifier that is based on the data;
determining whether the new identifier is equivalent to one of the associated identifiers;
when it is determined that the new identifier is equivalent to one of the associated identifiers, returning a reference to the data block that is associated with the one associated identifier.
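One way to picture claim 13 is a store that hands back a reference to an existing block when the requested data has already been stored. The sketch below is hypothetical: `BlockStore`, `request_reference`, the use of list indices as references, and SHA-256 as the identifier calculation are all assumptions. The branch that stores a new block is not recited in the claim and is included only so the example runs end to end.

```python
import hashlib
from typing import Dict, List


class BlockStore:
    """Hypothetical store of data blocks with associated identifiers."""

    def __init__(self) -> None:
        self._blocks: List[bytes] = []    # stored data blocks
        self._index: Dict[str, int] = {}  # identifier -> reference (list index)

    def request_reference(self, data: bytes) -> int:
        """Handle a request for a reference; the request carries the data."""
        new_id = hashlib.sha256(data).hexdigest()  # new identifier based on the data
        if new_id in self._index:
            # Equivalent identifier found: return the existing block's reference.
            return self._index[new_id]
        # Assumption (outside the claim): otherwise store the data as a new block.
        self._blocks.append(data)
        ref = len(self._blocks) - 1
        self._index[new_id] = ref
        return ref

    def read(self, ref: int) -> bytes:
        return self._blocks[ref]


store = BlockStore()
ref_a = store.request_reference(b"payload")
ref_b = store.request_reference(b"payload")    # same data, same reference
ref_c = store.request_reference(b"different")  # new data, new reference
```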
15. A method for avoiding data redundancies in a data processing system, the method comprising the steps of:
obtaining a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block;
generating a memory allocation request for the first data block;
transmitting the memory allocation request to a redundancy handler, the memory allocation request instructing the redundancy handler to determine whether a second data block identifier matching the first data block identifier exists, wherein the second data block identifier is calculated based on data of a second data block; and
receiving an allocation response indicating whether the second data block identifier of the second data block exists.
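Claim 15 describes the requesting side of the exchange with a redundancy handler. As a rough sketch, the request and response can be modeled as small records, with "transmitting" reduced to a direct method call. The names `AllocationRequest`, `AllocationResponse`, `RedundancyHandler`, and `request_allocation`, the `size` field, and the SHA-256 calculation are hypothetical. Having the handler remember unseen identifiers is also an assumption that goes beyond the claim.

```python
import hashlib
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class AllocationRequest:
    block_id: str  # identifier calculated from the first data block
    size: int      # hypothetical extra field: bytes requested


@dataclass(frozen=True)
class AllocationResponse:
    identifier_exists: bool  # whether a matching identifier already exists


class RedundancyHandler:
    """Hypothetical handler that tracks identifiers of existing blocks."""

    def __init__(self) -> None:
        self._known_ids: Set[str] = set()

    def handle(self, request: AllocationRequest) -> AllocationResponse:
        exists = request.block_id in self._known_ids
        if not exists:
            # Assumption: remember the identifier for later requests.
            self._known_ids.add(request.block_id)
        return AllocationResponse(identifier_exists=exists)


def request_allocation(data: bytes, handler: RedundancyHandler) -> AllocationResponse:
    """Obtain the identifier, build the request, send it, return the response."""
    block_id = hashlib.sha256(data).hexdigest()
    request = AllocationRequest(block_id=block_id, size=len(data))
    return handler.handle(request)  # a direct call stands in for transmission


handler = RedundancyHandler()
first = request_allocation(b"block", handler)
second = request_allocation(b"block", handler)
```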
23. A method for eliminating data redundancies in a data processing system, the method comprising the steps of:
receiving a first data block;
calculating a first data block identifier based on data of the first data block;
determining whether a second data block identifier matching the first data block identifier exists in a list of other data block identifiers, the second data block identifier being calculated based on data of a second data block;
when it is determined that the second data block identifier matching the first data block identifier exists, deleting the first data block; and
when it is determined that the second data block identifier matching the first data block identifier does not exist, adding the first data block identifier to the list.
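The receive, calculate, check, and delete-or-add sequence of claim 23 can be sketched as a single pass over incoming blocks. This is an illustration under stated assumptions: `deduplicate` is a hypothetical name, SHA-256 stands in for the unspecified identifier calculation, and "deleting" a block is modeled simply as not keeping it.

```python
import hashlib
from typing import Iterable, List


def deduplicate(blocks: Iterable[bytes]) -> List[bytes]:
    """Keep the first occurrence of each block and drop later duplicates."""
    identifier_list: List[str] = []  # list of other data block identifiers
    kept: List[bytes] = []
    for block in blocks:  # receive a data block
        block_id = hashlib.sha256(block).hexdigest()  # calculate its identifier
        if block_id in identifier_list:
            # Matching identifier exists: the block is dropped ("deleted").
            continue
        # No match: add the identifier to the list and keep the block.
        identifier_list.append(block_id)
        kept.append(block)
    return kept


result = deduplicate([b"a", b"b", b"a", b"c", b"b"])
```

A plain list mirrors the claim's wording; a set or dictionary would make the membership check faster for large numbers of blocks.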
24. A computer-readable medium containing instructions that cause a data processing system to perform a method comprising the steps of:
obtaining a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block;
determining whether a second data block identifier matching the first data block identifier exists, the second data block identifier being calculated based on data of a second data block; and
when it is determined that the second data block identifier matching the first data block identifier exists, indicating that the first data block identifier is redundant.
36. A computer-readable medium containing instructions that cause a data processing system having blocks associated with identifiers to perform a method comprising the steps of:
receiving a request for a reference to a memory location that stores data, the request comprising the data;
creating a new identifier that is based on the data;
determining whether the new identifier is equivalent to one of the associated identifiers;
when it is determined that the new identifier is equivalent to one of the associated identifiers, returning a reference to the data block that is associated with the one associated identifier.
38. A computer-readable medium containing instructions that cause a data processing system to perform a method comprising the steps of:
obtaining a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block;
generating a memory allocation request for the first data block;
transmitting the memory allocation request to a redundancy handler, the memory allocation request instructing the redundancy handler to determine whether a second data block identifier matching the first data block identifier exists, wherein the second data block identifier is calculated based on data of a second data block; and
receiving an allocation response indicating whether the second data block identifier of the second data block exists.
46. A computer-readable medium containing instructions that cause a data processing system to perform a method comprising the steps of:
receiving a first data block;
calculating a first data block identifier based on data of the first data block;
determining whether a second data block identifier matching the first data block identifier exists in a list of other data block identifiers, the second data block identifier being calculated based on data of a second data block;
when it is determined that the second data block identifier matching the first data block identifier exists, deleting the first data block; and
when it is determined that the second data block identifier matching the first data block identifier does not exist, adding the first data block identifier to the list.
47. A data processing system comprising:
a secondary storage device having a stored data block with data;
a memory comprising a computer program that obtains a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block, determines whether a second data block identifier matching the first data block identifier exists, the second data block identifier being calculated based on data of a second data block, and when it is determined that the second data block identifier matching the first data block identifier exists, indicates that the first data block identifier is redundant; and
a processing unit that runs the computer program.
48. A data processing system comprising:
a secondary storage device having a stored data block with data;
a memory comprising a computer program that receives a request for a reference to a memory location that stores data, the request comprising the data, creates a new identifier that is based on the data, determines whether the new identifier is equivalent to one of the associated identifiers, and when it is determined that the new identifier is equivalent to one of the associated identifiers, returns a reference to the data block that is associated with the one associated identifier; and
a processing unit that runs the computer program.
49. A data processing system comprising:
a secondary storage device having a stored data block with data;
a memory comprising a computer program that obtains a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block, generates a memory allocation request for the first data block, transmits the memory allocation request to a redundancy handler, the memory allocation request instructing the redundancy handler to determine whether a second data block identifier matching the first data block identifier exists, wherein the second data block identifier is calculated based on data of a second data block, and receives an allocation response indicating whether the second data block identifier of the second data block exists; and
a processing unit that runs the computer program.
50. A data processing system for eliminating data redundancies, the data processing system comprising:
means for obtaining a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block;
means for determining whether a second data block identifier matching the first data block identifier exists, the second data block identifier being calculated based on data of a second data block; and
means for, when it is determined that the second data block identifier matching the first data block identifier exists, indicating that the first data block identifier is redundant.
51. A data processing system for eliminating data redundancies, the data processing system having data blocks with associated identifiers, the data processing system comprising:
means for receiving a request for a reference to a memory location that stores data, the request comprising the data;
means for creating a new identifier that is based on the data;
means for determining whether the new identifier is equivalent to one of the associated identifiers;
means for, when it is determined that the new identifier is equivalent to one of the associated identifiers, returning a reference to the data block that is associated with the one associated identifier.
52. A data processing system for eliminating data redundancies, the data processing system comprising:
means for obtaining a first data block identifier for a first data block, the first data block identifier being calculated based on data of the first data block;
means for generating a memory allocation request for the first data block;
means for transmitting the memory allocation request to a redundancy handler, the memory allocation request instructing the redundancy handler to determine whether a second data block identifier matching the first data block identifier exists, wherein the second data block identifier is calculated based on data of a second data block; and
means for receiving an allocation response indicating whether the second data block identifier of the second data block exists.
53. A data processing system for eliminating data redundancies, the data processing system comprising:
means for receiving a first data block;
means for calculating a first data block identifier based on data of the first data block;
means for determining whether a second data block identifier matching the first data block identifier exists in a list of other data block identifiers, the second data block identifier being calculated based on data of a second data block;
means for, when it is determined that the second data block identifier matching the first data block identifier exists, deleting the first data block; and
means for, when it is determined that the second data block identifier matching the first data block identifier does not exist, adding the first data block identifier to the list.
54. A computer-readable memory device encoded with a data structure and a program that accesses the data structure, the program is run by a processor in a data processing system, the data structure having a plurality of entries, each entry comprising:
a reference to a data block that contains data and an identifier that is based on the data using a calculation, wherein when the program receives a request to create a new data block containing new data, the program creates a new identifier based on the new data using the calculation and compares the new identifier to the identifiers in the entries to prevent a data block redundancy.
Specification