METHODS AND APPARATUS FOR DEDUPLICATION IN STORAGE SYSTEM
First Claim
1. A computerized data storage system comprising:
a. At least one host computer;
b. A management terminal; and
c. A storage system comprising:
i. An interface operable to communicate with the at least one host computer;
ii. A storage device comprising a plurality of data objects; and
iii. A deduplication controller operable to perform a deduplication of data stored in the storage device, wherein the deduplication controller maintains a threshold with respect to allowed degree of deduplication, counts a number of links for each data object and does not perform deduplication when the counted number of links for the data object exceeds the threshold even if duplication is detected.
Abstract
In one implementation, a computerized data storage system comprises host computers, a management terminal, and a storage system having a block interface to communicate with the host computers (clients). The storage system also incorporates a deduplication capability using chunks (divided storage areas). The storage system maintains a threshold (upper limit) on the degree of deduplication (i.e., the number of virtual data for one real data), specified by users or by the management software. The storage system counts the number of links for each chunk and does not perform deduplication when the number of reduced data for a chunk exceeds the threshold, even if duplication is detected. In another implementation, the storage system additionally incorporates a data migration capability and migrates physical data to a high-reliability area, such as an area protected with double parity (i.e., RAID6), when the deduplication level for a chunk exceeds the threshold.
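The first implementation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and field names (`ThresholdDedupStore`, `Chunk`, `links`, `index`, `virtual`), the use of SHA-256 fingerprints to detect duplicates, and the choice to compare the link count before adding the new link are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Chunk:
    data: bytes
    links: int = 0  # number of virtual chunks that point at this real chunk


class ThresholdDedupStore:
    """Toy chunk store (hypothetical names) that caps links per real chunk."""

    def __init__(self, threshold: int):
        self.threshold = threshold           # allowed degree of deduplication
        self.chunks: list[Chunk] = []        # real (physical) chunks
        self.index: dict[str, int] = {}      # fingerprint -> chunk id accepting links
        self.virtual: dict[int, int] = {}    # virtual address -> real chunk id

    def write(self, address: int, data: bytes) -> int:
        # Overwriting an address drops its old link (space reclamation omitted).
        old_id = self.virtual.get(address)
        if old_id is not None:
            self.chunks[old_id].links -= 1

        fingerprint = hashlib.sha256(data).hexdigest()
        chunk_id = self.index.get(fingerprint)
        duplicate = chunk_id is not None and self.chunks[chunk_id].data == data

        # Assumption: deduplicate only while the current link count does not
        # exceed the threshold; otherwise store a fresh real copy even though
        # a duplicate was detected.
        if not (duplicate and self.chunks[chunk_id].links <= self.threshold):
            self.chunks.append(Chunk(data))
            chunk_id = len(self.chunks) - 1
            self.index[fingerprint] = chunk_id

        self.chunks[chunk_id].links += 1
        self.virtual[address] = chunk_id
        return chunk_id


store = ThresholdDedupStore(threshold=2)
ids = [store.write(addr, b"same block") for addr in range(5)]
print(ids, [c.links for c in store.chunks])
```

With a threshold of 2, the fourth identical write finds a chunk whose link count already exceeds the threshold, so a second real copy is stored and later duplicates link to that copy instead. The loss of any one real chunk therefore affects only a bounded number of virtual chunks.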
29 Claims
1. A computerized data storage system comprising:
a. At least one host computer;
b. A management terminal; and
c. A storage system comprising:
i. An interface operable to communicate with the at least one host computer;
ii. A storage device comprising a plurality of data objects; and
iii. A deduplication controller operable to perform a deduplication of data stored in the storage device, wherein the deduplication controller maintains a threshold with respect to allowed degree of deduplication, counts a number of links for each data object and does not perform deduplication when the counted number of links for the data object exceeds the threshold even if duplication is detected.
Dependent claims: 2, 3, 4, 5, 6
7. A computerized data storage system comprising:
a. At least one host computer;
b. A management terminal; and
c. A storage system comprising:
i. An interface operable to communicate with the at least one host computer;
ii. A normal reliability storage area;
iii. A high reliability data storage area;
iv. A data migration controller operable to migrate data between the normal reliability storage area and the high reliability data storage area; and
v. A deduplication controller operable to perform deduplication of data stored in the normal reliability data storage area or the high reliability data storage area, wherein the deduplication controller maintains a threshold with respect to allowed degree of deduplication and counts a number of links for each object; and
wherein the deduplication controller is operable to cause the data migration controller to migrate a data object to the high reliability storage area when the counted number of links for the data object exceeds the threshold.
Dependent claims: 8, 9, 10, 11, 12, 13
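The migration behavior recited in claim 7 can be sketched in the same hedged way. Everything named below (`Area`, `DataObject`, `MigratingDedupController`, the `migrations` log) is hypothetical, objects are keyed by their content purely for brevity, and the migration itself is reduced to relabeling the object's area rather than moving data between RAID groups.

```python
from dataclasses import dataclass
from enum import Enum


class Area(Enum):
    NORMAL = "normal reliability"
    HIGH = "high reliability"  # e.g. an area protected with double parity


@dataclass
class DataObject:
    data: bytes
    links: int = 1
    area: Area = Area.NORMAL


class MigratingDedupController:
    """Toy controller: always deduplicates, migrates heavily linked objects."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.objects: dict[bytes, DataObject] = {}  # keyed by content (assumption)
        self.migrations: list[bytes] = []           # log of migrated contents

    def write(self, data: bytes) -> DataObject:
        obj = self.objects.get(data)
        if obj is None:
            obj = DataObject(data)        # first copy lands in the normal area
            self.objects[data] = obj
        else:
            obj.links += 1                # duplicate detected: add a link
        # Once the link count exceeds the threshold, ask the (simulated)
        # migration controller to move the object to the high reliability area.
        if obj.links > self.threshold and obj.area is Area.NORMAL:
            self._migrate(obj)
        return obj

    def _migrate(self, obj: DataObject) -> None:
        obj.area = Area.HIGH
        self.migrations.append(obj.data)


ctrl = MigratingDedupController(threshold=2)
for _ in range(3):
    shared = ctrl.write(b"popular block")
print(shared.links, shared.area.value)
```

Unlike a scheme that refuses to deduplicate past the threshold, this variant keeps the full space saving and instead spends extra protection on the objects whose loss would affect the most links.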
14. A method performed by a storage system comprising an interface operable to communicate with at least one host computer and at least one storage device comprising a plurality of data objects, the method comprising:
a. Determining whether a first data is duplicated in the at least one duplicate data object;
b. Maintaining a threshold with respect to allowed degree of deduplication;
c. Counting a number of links for the at least one duplicate data object;
d. If the first data is duplicated in the at least one duplicate data object and if the counted number of links does not exceed the threshold, performing deduplication of the data in the at least one duplicate data object; and
e. If the counted number of links exceeds the threshold, not performing the deduplication of the data in the at least one duplicate data object.
Dependent claims: 15, 16, 17, 18, 19
20. A method performed by a storage system comprising an interface operable to communicate with at least one host computer and at least one storage device comprising a plurality of data objects, the method comprising:
a. Determining whether a first data is duplicated in the at least one duplicate data object of the plurality of data objects;
b. Maintaining a threshold with respect to allowed degree of deduplication;
c. Counting a number of links for the at least one duplicate data object;
d. If the first data is duplicated in the at least one duplicate data object, performing deduplication of the data in the at least one duplicate data object; and
e. If the counted number of links exceeds the threshold, migrating the at least one duplicate data object to a high reliability storage area.
Dependent claims: 21, 22, 23, 24, 25
26. A computer-readable medium storing a set of instructions, the set of instructions, when executed by a storage system comprising an interface operable to communicate with at least one host computer and at least one storage device comprising a plurality of data objects, causing the storage system to:
a. Determine whether a first data is duplicated in the at least one duplicate data object;
b. Maintain a threshold with respect to allowed degree of deduplication;
c. Count a number of links for the at least one duplicate data object;
d. If the first data is duplicated in the at least one duplicate data object and if the counted number of links does not exceed the threshold, perform deduplication of the data in the at least one duplicate data object; and
e. If the counted number of links exceeds the threshold, not perform the deduplication of the data in the at least one duplicate data object.
Dependent claims: 27, 29
28. A computer-readable medium storing a set of instructions, the set of instructions, when executed by a storage system comprising an interface operable to communicate with at least one host computer and at least one storage device comprising a plurality of data objects, causing the storage system to:
a. Determine whether a first data is duplicated in the at least one duplicate data object of the plurality of data objects;
b. Maintain a threshold with respect to allowed degree of deduplication;
c. Count a number of links for the at least one duplicate data object;
d. If the first data is duplicated in the at least one duplicate data object, perform deduplication of the data in the at least one duplicate data object; and
e. If the counted number of links exceeds the threshold, migrate the at least one duplicate data object to a high reliability storage area.
Specification