System and method for application aware de-duplication of data blocks in a virtualized storage array
First Claim
1. A method for de-duplication of data, the method comprising:
creating a master list of metadata for a plurality of data blocks, wherein the master list is ordered according to a number of occurrences of each respective data block of the plurality of data blocks within a storage array;
creating a first sublist of metadata, from the master list of metadata, for a first subset of the plurality of data blocks based on the first subset being duplicated more than a second subset of the plurality of the data blocks;
providing the first sublist of metadata to a first component of a networked storage system;
determining whether a data block being written has a corresponding entry in the master list of metadata based on a determination that the data block being written does not have any corresponding entry in the first sublist of metadata; and
performing an action selected from a group consisting of:
replacing the data block being written with a pointer when it is determined that the data block being written has a corresponding entry in the master list of metadata; and
writing the data block to the storage array when it is determined that the data block being written does not have any corresponding entry in the master list of metadata.
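The first two steps of the claim above (a master list ordered by occurrence count, and a sublist drawn from its most-duplicated entries) can be sketched roughly as follows. This is an illustration only, not the patented implementation: the SHA-256 fingerprint standing in for per-block "metadata", the function names, and the fixed `size` cutoff are all assumptions introduced here.

```python
import hashlib
from collections import Counter


def fingerprint(block: bytes) -> str:
    # Assumption: a SHA-256 digest stands in for the claim's per-block metadata.
    return hashlib.sha256(block).hexdigest()


def build_master_list(blocks):
    """Return (fingerprint, occurrences) pairs, most-duplicated first."""
    counts = Counter(fingerprint(b) for b in blocks)
    return counts.most_common()


def build_sublist(master_list, size):
    """Take the `size` most-duplicated fingerprints from the master list.

    Assumption: a fixed cutoff; the claim only requires that the first
    subset be duplicated more than a second subset.
    """
    return [fp for fp, _ in master_list[:size]]


# Hypothetical block contents, used only to exercise the functions.
stored = [b"os-page", b"os-page", b"os-page", b"db-page", b"db-page", b"log-entry"]
master = build_master_list(stored)
hot = build_sublist(master, size=1)
```

Here `hot` holds only the fingerprint of the block that occurs three times, which is the kind of short list that could be handed to a component with limited memory.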
Abstract
A system and method for application aware de-duplication of data blocks in a virtualized storage array is disclosed. In one embodiment, in a method of de-duplication of data, a master list of metadata is created based on a number of occurrences of data blocks within a storage array. A first sublist of metadata is created from the master list of metadata and provided to a first component of a networked storage system. When a data block being written is determined to have no corresponding entry in the first sublist of metadata, it is determined whether the data block has a corresponding entry in the master list of metadata. If it does, the data block being written is replaced with a pointer.
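The write path described in the abstract is a two-tier lookup: the sublist is checked first, and the master list is consulted only after a sublist miss. A minimal sketch, under several assumptions: metadata is modeled as a dictionary from a SHA-256 fingerprint to a block address, a sublist hit is treated as producing a pointer (the abstract only spells out the miss path), and a newly written block is added to the master list. All names are hypothetical.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # Assumption: a SHA-256 digest stands in for per-block metadata.
    return hashlib.sha256(block).hexdigest()


def write_block(block, sublist, master, array):
    """Return ("pointer", address) for a duplicate or ("written", address) for a new block."""
    fp = fingerprint(block)
    # Tier 1: the sublist held by the first component (assumed to yield a pointer on a hit).
    if fp in sublist:
        return ("pointer", sublist[fp])
    # Tier 2: the master list, consulted only after a sublist miss.
    if fp in master:
        return ("pointer", master[fp])
    # No entry anywhere: write the block to the storage array.
    array.append(block)
    master[fp] = len(array) - 1  # assumption: new blocks are registered in the master list
    return ("written", master[fp])


# Hypothetical storage array with two existing blocks.
array = [b"A", b"B"]
master = {fingerprint(b"A"): 0, fingerprint(b"B"): 1}
sublist = {fingerprint(b"A"): 0}

r1 = write_block(b"A", sublist, master, array)  # found in the sublist
r2 = write_block(b"B", sublist, master, array)  # sublist miss, found in the master list
r3 = write_block(b"C", sublist, master, array)  # not found, written to the array
```

Keeping the sublist small means most duplicate writes can be resolved at the first component without a round trip to wherever the full master list lives.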
20 Claims
1. A method for de-duplication of data, the method comprising:
creating a master list of metadata for a plurality of data blocks, wherein the master list is ordered according to a number of occurrences of each respective data block of the plurality of data blocks within a storage array;
creating a first sublist of metadata, from the master list of metadata, for a first subset of the plurality of data blocks based on the first subset being duplicated more than a second subset of the plurality of the data blocks;
providing the first sublist of metadata to a first component of a networked storage system;
determining whether a data block being written has a corresponding entry in the master list of metadata based on a determination that the data block being written does not have any corresponding entry in the first sublist of metadata; and
performing an action selected from a group consisting of:
replacing the data block being written with a pointer when it is determined that the data block being written has a corresponding entry in the master list of metadata; and
writing the data block to the storage array when it is determined that the data block being written does not have any corresponding entry in the master list of metadata.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A non-transitory machine readable medium having stored thereon instructions for performing a method comprising machine executable code which when executed by at least one machine, causes the machine to:
provide a first sublist of a master list of metadata to a first component of a networked storage system, wherein the first sublist corresponds to a first subset of a plurality of data blocks stored in a storage array, and wherein the first subset is selected based on being duplicated more than a second subset of the plurality of data blocks;
determine whether a data block being written has any corresponding entry within the master list of metadata based on a determination that the data block being written does not have any corresponding entry within the first sublist; and
perform an action selected from the group consisting of:
replacing the data block being written with a pointer to a corresponding block within the storage array when it is determined that the data block being written has a corresponding entry within the master list of metadata; and
writing the data block to the storage array when it is determined that the data block being written does not have any corresponding entry in the master list of metadata.
Dependent claims: 11, 12, 13, 14
15. A computing device comprising:
a memory containing machine readable medium comprising machine executable code having stored thereon instructions for performing a method of data de-duplication;
a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to:
create a master list of metadata for a plurality of data blocks stored within a storage array;
create a first sublist of metadata from the master list of metadata;
provide the first sublist of metadata to a first component for use by a de-duplication agent running thereupon;
create a second sublist of metadata from the master list of metadata that is different from the first sublist;
provide the second sublist of metadata to a second component for use by a de-duplication agent running thereupon;
determine whether a data block being written has a corresponding entry in the master list of metadata, wherein the determining is performed based on the de-duplication agent of the first component determining that the data block does not have a corresponding entry in the first sublist of metadata and based on the de-duplication agent of the second component determining that the data block does not have a corresponding entry in the second sublist of metadata; and
perform an action from the group consisting of:
replacing the data block being written with a pointer in response to determining that the data block being written has a corresponding entry in the master list of metadata; and
writing the data block to the storage array in response to determining that the data block being written does not have a corresponding entry in the master list of metadata.
Dependent claims: 16, 17, 18, 19, 20
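Claim 15 differs from claim 1 in giving two components different sublists and falling back to the master list only when both agents report a miss. One possible way to picture this is sketched below. Claim 15 does not say how the two sublists are chosen, so selecting each sublist from the fingerprints most often seen by that component is purely an assumption, as are the component names `host` and `switch` and both function names.

```python
from collections import Counter, defaultdict


def build_sublists(tagged_fingerprints, size):
    """Map each component to the `size` fingerprints it has seen most often.

    tagged_fingerprints: iterable of (component, fingerprint) pairs.
    """
    per_component = defaultdict(Counter)
    for component, fp in tagged_fingerprints:
        per_component[component][fp] += 1
    return {
        component: [fp for fp, _ in counts.most_common(size)]
        for component, counts in per_component.items()
    }


def needs_master_lookup(fp, sublists):
    """True only when no component's agent finds fp in its own sublist."""
    return all(fp not in entries for entries in sublists.values())


# Hypothetical observations: which component saw which block fingerprint.
observed = [
    ("host", "a"), ("host", "a"), ("host", "b"),
    ("switch", "c"), ("switch", "c"), ("switch", "a"),
]
sublists = build_sublists(observed, size=1)
```

With this data the host's sublist holds `a` and the switch's holds `c`, so only a block such as `b`, which is absent from both, would trigger a lookup in the master list.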
Specification