Method and apparatus for efficiently copying distributed data files
First Claim
1. A method for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the method comprising:
- (a) creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses;
(b) adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical block address;
(c) creating a new covolume that is a copy of the source covolume by copying the root node of the volume map tree to the new covolume wherein the new covolume has a different root node, but shares all of the intermediate nodes and the leaf nodes with the source covolume; and
(d) adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume wherein the new covolume shares all of the intermediate nodes and the leaf nodes with the source covolume except for intermediate nodes and leaf nodes corresponding to data which has been changed.
8 Assignments
0 Petitions
Accused Products
Abstract
Metadata that relates logical block addresses of data used by a file system with the physical block addresses at which the data is stored is maintained in nodes that are arranged into a hierarchical volume map tree that extends from a root node to a plurality of leaf nodes. A copy of the volume map tree root node is maintained in a data structure called a delta that represents the data in the covolume and the modifications to that volume map tree copy represent all additions, changes or deletions made to the volume. A copy of the data, or a new covolume, can be made by creating a new delta containing a new copy of the volume map tree root node as it existed at the time the new covolume was created. The deltas are arranged into a tree structure that represents the data in a covolume and whether the data is unchanged from a previous covolume or changed in that covolume. The delta tree structure can be used to determine whether a delta is not shared so that its covolume can be deleted. Deltas may also be arranged in local delta trees that represent changes to a particular logical is location. The local delta tree can be used to quickly locate unused locations in order to free these locations.
12 Citations
28 Claims
-
1. A method for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the method comprising:
-
(a) creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses; (b) adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical block address; (c) creating a new covolume that is a copy of the source covolume by copying the root node of the volume map tree to the new covolume wherein the new covolume has a different root node, but shares all of the intermediate nodes and the leaf nodes with the source covolume; and (d) adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume wherein the new covolume shares all of the intermediate nodes and the leaf nodes with the source covolume except for intermediate nodes and leaf nodes corresponding to data which has been changed. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. Apparatus for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the apparatus comprising:
-
a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses; a mechanism that adds a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical block address; a mechanism for creating a new covolume that is a copy of the source covolume by copying the root node of the volume map tree to the new covolume wherein the new covolume has a different root node, but shares all of the intermediate nodes and the leaf nodes with the source covolume; and a mechanism that adds a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume wherein the new covolume shares all of the intermediate nodes and the leaf nodes with the source covolume except for intermediate and leaf nodes corresponding to data which has been changed. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. A computer program product for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the computer program product comprising a computer usable medium having computer readable program code thereon embedded in memory, including:
-
program code for creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses; program code for adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical block address; program code for creating a new covolume that is a copy of the source covolume by copying the root node of the volume map tree to the new covolume wherein the new covolume has a different root node, but shares all of the intermediate nodes and the leaf nodes with the source covolume; and program code for adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume wherein the new covolume shares all of the intermediate nodes and the leaf nodes with the source covolume except for intermediate nodes and leaf nodes corresponding to data which has been changed. - View Dependent Claims (22, 23, 24, 25, 26, 27, 28)
-
Specification