Method and apparatus for efficiently copying distributed data files
First Claim
1. A method for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the method comprising:
- (a) creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses;
(b) adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical bock address;
(c) creating a new covolume from the source covolume by copying the root node of the volume map tree to the new covolume; and
(d) adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume.
8 Assignments
0 Petitions
Accused Products
Abstract
Metadata that relates logical block addresses of data used by a file system with the physical block addresses at which the data is stored is maintained in nodes that are arranged into a hierarchical volume map tree that extends from a root node to a plurality of leaf nodes. A copy of the volume map tree root node is maintained in a data structure called a delta that represents the data in the covolume and the modifications to that volume map tree copy represent all additions, changes or deletions made to the volume. A copy of the data, or a new covolume, can be made by creating a new delta containing a new copy of the volume map tree root node as it existed at the time the new covolume was created. The deltas are arranged into a tree structure that represents the data in a covolume and whether the data is unchanged from a previous covolume or changed in that covolume. The delta tree structure can be used to determine whether a delta is not shared so that its covolume can be deleted. Deltas may also be arranged in local delta trees that represent changes to a particular logical is location. The local delta tree can be used to quickly locate unused locations in order to free these locations.
75 Citations
29 Claims
-
1. A method for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the method comprising:
-
(a) creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses;
(b) adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical bock address;
(c) creating a new covolume from the source covolume by copying the root node of the volume map tree to the new covolume; and
(d) adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. Apparatus for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the apparatus comprising:
-
a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses;
a mechanism that adds a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical bock address;
a mechanism for creating a new covolume from the source covolume by copying the root node of the volume map tree to the new covolume; and
a mechanism that adds a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. A computer program product for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the computer program product comprising a computer usable medium having computer readable program code thereon, including:
-
program code for creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses;
program code for adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical bock address;
program code for creating a new covolume from the source covolume by copying the root node of the volume map tree to the new covolume; and
program code for adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume. - View Dependent Claims (22, 23, 24, 25, 26, 27, 28)
-
-
29. A computer data signal embodied in a carrier wave for efficiently copying a distributed data file consisting of a source covolume having a plurality of data blocks, each data block being located at a physical block address in a storage system, the computer data signal comprising:
-
program code for creating a hierarchical volume map tree having a root node corresponding to the source covolume, a plurality of intermediate nodes and a plurality of leaf nodes, each leaf node corresponding to one of the plurality of data blocks, the volume map tree mapping logical block addresses to physical block addresses;
program code for adding a path to the tree when a write operation in the source covolume writes data with a logical block address to a physical bock address;
program code for creating a new covolume from the source covolume by copying the root node of the volume map tree to the new covolume; and
program code for adding a new path to the tree when a write operation in the new covolume writes data at a logical block address already written by a write operation in the source covolume.
-
Specification