Metadata management for fixed content distributed data storage
Abstract
An archival storage cluster of preferably symmetric nodes includes a metadata management system that organizes and provides access to given metadata, preferably in the form of metadata objects. Each metadata object may have a unique name, and metadata objects are organized into regions. Preferably, a region is selected by hashing one or more object attributes (e.g., the object's name) and extracting a given number of bits of the resulting hash value. The number of bits may be controlled by a configuration parameter. Each region is stored redundantly. A region comprises a set of region copies. In particular, there is one authoritative copy of the region, and zero or more backup copies. The number of backup copies may be controlled by a configuration parameter. Region copies are distributed across the nodes of the cluster so as to balance the number of authoritative region copies per node, as well as the number of total region copies per node. Backup region copies are maintained synchronized to their associated authoritative region copy.
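The region selection described in the abstract can be illustrated with a minimal Python sketch. The function name `region_for`, the choice of SHA-256 as the hash, and the use of the low-order bits are all assumptions made for illustration; the abstract only says that an attribute is hashed and a configurable number of bits is extracted.

```python
import hashlib


def region_for(name: str, region_bits: int) -> int:
    """Return the region number for a metadata object name.

    Assumptions (not fixed by the source): SHA-256 is the hash and the
    low-order bits of the digest are the extracted bits.
    """
    if region_bits < 0:
        raise ValueError("region_bits must be non-negative")
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    # Keep only the lowest `region_bits` bits: 2**region_bits regions in total.
    return value & ((1 << region_bits) - 1)


if __name__ == "__main__":
    # With 6 bits there are 2**6 = 64 possible regions.
    for name in ("/archive/a.txt", "/archive/b.txt"):
        print(name, "->", region_for(name, 6))
```

Because the region number depends only on the name and the configured bit count, any node can compute it locally without consulting another node.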
20 Claims
1. In a redundant array of independent nodes networked together, wherein each node executes an instance of an application that provides object-based storage, a metadata management method, comprising:
storing metadata objects in a set of regions distributed across the array, wherein a given region is identified by hashing a metadata object attribute and extracting a given set of bits of a resulting hash value;
generating a map that, for each region, identifies a node that stores an authoritative copy of the region and thereby is responsible for receiving and responding to update requests directed to the region, and that further identifies zero or more nodes that each store a backup copy of the region and thereby are capable of acting as backup region copies to the authoritative region copy;
distributing the map across the array of independent nodes so that each node has an identical, global view of where the metadata objects are stored;
upon receiving a request to update a given metadata object, using the map to identify the authoritative region copy for the metadata object;
processing the update request;
if synchronization between the authoritative region copy and at least one of its associated backup region copies cannot be maintained as the update is processed, issuing a new map.
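As a rough illustration of the map recited in claim 1, the following Python sketch builds a map that names one authoritative node and zero or more backup nodes per region, then uses it to route an update. The class and function names (`RegionEntry`, `RegionMap`, `generate_map`, `authoritative_node`) and the round-robin placement policy are hypothetical; the claim does not prescribe how copies are placed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RegionEntry:
    authoritative: str        # node holding the authoritative region copy
    backups: Tuple[str, ...]  # zero or more nodes holding backup copies


@dataclass(frozen=True)
class RegionMap:
    version: int
    entries: Dict[int, RegionEntry]  # region number -> entry


def generate_map(nodes: List[str], num_regions: int, num_backups: int,
                 version: int = 1) -> RegionMap:
    """Build a map using round-robin placement (an assumed policy)."""
    if num_backups + 1 > len(nodes):
        raise ValueError("need a distinct node for every copy of a region")
    entries = {}
    for region in range(num_regions):
        # Consecutive nodes hold the copies, so no node holds two copies
        # of the same region.
        owners = [nodes[(region + k) % len(nodes)]
                  for k in range(num_backups + 1)]
        entries[region] = RegionEntry(owners[0], tuple(owners[1:]))
    return RegionMap(version, entries)


def authoritative_node(region_map: RegionMap, region: int) -> str:
    """Look up the node that must receive updates for the given region."""
    return region_map.entries[region].authoritative
```

Round-robin placement is used here only because it spreads authoritative copies evenly when the region count is a multiple of the node count; any policy that yields the same kind of map would fit the sketch.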
4. The method as described in claim 1 wherein synchronization cannot be maintained because of a failure associated with one of: a backup region copy or the authoritative region copy.
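The two failure cases in claim 4 can be sketched as follows. This is a minimal illustration, not the patented procedure: the function `reissue_map` is hypothetical, and promoting the first surviving backup to authoritative is an assumption, since the claims shown here only state that a new map is issued.

```python
from typing import Dict, List, Set, Tuple

# region number -> (authoritative node, list of backup nodes)
Entries = Dict[int, Tuple[str, List[str]]]


def reissue_map(version: int, entries: Entries,
                failed: Set[str]) -> Tuple[int, Entries]:
    """Return the map to use after the nodes in `failed` stop responding.

    If no region copy lives on a failed node, the current map is returned
    unchanged. Otherwise a new map with the next version number is issued.
    """
    new_entries: Entries = {}
    changed = False
    for region, (auth, backups) in entries.items():
        # Case 1: a backup copy failed, so drop it from the region.
        live_backups = [b for b in backups if b not in failed]
        # Case 2: the authoritative copy failed, so promote a backup
        # (assumed recovery policy).
        if auth in failed:
            if not live_backups:
                raise RuntimeError(f"region {region} has no surviving copy")
            auth, live_backups = live_backups[0], live_backups[1:]
        if auth != entries[region][0] or live_backups != backups:
            changed = True
        new_entries[region] = (auth, live_backups)
    if not changed:
        return version, entries
    return version + 1, new_entries
```

For example, with regions `{0: ("n0", ["n1"]), 1: ("n1", ["n2"])}` and node `n1` failed, region 0 loses its backup and region 1 promotes `n2`, and the version number advances by one.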
14. A method, operative in a redundant array of independent nodes networked together, wherein metadata objects are stored in a set of regions distributed across the array, and wherein a given region is identified by hashing a metadata object attribute and extracting a given set of bits of a resulting hash value, comprising:
generating a map that, for each region, identifies a node that stores an authoritative copy of the region and thereby is responsible for receiving and responding to requests directed to the region, and that further identifies zero or more nodes that each store a backup copy of the region and thereby are capable of acting as backup region copies to the authoritative region copy;
distributing the map across the array of independent nodes so that each node has an identical, global view of where the metadata objects are stored, wherein, upon its distribution, the map has associated therewith a guarantee that a given authoritative region copy and its associated zero or more backup region copies are deemed to be synchronized; and
issuing a new map and distributing the new map across the array if the guarantee cannot continue to be assumed.
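One way to picture the distribution step of claim 14 is a version-stamped map that each node adopts only when it is newer than the one it holds. The sketch below assumes that scheme; the `Node` class, the integer version number, and the helper names are hypothetical and are not taken from the claim.

```python
class Node:
    """A cluster node holding its local copy of the region map."""

    def __init__(self, name):
        self.name = name
        self.map_version = 0
        self.region_map = {}  # region -> (authoritative, tuple of backups)

    def receive_map(self, version, region_map):
        """Adopt the map only if it is newer than the one currently held."""
        if version > self.map_version:
            self.map_version = version
            self.region_map = dict(region_map)
            return True
        return False


def distribute(nodes, version, region_map):
    """Send the same map to every node in the array."""
    for node in nodes:
        node.receive_map(version, region_map)


def has_global_view(nodes):
    """True when every node holds the same version and the same entries."""
    views = {(n.map_version, tuple(sorted(n.region_map.items())))
             for n in nodes}
    return len(views) == 1
```

In this sketch, issuing a new map when the synchronization guarantee can no longer be assumed amounts to calling `distribute` again with a higher version number; stale maps that arrive late are ignored by `receive_map`.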
20. In a redundant array of independent nodes networked together, wherein metadata objects are stored in a set of regions distributed across the array, and wherein a given region is identified by hashing a metadata object attribute and extracting a given set of bits of a resulting hash value, a node, comprising:
a virtual machine;
a database;
a map stored in the database, wherein, for each region, the map identifies a node that stores an authoritative copy of the region and thereby is responsible for receiving and responding to update requests directed to the region, and that further identifies zero or more nodes that each store a backup copy of the region and thereby are capable of acting as backup region copies to the authoritative region copy;
one or more region manager processes executable in the virtual machine, wherein a given region copy is associated with its own region manager process;
a metadata manager process for generating and managing the one or more region manager processes; and
a client process for receiving and responding to an update request.
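The relationship among the components of claim 20 can be sketched in Python as plain objects. This is only a structural illustration under stated assumptions: real region managers would be processes running in a virtual machine with the map held in a database, whereas here the map is an in-memory dictionary and the names `RegionManager`, `MetadataManager`, and `handle_update` are hypothetical.

```python
class RegionManager:
    """Manages a single region copy (one manager per copy, as in the claim)."""

    def __init__(self, region, role):
        self.region = region
        self.role = role    # "authoritative" or "backup"
        self.objects = {}   # metadata objects held by this copy


class MetadataManager:
    """Creates one region manager for each region copy assigned to a node."""

    def __init__(self, node_name, region_map):
        # region_map: region -> (authoritative node, tuple of backup nodes)
        self.node_name = node_name
        self.managers = {}
        for region, (auth, backups) in region_map.items():
            if auth == node_name:
                self.managers[region] = RegionManager(region, "authoritative")
            elif node_name in backups:
                self.managers[region] = RegionManager(region, "backup")


def handle_update(manager, region, name, metadata):
    """Client-side entry point: apply the update only on the authoritative copy.

    Returns False when this node does not hold the authoritative copy, in
    which case the caller would forward the request using the map.
    """
    region_manager = manager.managers.get(region)
    if region_manager is None or region_manager.role != "authoritative":
        return False
    region_manager.objects[name] = metadata
    return True
```

Replicating the update to the backup copies is left out of the sketch; the claim covers only the components of the node, not the replication protocol.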
Specification