Cloud storage system with distributed metadata
First Claim
1. A computer implemented method for avoiding single metadata server bottlenecks on processing cloud storage system (CSS) object metadata, the computer executing the following operations:
dividing a metadata storage for a cloud storage system between Object metadata that encodes information on an object and versions of the object and Chunk metadata that tracks the locations of chunks;
identifying a reference to each chunk in the object metadata using a globally unique permanent chunk identifier which is never re-used to identify a different payload;
maintaining the CSS object metadata in a shared global name space, wherein the CSS object metadata is distributed over a plurality of object metadata servers (OMDS);
wherein one or more Enhanced Chunk Servers (ECS) or one or more Chunk Metadata Servers (CMDS) encode a chunk as a file named based upon a chunk class of storage and the permanent chunk identifier, wherein data of the file holds a compressed payload for a respective chunk as data, and the metadata of the file encodes the identifiers of zero or more other ECSs known to hold the payload for the respective chunk;
wherein one of the ECSs will acknowledge each valid put of a chunk with a chunk cookie that encodes the permanent chunk identifier for the chunk, the length of the chunk data after compression and context information supplied by a Cloud Storage Access Module (CSAM), usable by an OMDS to validate a commit of the object; and
wherein an ECS that receives a request to put a chunk that is identified with a deferred chunk identifier will:
validate that a checksum encoded in the deferred chunk identifier is valid;
determine the permanent chunk identifier for the chunk payload;
store the chunk under the permanent chunk identifier, responsive to determining that the permanent chunk identifier is not already present; and
provide a chunk cookie as an acknowledgement.
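The put handling recited in the final limitations of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the deferred identifier format ("deferred:" followed by a CRC-32 in hex), the use of SHA-256 for the permanent chunk identifier, zlib compression, the in-memory dictionary standing in for files, and all class and method names are assumptions made only for illustration.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkCookie:
    permanent_id: str       # permanent chunk identifier
    compressed_length: int  # length of the chunk data after compression
    context: str            # context information supplied by the CSAM


class EnhancedChunkServer:
    """Toy in-memory ECS; a dict stands in for the server's file system."""

    def __init__(self):
        self.files = {}  # file name -> compressed payload

    def put_deferred(self, deferred_id, payload, storage_class, context):
        # Hypothetical deferred ID format: "deferred:<crc32 of payload in hex>".
        prefix, _, checksum = deferred_id.partition(":")
        if prefix != "deferred" or checksum != format(zlib.crc32(payload), "08x"):
            raise ValueError("checksum in deferred chunk identifier is not valid")

        # Assumption: the permanent ID is a SHA-256 digest of the payload.
        permanent_id = hashlib.sha256(payload).hexdigest()

        # The file name is based on the class of storage and the permanent ID.
        name = f"{storage_class}/{permanent_id}"
        if name not in self.files:
            # Store only when the permanent ID is not already present.
            self.files[name] = zlib.compress(payload)

        # The cookie acknowledges the put whether or not a new file was written.
        return ChunkCookie(permanent_id, len(self.files[name]), context)
```

In this sketch a second put of the same payload returns an equal cookie without storing a second copy, which is one way the "not already present" condition could support deduplication.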
Abstract
A method and system are disclosed for providing a cloud storage system supporting existing APIs and protocols. The method of storing cloud storage system (CSS) object metadata separates object metadata, which describes each CSS object as a collection of named chunks, from chunk locations, which are specified as a separate part of the metadata. Chunks are identified using globally unique permanent identifiers that are never re-used to identify a different chunk payload. While avoiding the bottleneck of a single metadata server, the disclosed system provides ordering guarantees to clients, such as guaranteeing access to the most recent version of an object. The disclosed system also provides end-to-end data integrity protection, inline data deduplication, configurable replication, hierarchical storage management, and location-aware optimization of chunk storage.
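The separation of object metadata from chunk locations described in the abstract can be pictured with a small sketch. The data layout, the function names, and in particular the choice of hashing the object name to pick an object metadata server are assumptions for illustration; the text above states only that object metadata is distributed over several servers, not how.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ObjectMetadata:
    name: str
    # Each version is a list of permanent chunk identifiers.
    versions: list = field(default_factory=list)

    def latest(self):
        # Most recent version of the object.
        return self.versions[-1]


def omds_for(name, omds_count):
    # Assumption: objects are spread over OMDSs by hashing the object name.
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % omds_count


# Chunk metadata is kept separately: chunk ID -> servers holding the payload.
chunk_locations = {}


def record_location(chunk_id, server):
    chunk_locations.setdefault(chunk_id, set()).add(server)
```

Because object metadata refers to chunks only by identifier, chunk replicas can be added or moved in this sketch by updating `chunk_locations` alone, without touching any object record.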
18 Claims
Specification