Method and system for uploading data into a distributed storage system
First Claim
1. A computer-implemented method for uploading an object into a distributed storage system, wherein the distributed storage system includes a plurality of chunk stores, comprising:
at a computing device, distinct from the distributed storage system, having one or more processors and memory storing programs executed by the one or more processors, wherein the computing device is connected to the distributed storage system through a network;
splitting an object into one or more chunks, wherein the one or more chunks have a predefined sequence and a respective chunk has a chunk ID, a chunk offset, and a chunk size;
uploading the one or more chunks into the distributed storage system;
for a respective uploaded chunk, receiving a write token from the distributed storage system after the respective uploaded chunk has been stored into the distributed storage system, wherein the write token identifies a respective chunk store that stores the uploaded chunk and includes the chunk ID, the chunk offset, and chunk size of the uploaded chunk;
inserting an entry into an extents table of the object for the uploaded chunk in accordance with the write token received from the distributed storage system, the write token indicating the respective chunk store and including the chunk ID, chunk offset, and chunk size of the uploaded chunk, wherein the extents table is stored at the computing device;
generating a digest of the extents table, wherein the digest represents the one or more chunks that a client expects to be within the distributed storage system, the digest indicating a total number of client-expected chunks; and
sending the digest of the extents table to the distributed storage system, wherein the distributed storage system uses the digest to determine whether it has each of the one or more client-expected chunks, including by comparing the total number of client-expected chunks with a number of chunks of the object stored in the distributed storage system, wherein the computing device re-uploads any missing chunks of the object into the distributed storage system.
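The client-side steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the names `WriteToken`, `split_object`, `build_digest`, and `fake_upload` are hypothetical, the content-hash chunk ID is an assumption (the claim does not say how chunk IDs are derived), and the digest layout (a count plus a list of chunk IDs) is only one possible way to indicate the total number of client-expected chunks.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteToken:
    """Hypothetical write token: names the chunk store and echoes chunk metadata."""
    chunk_store: str
    chunk_id: str
    offset: int
    size: int


def split_object(data: bytes, chunk_size: int):
    """Yield (chunk_id, offset, payload) in the object's predefined sequence."""
    for offset in range(0, len(data), chunk_size):
        payload = data[offset:offset + chunk_size]
        # Assumption: the chunk ID is a content hash of the payload.
        yield hashlib.sha256(payload).hexdigest(), offset, payload


def fake_upload(chunk_id: str, offset: int, payload: bytes) -> WriteToken:
    """Stand-in for the network upload; a real system would return the
    token only after the chunk has been stored in a chunk store."""
    return WriteToken("store-0", chunk_id, offset, len(payload))


def build_digest(extents: list[WriteToken]) -> dict:
    """Summarize the extents table: expected chunk count and expected chunk IDs."""
    return {
        "chunk_count": len(extents),
        "chunk_ids": [token.chunk_id for token in extents],
    }


# The extents table is kept on the client, one entry per received write token.
extents = [fake_upload(cid, off, body)
           for cid, off, body in split_object(b"abcdefghij", 4)]
digest = build_digest(extents)
```

In this sketch the 10-byte object is split into three chunks at offsets 0, 4, and 8, so the digest reports three client-expected chunks. A client would then send `digest` to the storage system and re-upload any chunk the system reports as missing.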
Abstract
A method for uploading an object into a distributed storage system is implemented at a computing device. The computing device splits an object into one or more chunks and uploads the one or more chunks into the distributed storage system. For each uploaded chunk, the computing device receives a write token from the distributed storage system, inserts an entry into an extents table of the object for the chunk in accordance with the received write token and the chunk ID, chunk offset, and chunk size of the chunk, generates a digest of the extents table, the digest representing the one or more chunks that the client expects to be within the distributed storage system, and sends the digest of the extents table to the distributed storage system. The distributed storage system is configured to use the digest to determine whether it has each of the one or more client-expected chunks.
22 Claims
11. A computer-implemented method for storing an object within a plurality of chunk stores of a distributed storage system, comprising:
at a computing device having one or more processors and memory storing programs executed by the one or more processors, wherein the computing device is connected to the distributed storage system through a network;
receiving from a client a request to store an object having one or more chunks, wherein the one or more chunks have a predefined sequence and a respective chunk has a chunk ID, a chunk offset, and a chunk size;
for a respective received chunk, identifying a respective chunk store in accordance with a load balance of the distributed storage system;
storing the received chunk within the respective chunk store; and
returning a write token for the received chunk to the client after the received chunk has been stored into the distributed storage system, wherein the client inserts an entry into an extents table of the object for the received chunk in accordance with the write token, the write token indicating the respective chunk store and including the chunk ID, chunk offset, and chunk size of the received chunk;
receiving a digest of the extents table of the object from the client, wherein the digest represents the one or more chunks that the client expects to be within the distributed storage system, the digest indicating a total number of client-expected chunks; and
determining whether the distributed storage system has the one or more client-expected chunks in accordance with the received digest, including by comparing the total number of client-expected chunks with a number of chunks of the object stored in the distributed storage system, wherein the client re-uploads any missing chunks of the object into the distributed storage system.
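The server-side steps of claim 11 can be sketched in the same spirit. Everything here is illustrative: `ChunkStore`, `StorageServer`, `store_chunk`, and `verify_digest` are hypothetical names, "load balance" is assumed to mean picking the chunk store that currently holds the fewest chunks, and the digest is assumed to carry a chunk count and a list of chunk IDs. The claim itself does not fix any of these details.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkStore:
    name: str
    chunks: dict = field(default_factory=dict)  # chunk_id -> payload


class StorageServer:
    def __init__(self, store_names):
        self.stores = [ChunkStore(name) for name in store_names]
        self.objects = {}  # object_id -> set of stored chunk IDs

    def store_chunk(self, object_id, chunk_id, offset, payload):
        # Assumption: balance load by choosing the store with the fewest chunks.
        store = min(self.stores, key=lambda s: len(s.chunks))
        store.chunks[chunk_id] = payload
        self.objects.setdefault(object_id, set()).add(chunk_id)
        # The write token is built only after the chunk has been stored.
        return {"chunk_store": store.name, "chunk_id": chunk_id,
                "offset": offset, "size": len(payload)}

    def verify_digest(self, object_id, digest):
        """Compare the client-expected chunk count with the stored count and
        list the expected chunk IDs that the server does not hold."""
        stored = self.objects.get(object_id, set())
        counts_match = digest["chunk_count"] == len(stored)
        missing = [cid for cid in digest["chunk_ids"] if cid not in stored]
        return counts_match, missing


server = StorageServer(["store-a", "store-b"])
t1 = server.store_chunk("obj", "c0", 0, b"abcd")
t2 = server.store_chunk("obj", "c1", 4, b"efgh")
# The client expects three chunks, but only two reached the server.
counts_match, missing = server.verify_digest(
    "obj", {"chunk_count": 3, "chunk_ids": ["c0", "c1", "c2"]})
```

Here the count comparison fails (three expected, two stored) and `missing` names chunk `c2`, which is the chunk the client would re-upload.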
21. A computer system for managing a distributed storage system, comprising:
one or more processors;
memory; and
one or more programs stored in the memory for execution by the one or more processors, the one or more programs comprising instructions to perform:
receiving from a client a request to store an object having one or more chunks, wherein the one or more chunks have a predefined sequence and a respective chunk has a chunk ID, a chunk offset, and a chunk size;
for a respective received chunk, identifying a respective chunk store in accordance with a load balance of the distributed storage system;
storing the received chunk within the respective chunk store; and
returning a write token for the received chunk to the client after the received chunk has been stored into the distributed storage system, wherein the client inserts an entry into an extents table of the object for the received chunk in accordance with the write token, the write token indicating the respective chunk store and including the chunk ID, chunk offset, and chunk size of the received chunk;
receiving a digest of the extents table of the object from the client, wherein the digest represents the one or more chunks that the client expects to be within the distributed storage system, the digest indicating a total number of client-expected chunks; and
determining whether the distributed storage system has the one or more client-expected chunks in accordance with the received digest, including by comparing the total number of client-expected chunks with a number of chunks of the object stored in the distributed storage system, wherein the client re-uploads any missing chunks of the object into the distributed storage system.
22. A non-transitory computer readable storage medium storing one or more programs configured for execution by a server computer system having one or more processors and memory storing one or more programs for execution by the one or more processors, the one or more programs comprising instructions to:
receive from a client a request to store an object having one or more chunks, wherein the one or more chunks have a predefined sequence and a respective chunk has a chunk ID, a chunk offset, and a chunk size;
for a respective received chunk, identify a respective chunk store in accordance with a load balance of the distributed storage system;
store the received chunk within the respective chunk store; and
return a write token for the received chunk to the client after the received chunk has been stored into the distributed storage system, wherein the client inserts an entry into an extents table of the object for the received chunk in accordance with the write token, the write token indicating the respective chunk store and including the chunk ID, chunk offset, and chunk size of the received chunk;
receive a digest of the extents table of the object from the client, wherein the digest represents the one or more chunks that the client expects to be within the distributed storage system, the digest indicating a total number of client-expected chunks; and
determine whether the distributed storage system has the one or more client-expected chunks in accordance with the received digest, including by comparing the total number of client-expected chunks with a number of chunks of the object stored in the distributed storage system, wherein the client re-uploads any missing chunks of the object into the distributed storage system.
Specification