Distributed data storage system providing de-duplication of data using block identifiers
Abstract
An access request including a client address for data is received. A metadata server determines a mapping between the client address and storage unit identifiers for the data. Each of the one or more storage unit identifiers uniquely identifies content of a storage unit and the metadata server stores mappings on storage unit identifiers that are referenced by client addresses. The one or more storage unit identifiers are sent to one or more block servers. The one or more block servers service the request using the one or more storage unit identifiers where the one or more block servers store information on where a storage unit is stored on a block server for a storage unit identifier. Also, multiple client addresses associated with a storage unit with a same storage unit identifier are mapped to a single storage unit stored in a storage medium for a block server.
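The flow described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the class and function names are hypothetical, and deriving the storage unit identifier from a SHA-256 hash of the content is an assumption (the abstract only says the identifier uniquely identifies the content).

```python
import hashlib


def storage_unit_id(data: bytes) -> str:
    # Assumption: a content hash serves as the identifier that
    # "uniquely identifies content of a storage unit".
    return hashlib.sha256(data).hexdigest()


class MetadataServer:
    """Maps client addresses to storage unit identifiers (hypothetical)."""

    def __init__(self):
        self.mapping = {}  # client address -> storage unit identifier


class BlockServer:
    """Records where each storage unit lives on its medium (hypothetical)."""

    def __init__(self):
        self.location = {}  # storage unit identifier -> index in medium
        self.medium = []    # stand-in for the storage medium

    def put(self, unit_id: str, data: bytes) -> None:
        # Store the unit only if this identifier has not been seen before.
        if unit_id not in self.location:
            self.location[unit_id] = len(self.medium)
            self.medium.append(data)

    def get(self, unit_id: str) -> bytes:
        return self.medium[self.location[unit_id]]


def write(meta: MetadataServer, blocks: BlockServer, address: int, data: bytes) -> None:
    unit_id = storage_unit_id(data)
    meta.mapping[address] = unit_id  # metadata server records the mapping
    blocks.put(unit_id, data)        # block server stores at most one copy


def read(meta: MetadataServer, blocks: BlockServer, address: int) -> bytes:
    # Resolve the client address to an identifier, then ask the block server.
    return blocks.get(meta.mapping[address])
```

Writing the same bytes to two different client addresses leaves a single stored unit on the block server, while both addresses still read back the same data.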
25 Claims
1. A method for accessing data, the data including one or more storage units, the method comprising:
receiving an access request for data, the access request including a client address for the data;
determining, by a metadata server, a mapping between the client address and one or more storage unit identifiers for the data, each of the one or more storage unit identifiers uniquely identifying content of a storage unit, wherein the metadata server stores mappings on storage unit identifiers that are referenced by client addresses, and wherein the metadata server comprises a plurality of slice servers, wherein each slice server manages a range of mappings between client addresses and storage unit identifiers for the metadata server; and
sending the one or more storage unit identifiers to one or more block servers, the one or more block servers servicing the request using the one or more storage unit identifiers, wherein the one or more block servers store information on where a storage unit is stored on a block server for a storage unit identifier, wherein multiple client addresses associated with a storage unit with a same storage unit identifier are mapped to a single storage unit stored in a storage medium for a block server.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

15. A non-transitory computer-readable storage medium containing instructions for accessing data, the data including one or more storage units, the instructions for controlling a computer system to be operable to:
receive an access request for data, the access request including a client address for the data;
determine, by a metadata server, a mapping between the client address and one or more storage unit identifiers for the data, each of the one or more storage unit identifiers uniquely identifying content of a storage unit, wherein the metadata server stores mappings on storage unit identifiers that are referenced by client addresses, and wherein the metadata server comprises a plurality of slice servers, wherein each slice server manages a range of mappings between client addresses and storage unit identifiers for the metadata server; and
send the one or more storage unit identifiers to one or more block servers, the one or more block servers servicing the request using the one or more storage unit identifiers, wherein the one or more block servers store information on where a storage unit is stored on a block server for a storage unit identifier, wherein multiple client addresses associated with a storage unit with a same storage unit identifier are mapped to a single storage unit stored in a storage medium for a block server.
Dependent claims: 16, 17, 18, 19

20. A system comprising:
a metadata server comprising a plurality of slice servers, wherein the metadata server is configured to:
store mappings on storage unit identifiers that are referenced by client addresses, wherein each slice server manages a range of mappings between client addresses and storage unit identifiers for the metadata server;
receive an access request for data, the access request including a client address for the data; and
compute one or more storage unit identifiers for the data, each of the one or more storage unit identifiers uniquely identifying content of a storage unit; and
a plurality of block servers wherein block servers are associated with different ranges of storage unit identifiers and store information on where a storage unit is stored on a block server for a storage unit identifier, wherein each block server is configured to:
receive a storage unit in the one or more storage units if a storage unit identifier corresponds to the range of storage unit identifiers associated with the block server; and
service the request using the received storage unit identifier, wherein multiple client addresses associated with a storage unit with a same storage unit identifier are mapped to a single storage unit stored in a storage medium in a block server.
Dependent claims: 21, 22, 23, 24, 25

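Claim 20 associates each block server with a different range of storage unit identifiers. One way to picture that routing is sketched below. Everything here is an assumption made for illustration: the 32-bit identifier space, the equal-width contiguous ranges, the truncated SHA-256 identifier, and all function and class names are hypothetical and are not taken from the claims.

```python
import hashlib

ID_SPACE = 2**32  # assumed size of the identifier space


def storage_unit_id(data: bytes) -> int:
    # Assumption: first 4 bytes of a SHA-256 digest, read as an integer.
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def owner_for(identifier: int, num_servers: int, space: int = ID_SPACE) -> int:
    """Index of the server whose contiguous, equal-width range covers identifier."""
    return identifier * num_servers // space


class BlockServerPool:
    """Routes each storage unit to the block server owning its identifier range."""

    def __init__(self, num_servers: int):
        # One dict per block server: identifier -> stored unit.
        self.servers = [{} for _ in range(num_servers)]

    def put(self, data: bytes) -> int:
        unit_id = storage_unit_id(data)
        server = self.servers[owner_for(unit_id, len(self.servers))]
        server.setdefault(unit_id, data)  # keep a single copy per identifier
        return unit_id

    def get(self, unit_id: int) -> bytes:
        return self.servers[owner_for(unit_id, len(self.servers))][unit_id]
```

The same `owner_for` helper could, under the same assumptions, assign ranges of client addresses to slice servers by passing the size of the client address space as `space`.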
Specification