Global data deduplication across multiple distributed file systems
First Claim
1. A method for deduplicating data across multiple distributed file systems, the method comprising:
providing an object version manager that is accessible by a first distributed file system associated with a first metadata server (“MDS”) and that is accessible by a second distributed file system associated with a second MDS, wherein the first distributed file system is separate from the second distributed file system;
transmitting a write request from a client to the first MDS, wherein the write request comprises an object identifier associated with a data object;
receiving an object store location for an object store from the first MDS;
transmitting a metadata request to the object store using the object store location prior to transmitting the data object, wherein the metadata request includes the object identifier and wherein the object store is separate from the first MDS;
receiving a metadata response from the object store to the metadata request;
determining the metadata response contains an object designator; and
incrementing a count associated with a mapping between the object identifier and the object designator in the object version manager shared with the second MDS included in the second distributed file system;
tracking how many instances of the data object are stored in the first distributed file system and in the second distributed file system using the object version manager and the object designator; and
maintaining the count in the object version manager and apart from the first distributed file system and the second distributed file system; and
globally deduplicating data objects common to the first distributed file system and the second distributed file system using the object version manager.
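The shared-counter idea in the claim can be illustrated with a minimal sketch. It assumes the object version manager is a single in-memory service holding one count per (object identifier, object designator) mapping, and that each metadata server simply holds a reference to it. The class names, method names, and identifiers such as "obj-123" and "fs-a" are hypothetical and do not come from the patent.

```python
import threading


class ObjectVersionManager:
    """Hypothetical shared service kept apart from every file system."""

    def __init__(self):
        # (object identifier, object designator) -> number of stored instances
        self._counts = {}
        self._lock = threading.Lock()

    def increment(self, object_id, designator):
        """Increment the count for this mapping and return the new value."""
        with self._lock:
            key = (object_id, designator)
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]

    def instances(self, object_id, designator):
        """Return how many instances are tracked across all file systems."""
        with self._lock:
            return self._counts.get((object_id, designator), 0)


class MetadataServer:
    """Stand-in for an MDS; each distributed file system has its own."""

    def __init__(self, name, ovm):
        self.name = name
        self.ovm = ovm  # reference to the shared manager, not a private copy

    def record_instance(self, object_id, designator):
        return self.ovm.increment(object_id, designator)


# Two separate file systems share one object version manager.
ovm = ObjectVersionManager()
mds_a = MetadataServer("fs-a", ovm)
mds_b = MetadataServer("fs-b", ovm)

mds_a.record_instance("obj-123", "v1")
mds_b.record_instance("obj-123", "v1")
total = ovm.instances("obj-123", "v1")
```

Because the count lives in the manager rather than in either metadata server, both file systems see the same total for the same object in this sketch.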
Abstract
A write request is transmitted from a client to a metadata server (“MDS”), wherein the write request comprises an object identifier associated with a data object. An object store location is received for an object store from the MDS. A metadata request is transmitted to the object store using the object store location, wherein the metadata request includes the object identifier. A metadata response is received from the object store. The metadata response is determined to contain an object designator. A count associated with a mapping between the object identifier and the object designator is incremented, wherein the mapping resides on an object version manager shared with a second MDS.
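The sequence in the abstract can be sketched as a client-side write path. This is only an illustration under stated assumptions: the classes, the "store-0" location, the designator format, and the dictionary standing in for the shared object version manager are all hypothetical. The branch that uploads the data when no designator is returned, and the initial count of 1, are also assumptions, since the abstract only describes the case where a designator is found.

```python
class ObjectStore:
    """Hypothetical object store, separate from the MDS."""

    def __init__(self):
        self._designators = {}   # object identifier -> object designator
        self.bytes_received = 0  # payload bytes actually transferred

    def metadata_request(self, object_id):
        # The response carries a designator only if the object is already stored.
        return {"designator": self._designators.get(object_id)}

    def put(self, object_id, data):
        self.bytes_received += len(data)
        designator = f"d-{len(self._designators) + 1}"
        self._designators[object_id] = designator
        return designator


class MetadataServer:
    """Hypothetical MDS that answers a write request with a store location."""

    def write_request(self, object_id):
        return "store-0"  # placement policy is out of scope for this sketch


def client_write(object_id, data, mds, stores, ovm_counts):
    # 1. Send the write request (with the object identifier) to the MDS.
    location = mds.write_request(object_id)
    store = stores[location]
    # 2. Probe the object store with a metadata request before sending data.
    response = store.metadata_request(object_id)
    designator = response["designator"]
    key_exists = designator is not None
    if not key_exists:
        # Assumed first-write path: upload the data and obtain a designator.
        designator = store.put(object_id, data)
    # 3. Increment the count for the identifier/designator mapping.
    key = (object_id, designator)
    ovm_counts[key] = ovm_counts.get(key, 0) + 1
    return "deduplicated" if key_exists else "stored"


stores = {"store-0": ObjectStore()}
mds = MetadataServer()
ovm_counts = {}  # stands in for the shared object version manager

first = client_write("obj-1", b"hello", mds, stores, ovm_counts)
second = client_write("obj-1", b"hello", mds, stores, ovm_counts)
```

In this sketch the second write transfers no payload bytes: the metadata response already contains a designator, so only the count changes.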
6 Claims
1. A method for deduplicating data across multiple distributed file systems, the method comprising:
providing an object version manager that is accessible by a first distributed file system associated with a first metadata server (“MDS”) and that is accessible by a second distributed file system associated with a second MDS, wherein the first distributed file system is separate from the second distributed file system;
transmitting a write request from a client to the first MDS, wherein the write request comprises an object identifier associated with a data object;
receiving an object store location for an object store from the first MDS;
transmitting a metadata request to the object store using the object store location prior to transmitting the data object, wherein the metadata request includes the object identifier and wherein the object store is separate from the first MDS;
receiving a metadata response from the object store to the metadata request;
determining the metadata response contains an object designator; and
incrementing a count associated with a mapping between the object identifier and the object designator in the object version manager shared with the second MDS included in the second distributed file system;
tracking how many instances of the data object are stored in the first distributed file system and in the second distributed file system using the object version manager and the object designator; and
maintaining the count in the object version manager and apart from the first distributed file system and the second distributed file system; and
globally deduplicating data objects common to the first distributed file system and the second distributed file system using the object version manager.
Dependent claims: 2, 3
4. A non-transitory computer readable storage medium comprising processor instructions for deduplicating data across multiple distributed file systems, the instructions comprising:
providing an object version manager that is accessible by a first distributed file system associated with a first metadata server (“MDS”) and that is accessible by a second distributed file system associated with a second MDS, wherein the first distributed file system is separate from the second distributed file system;
transmitting a write request from a client to the first MDS, wherein the write request comprises an object identifier associated with a data object;
receiving an object store location for an object store from the first MDS;
transmitting a metadata request to the object store using the object store location prior to transmitting the data object, wherein the metadata request includes the object identifier and wherein the object store is separate from the first MDS;
receiving a metadata response from the object store to the metadata request;
determining the metadata response contains an object designator; and
incrementing a count associated with a mapping between the object identifier and the object designator in the object version manager shared with the second MDS included in the second distributed file system;
tracking how many instances of the data object are stored in the first distributed file system and in the second distributed file system using the object version manager and the object designator; and
maintaining the count in the object version manager and apart from the first distributed file system and the second distributed file system; and
globally deduplicating data objects common to the first distributed file system and the second distributed file system using the object version manager.
Dependent claims: 5, 6
Specification