Handling fragmentation of archived data in cloud/object storage

US 10,705,922 B2
Filed: 01/12/2018
Issued: 07/07/2020
Est. Priority Date: 01/12/2018
Status: Active Grant

First Claim

1. A method for handling fragmentation of archived data in cloud/object storage, the method comprising:

uploading, by a computer system, a first snapshot of a data set to the cloud/object storage, the first snapshot including at least one data object comprising a plurality of data blocks; and

uploading, by the computer system, a second snapshot of the data set to the cloud/object storage, wherein the second snapshot is uploaded at least a predefined number of snapshots after the first snapshot, and wherein the uploading of the second snapshot comprises:

determining, by the computer system, that the second snapshot includes a version of the data object included in the first snapshot;

in response to the determining, selecting, by the computer system, a subset of the plurality of data blocks in the data object of the first snapshot, the subset including data blocks in the plurality of data blocks that have remained the same since the first snapshot;

reading, by the computer system, the subset of the plurality of data blocks from the first snapshot in the cloud/object storage;

adding, by the computer system, the subset of the plurality of data blocks to the version of the data object in the second snapshot; and

uploading, by the computer system, the second snapshot with the subset of the plurality of data blocks to the cloud/object storage.
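
The steps recited in the claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `ObjectStore`, `upload_second_snapshot`, and `MIN_SNAPSHOT_GAP` are hypothetical names, the cloud/object storage is replaced by an in-memory dictionary, and the `modified` argument is assumed to list every block of the data object that has been overwritten since the first snapshot.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the "predefined number of snapshots" in the claim.
MIN_SNAPSHOT_GAP = 3


@dataclass
class ObjectStore:
    """In-memory stand-in for cloud/object storage.

    Maps (snapshot_id, object_name) to {block_number: block_bytes}.
    """
    objects: dict = field(default_factory=dict)

    def put(self, snapshot_id, object_name, blocks):
        self.objects[(snapshot_id, object_name)] = dict(blocks)

    def get_blocks(self, snapshot_id, object_name, block_numbers):
        stored = self.objects[(snapshot_id, object_name)]
        return {n: stored[n] for n in block_numbers}


def upload_second_snapshot(store, first_id, second_id, object_name, modified):
    """Upload one data object of the second snapshot.

    Returns the block numbers carried forward from the first snapshot.
    """
    new_version = dict(modified)
    old_enough = second_id - first_id >= MIN_SNAPSHOT_GAP
    # Determining step: the first snapshot holds a version of this object.
    has_version = (first_id, object_name) in store.objects
    carried = []
    if old_enough and has_version:
        first_blocks = store.objects[(first_id, object_name)]
        # Selecting step: blocks that have remained the same since the first snapshot.
        carried = [n for n in sorted(first_blocks) if n not in modified]
        # Reading and adding steps: fetch those blocks and merge them in.
        new_version.update(store.get_blocks(first_id, object_name, carried))
    # Final uploading step.
    store.put(second_id, object_name, new_version)
    return carried
```

In this sketch, a second snapshot taken fewer than `MIN_SNAPSHOT_GAP` snapshots after the first uploads only its modified blocks, so the extra read and write cost is paid only for data that has aged.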

Abstract

Techniques for handling fragmentation of archived data in cloud/object storage are provided. In one set of embodiments, a computer system can upload a new snapshot of a data set to the cloud/object storage, where the new snapshot comprises a plurality of data blocks, and where the new snapshot is uploaded as one or more data objects and one or more metadata objects. For each data block in the plurality of data blocks, the computer system can identify an existing data object in the cloud/object storage where the data block is currently stored. The computer system can further select, from among the identified existing data objects, a subset of the existing data objects that are part of a snapshot created in the cloud/object storage at least a predefined number of snapshots before the new snapshot, and select one or more data blocks of one or more data objects in the subset that have not been overwritten by another snapshot. The computer system can then upload the one or more data blocks as part of the new snapshot.
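
The selection described in the abstract can be illustrated with a short sketch. It assumes a locally held map from each data block to the existing data object that currently stores it, plus a map from each data object to the snapshot that uploaded it. The function name `pick_stale_objects` and both map layouts are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def pick_stale_objects(block_to_object, object_snapshot, new_snapshot_id, min_gap):
    """Group live blocks by the old data object that still stores them.

    block_to_object: {block_number: object_name}, the object where each block
        is currently stored (so every listed block is not overwritten).
    object_snapshot: {object_name: snapshot_id} of the snapshot that uploaded it.
    Returns {object_name: [block_number, ...]} for objects uploaded at least
    min_gap snapshots before new_snapshot_id.
    """
    stale = defaultdict(list)
    for block_number, object_name in sorted(block_to_object.items()):
        age = new_snapshot_id - object_snapshot[object_name]
        if age >= min_gap:
            stale[object_name].append(block_number)
    return dict(stale)
```

For example, with blocks 0 and 2 still stored in an object from snapshot 1 and a new snapshot numbered 6, a `min_gap` of 3 would return those two blocks as candidates to upload again as part of the new snapshot.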

33 Citations

21 Claims

1. A method for handling fragmentation of archived data in cloud/object storage, the method comprising:
- uploading, by a computer system, a first snapshot of a data set to the cloud/object storage, the first snapshot including at least one data object comprising a plurality of data blocks; and
  
  uploading, by the computer system, a second snapshot of the data set to the cloud/object storage, wherein the second snapshot is uploaded at least a predefined number of snapshots after the first snapshot, and wherein the uploading of the second snapshot comprises:
  
  determining, by the computer system, that the second snapshot includes a version of the data object included in the first snapshot;
  
  in response to the determining, selecting, by the computer system, a subset of the plurality of data blocks in the data object of the first snapshot, the subset including data blocks in the plurality of data blocks that have remained the same since the first snapshot;
  
  reading, by the computer system, the subset of the plurality of data blocks from the first snapshot in the cloud/object storage;
  
  adding, by the computer system, the subset of the plurality of data blocks to the version of the data object in the second snapshot; and
  
  uploading, by the computer system, the second snapshot with the subset of the plurality of data blocks to the cloud/object storage.
- Dependent Claims (2, 3, 4, 5, 6, 7)
  - 2. The method of claim 1 wherein the determining is performed by accessing a running point view that is maintained locally on the computer system and that indicates, for each data block in the plurality of data blocks, an existing data object where the data block is currently stored.
  - 3. The method of claim 1 wherein the selecting the subset of the plurality of data blocks comprises:
    - examining a bitmap associated with the data object, wherein the bitmap indicates, for each data block in the plurality of data blocks, whether the data block has been overwritten by any snapshot other than the first snapshot in the cloud/object storage.
  - 4. The method of claim 3 wherein the data object in the first snapshot has a larger number of overwritten data blocks than non-overwritten data blocks.
  - 5. The method of claim 4 wherein prior to the adding, the version of the data object in the second snapshot includes a second subset of the plurality of data blocks that have been modified since the first snapshot.
  - 6. The method of claim 1 wherein a total size of the subset of the plurality of data blocks is constrained by a user-defined parameter.
  - 7. The method of claim 6 wherein the user-defined parameter is expressed as a percentage of a total size of the second snapshot.
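
The bitmap check of claim 3 and the size limit of claims 6 and 7 can be combined in a short sketch. It is an illustration only: the function name `select_blocks`, the list-of-booleans bitmap layout, and the fixed `block_size` are assumptions, and treating the claim 4 condition (more overwritten than non-overwritten blocks) as a filter is one possible reading rather than a requirement stated in the claims.

```python
def select_blocks(overwritten_bitmap, block_size, snapshot_size, max_percent):
    """Pick non-overwritten block indices from one data object, within a budget.

    overwritten_bitmap: list of bools, True where the block has been
        overwritten by a snapshot other than the one that uploaded the object.
    block_size, snapshot_size: sizes in bytes.
    max_percent: user-defined cap, as a whole percentage of snapshot_size.
    """
    overwritten = sum(overwritten_bitmap)
    live = len(overwritten_bitmap) - overwritten
    # Assumed filter: only consider objects that are mostly overwritten.
    if overwritten <= live:
        return []
    # Integer arithmetic avoids floating-point rounding in the budget.
    budget_bytes = snapshot_size * max_percent // 100
    budget_blocks = budget_bytes // block_size
    live_indices = [i for i, bit in enumerate(overwritten_bitmap) if not bit]
    return live_indices[:budget_blocks]
```

With a budget smaller than the number of live blocks, the sketch simply takes the lowest-numbered blocks first; a real system could prioritize differently.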

8. A non-transitory computer readable storage medium having stored thereon program code executable by a computer system, the program code embodying a method for handling fragmentation of archived data in cloud/object storage, the method comprising:
- uploading a first snapshot of a data set to the cloud/object storage, the first snapshot including at least one data object comprising a plurality of data blocks; and
  
  uploading a second snapshot of the data set to the cloud/object storage, wherein the second snapshot is uploaded at least a predefined number of snapshots after the first snapshot, and wherein the uploading of the second snapshot comprises:
  
  determining that the second snapshot includes a version of the data object included in the first snapshot;
  
  in response to the determining, selecting a subset of the plurality of data blocks in the data object of the first snapshot, the subset including data blocks in the plurality of data blocks that have remained the same since the first snapshot;
  
  reading the subset of the plurality of data blocks from the first snapshot in the cloud/object storage;
  
  adding the subset of the plurality of data blocks to the version of the data object in the second snapshot; and
  
  uploading the second snapshot with the subset of the plurality of data blocks to the cloud/object storage.
- Dependent Claims (9, 10, 11, 12, 13, 14)
  - 9. The non-transitory computer readable storage medium of claim 8 wherein the determining is performed by accessing a running point view that is maintained locally on the computer system and that indicates, for each data block in the plurality of data blocks, an existing data object where the data block is currently stored.
  - 10. The non-transitory computer readable storage medium of claim 8 wherein the selecting the subset of the plurality of data blocks comprises:
    - examining a bitmap associated with the data object, wherein the bitmap indicates, for each data block in the plurality of data objects, whether the data block has been overwritten by any snapshot other than the first snapshot in the cloud/object storage.
  - 11. The non-transitory computer readable storage medium of claim 10 wherein the data object in the first snapshot has a larger number of overwritten data blocks than non-overwritten data blocks.
  - 12. The non-transitory computer readable storage medium of claim 11 wherein prior to the adding, the version of the data object in the second snapshot includes a second subset of the plurality of data blocks that have been modified since the first snapshot.
  - 13. The non-transitory computer readable storage medium of claim 8 wherein a total size of the subset of the plurality of data blocks is constrained by a user-defined parameter.
  - 14. The non-transitory computer readable storage medium of claim 13 wherein the user-defined parameter is expressed as a percentage of a total size of the second snapshot.

15. A computer system comprising:
- a processor; and
  
  a non-transitory computer readable medium having stored thereon program code for handling fragmentation of archived data in cloud/object storage, the program code causing the processor to:
  
  upload a first snapshot of a data set to the cloud/object storage, the first snapshot including at least one data object comprising a plurality of data blocks; and
  
  upload a second snapshot of the data set to the cloud/object storage, wherein the second snapshot is uploaded at least a predefined number of snapshots after the first snapshot, and wherein the uploading of the second snapshot comprises:
  
  determine that the second snapshot includes a version of the data object included in the first snapshot;
  
  in response to the determining, select a subset of the plurality of data blocks in the data object of the first snapshot, the subset including data blocks in the plurality of data blocks that have remained the same since the first snapshot;
  
  read the subset of the plurality of data blocks from the first snapshot in the cloud/object storage;
  
  add the subset of the plurality of data blocks to the version of the data object in the second snapshot; and
  
  upload the second snapshot with the subset of the plurality of data blocks to the cloud/object storage.
- Dependent Claims (16, 17, 18, 19, 20, 21)
  - 16. The computer system of claim 15 wherein the determining is performed by accessing a running point view that is maintained locally on the computer system and that indicates, for each data block in the plurality of data blocks, an existing data object where the data block is currently stored.
  - 17. The computer system of claim 15 wherein the selecting the subset of the plurality of data blocks comprises:
    - examining a bitmap associated with the data object, wherein the bitmap indicates, for each data block in the plurality of data blocks, whether the data block has been overwritten by any snapshot other than the first snapshot in the cloud/object storage.
  - 18. The computer system of claim 17 wherein the data object in the first snapshot has a larger number of overwritten data blocks than non-overwritten data blocks.
  - 19. The computer system of claim 18 wherein prior to the adding, the version of the data object in the second snapshot includes a second subset of the plurality of data blocks that have been modified since the first snapshot.
  - 20. The computer system of claim 15 wherein a total size of the subset of the plurality of data blocks is constrained by a user-defined parameter.
  - 21. The computer system of claim 20 wherein the user-defined parameter is expressed as a percentage of a total size of the second snapshot.

Current Assignee
VMware LLC (Broadcom, Inc.)
Original Assignee
VMware, Inc. (Broadcom, Inc.)
Inventors
Kashi Visvanathan, Satish Kumar; Sarda, Pooja; Langouev, Ilya; Kandambakkam, Arun
Primary Examiner(s)
Woo, Isaac M

Application Number
US15/870,740
Publication Number
US 20190220367A1
Time in Patent Office
907 Days
Field of Search
707600-899
US Class Current
CPC Class Codes

G06F 11/1448   Management of the data invo...

G06F 11/1464   for networked environments

G06F 11/1469   Backup restoration techniques

G06F 11/2094   Redundant storage or storag...

G06F 16/2219   Large Object storage; Manag...

G06F 16/275   Synchronous replication

G06F 2201/82   Solving problems relating t...

G06F 2201/84   Using snapshots, i.e. a log...

H04L 67/1097   for distributed storage of ...
