Managing data storage in a set of storage systems using usage counters

  • US 9,830,101 B2
  • Filed: 08/07/2014
  • Issued: 11/28/2017
  • Est. Priority Date: 09/11/2013
  • Status: Active Grant
First Claim

1. A computer implemented method for data access in a storage infrastructure, the storage infrastructure comprising a host system connected to at least a first storage system and a second storage system, the storage infrastructure further comprising a de-duplication module maintaining a data structure comprising one or more entries, each entry of the one or more entries comprising a hash value, a data location, an identifier, a first usage count and a second usage count for a data chunk, wherein the first usage count and the second usage count are associated with the first storage system and the second storage system, respectively, the first storage system and the second storage system comprising a first reference table and a second reference table, respectively, the method comprising:

  • receiving, by the first storage system from the host system, a write request for storing the data chunk, wherein the write request is indicative of a first identifier of the data chunk;

    calculating, by the first storage system, a hash value of the data chunk using a hash function;

    determining, by the first storage system, a first storage location for the data chunk in the first storage system;

    sending, by the first storage system, a write message including the hash value, the first identifier and the first storage location to the de-duplication module;

    determining, by the de-duplication module, whether the hash value exists in the data structure;

    responsive to the hash value existing in the data structure, incrementing, by the de-duplication module, the first usage count of the data chunk;

    responsive to the hash value failing to exist in the data structure, adding, by the de-duplication module, an entry to the data structure comprising the hash value, the first storage location, the first identifier, the first usage count set to one and the second usage count set to zero;

    receiving, by the first storage system, a response message from the de-duplication module, wherein:

    responsive to the hash value existing in the data structure, the response message comprising a second storage location, a second identifier, the first usage count and the second usage count associated with the hash value and wherein:

    responsive to a determination that the first usage count is higher than a predetermined maximum usage value, storing, by the first storage system, the data chunk in the first storage location, thereby duplicating the data chunk and adding an entry in the first reference table including the first identifier and the first storage location, and

    responsive to a determination that the first usage count fails to be higher than the predetermined maximum usage value, adding, by the first storage system, an entry in the first reference table with the first identifier, the second storage location and the second identifier; and

    responsive to the hash value failing to exist in the data structure, the response message comprises instructions for storing the data chunk in the first storage location and storing, by the first storage system, the data chunk in the first storage location and adding an entry in the first reference table including the first identifier and the first storage location.
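
The write path recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names (`DedupModule`, `FirstStorageSystem`, `Entry`), the SHA-256 hash function, the `s1:N` location format and the threshold value are all assumptions chosen for illustration; the claim itself names no hash function and no concrete threshold. Only the first storage system is modeled, so the "second storage location" returned by the de-duplication module is simply whatever location the existing entry already holds.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    """One entry of the de-duplication data structure (fields follow claim 1)."""
    hash_value: str
    location: str
    identifier: str
    first_count: int   # usage count associated with the first storage system
    second_count: int  # usage count associated with the second storage system


class DedupModule:
    """Hypothetical de-duplication module keyed by chunk hash."""

    def __init__(self):
        self.entries = {}  # hash value -> Entry

    def handle_write(self, hash_value, identifier, location):
        """Process a write message from the first storage system.

        Returns the existing Entry (after incrementing its first usage
        count) if the hash is known, or None to signal "store the chunk".
        """
        entry = self.entries.get(hash_value)
        if entry is None:
            # Hash not found: add an entry with counts set to one and zero.
            self.entries[hash_value] = Entry(hash_value, location, identifier, 1, 0)
            return None
        entry.first_count += 1
        return entry


class FirstStorageSystem:
    """Hypothetical first storage system with its own reference table."""

    def __init__(self, dedup, max_usage=2):
        self.dedup = dedup
        self.max_usage = max_usage   # assumed "predetermined maximum usage value"
        self.blocks = {}             # storage location -> chunk bytes
        self.reference_table = {}    # identifier -> (location, other identifier or None)
        self._next_slot = 0

    def write(self, identifier, chunk):
        hash_value = hashlib.sha256(chunk).hexdigest()  # assumed hash function
        location = f"s1:{self._next_slot}"              # assumed location format
        self._next_slot += 1

        response = self.dedup.handle_write(hash_value, identifier, location)

        if response is None or response.first_count > self.max_usage:
            # New chunk, or usage count above the maximum: store (duplicate) locally.
            self.blocks[location] = chunk
            self.reference_table[identifier] = (location, None)
            return "stored"

        # Known chunk within the limit: record a reference instead of storing.
        self.reference_table[identifier] = (response.location, response.identifier)
        return "referenced"
```

With `max_usage=2`, writing the same chunk three times under different identifiers stores it, then references it, then stores a duplicate once the first usage count exceeds the threshold.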
