Hybrid drive caching in a backup system with SSD deletion management
First Claim
1. A data storage system for performing secondary copy operations, the system comprising:
a client computing device residing in a primary storage subsystem and having at least one software application installed thereon, the client computing device associated with and in networked communication with a storage system located in the primary storage subsystem, the storage system comprising a hard disk and a solid-state drive (SSD), the storage system configured to store primary data generated by the software application;
a storage manager implemented in a computing device and configured to instruct the client computing device to perform tasks associated with one or more data backup operations in which the primary data is copied to one or more secondary storage devices residing in a secondary storage subsystem to create one or more secondary copies having a different format than a native format of the primary data;
a storage driver that controls SSD cache operations as part of performing storage operations which are part of the data backup operations, the storage driver implemented in a hardware processor residing within the primary storage subsystem and configured to:
read a first data element from the hard disk;
store a first indication in memory that the first data element is to be cached in the SSD without actually caching the first data element in the SSD;
write the first data element to a buffer maintained in the memory;
read a second data element from the hard disk;
store a second indication in the memory that the second data element is to be cached in the SSD;
write the second data element to the buffer;
subsequent to storage of the second indication in the memory, determine that the buffer has reached capacity; and
in response to determining that the buffer has reached capacity:
determine whether the SSD is at capacity;
in response to determining that the SSD is at capacity, access a data structure in the memory, the data structure including entries corresponding to storage locations on the SSD that store cached data elements, wherein each respective entry in the data structure is usable to locate in the memory first information indicative of how recently accessed was the cached data element stored in the storage location to which the respective entry corresponds;
consult a plurality of entries in the data structure to locate the first information associated with a plurality of the cached data elements, to identify one or more of the plurality of data elements as candidates to discard;
discard one or more of the candidates from the SSD;
subsequent to discarding the one or more candidates, write the first and second data elements from the buffer to the SSD; and
update the data structure in the memory to include entries corresponding to the first and second data elements.
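The sequence recited in the claim (stage reads in a memory buffer, note that they are to be cached, and write to the SSD only when the buffer fills, discarding the least recently accessed entries first if the SSD is full) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the class `BufferedSsdCache`, its methods, and the use of a Python dict as a stand-in for the SSD. The sketch also simplifies the "SSD is at capacity" test to "the buffered batch does not fit".

```python
from collections import OrderedDict


class BufferedSsdCache:
    """Toy model of the claimed flow; the 'SSD' is a dict, not a real device."""

    def __init__(self, buffer_capacity, ssd_capacity):
        # Assumption: one buffered batch always fits on an empty SSD.
        assert buffer_capacity <= ssd_capacity
        self.buffer_capacity = buffer_capacity
        self.ssd_capacity = ssd_capacity
        self.buffer = {}            # in-memory staging buffer: key -> data
        self.pending = set()        # indications that a key is to be cached
        self.ssd = {}               # stand-in for SSD storage locations
        self.index = OrderedDict()  # in-memory entries, least recent first

    def read_from_disk(self, key, data):
        """Record a hard-disk read and stage the element for later caching."""
        self.pending.add(key)       # indication stored; the SSD is not touched
        self.buffer[key] = data
        if len(self.buffer) >= self.buffer_capacity:
            self._flush()

    def touch(self, key):
        """Mark an already cached element as recently accessed."""
        if key in self.index:
            self.index.move_to_end(key)

    def _flush(self):
        """Make room if needed, then write the whole buffer in one batch."""
        needed = sum(1 for key in self.buffer if key not in self.ssd)
        # Discard least recently accessed elements until the batch fits.
        while self.index and len(self.ssd) + needed > self.ssd_capacity:
            victim, _ = self.index.popitem(last=False)
            del self.ssd[victim]
        for key, data in self.buffer.items():
            self.ssd[key] = data
            self.index[key] = True
            self.index.move_to_end(key)
        self.pending.difference_update(self.buffer)
        self.buffer.clear()
```

With a two-element buffer and a three-element SSD, reading `a` and `b` triggers the first batch write. If `a` is then touched and `c` and `d` are read, the second flush discards `b` (the least recently accessed entry) before writing `c` and `d`.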
Abstract
Systems and methods can implement one or more intelligent caching algorithms that reduce wear on the SSD and/or improve caching performance. Such algorithms can improve storage utilization and I/O efficiency by taking into account the write-wearing limitations of the SSD. Accordingly, the systems and methods can cache to the SSD while avoiding overly frequent writes, in an effort to extend the lifespan of the SSD. The systems and methods may, for instance, write data to the SSD only after that data has been read from the hard disk or memory multiple times, to avoid writing data that has been read only once. The systems and methods may also write large chunks of data to the SSD at once instead of a single unit of data at a time. Further, the systems and methods can write to the SSD in a circular fashion.
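Two of the techniques named in the abstract, admitting data to the SSD only after repeated reads and writing to the SSD in a circular fashion, can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the names `ReadCountAdmission` and `CircularSlots`, the default threshold of two reads, and the slot-numbering scheme are all hypothetical.

```python
from collections import Counter


class ReadCountAdmission:
    """Admit a key to the cache only once it has been read enough times."""

    def __init__(self, threshold=2):   # threshold value is an assumption
        self.threshold = threshold
        self.reads = Counter()

    def should_cache(self, key):
        self.reads[key] += 1
        return self.reads[key] >= self.threshold


class CircularSlots:
    """Hand out SSD slot numbers in order, wrapping around at the end."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.next_slot = 0

    def allocate(self, count):
        # Consecutive slots, so a large chunk lands as one contiguous run
        # until the wrap-around point.
        slots = [(self.next_slot + i) % self.num_slots for i in range(count)]
        self.next_slot = (self.next_slot + count) % self.num_slots
        return slots
```

With a threshold of two, the first read of a key is not cached and the second is. On a four-slot device, two successive three-slot allocations return `[0, 1, 2]` and then `[3, 0, 1]`, so writes rotate across all slots instead of repeatedly hitting the same ones.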
10 Claims
6. A method of performing data backup operations, comprising:
with a storage manager implemented in a computing device, instructing a client computing device residing in a primary storage subsystem to perform tasks associated with one or more data backup operations in which primary data generated by at least one software application executing on the client computing device is copied to one or more secondary storage devices residing in a secondary storage subsystem to create one or more secondary copies having a different format than a native format of the primary data, the client computing device associated with and in networked communication with a storage system located in the primary storage subsystem, the storage system comprising a hard disk and a solid-state drive (SSD) operating as a cache for the hard disk, the storage system configured to store primary data generated by the software application;
with a storage driver implemented in a hardware processor and that controls SSD cache operations as part of performing storage operations which are part of the data backup operations:
reading a first data element from the hard disk;
storing a first indication in memory that the first data element is to be cached in the SSD without actually caching the first data element in the SSD;
writing the first data element to a buffer maintained in the memory;
reading a second data element from the hard disk;
storing a second indication in the memory that the second data element is to be cached in the SSD;
writing the second data element to the buffer;
subsequent to storage of the second indication in the memory, determining that the buffer has reached capacity; and
in response to determining that the buffer has reached capacity:
determining whether the SSD is at capacity;
in response to determining that the SSD is at capacity, accessing a data structure in the memory, the data structure including entries corresponding to storage locations on the SSD that store cached data elements, wherein each respective entry in the data structure is usable to locate in the memory information indicative of how recently accessed was the cached data element stored in the storage location to which the respective entry corresponds;
consulting a plurality of entries in the data structure to locate the first information associated with a plurality of the cached data elements, to identify one or more of the plurality of data elements as candidates to discard;
discarding one or more of the candidates from the SSD;
subsequent to said discarding the one or more candidates, writing the first and second data elements from the buffer to the SSD; and
updating the data structure in the memory to include entries corresponding to the first and second data elements.
Dependent claims: 7, 8, 9, 10
Specification