Capacity forecasting for a deduplicating storage system
Abstract
A system for managing a storage system comprises a processor and a memory. The processor is configured to receive storage system information from a deduplicating storage system. The processor is further configured to determine a capacity forecast based at least in part on the storage system information. The processor is further configured to provide the capacity forecast. The memory is coupled to the processor and configured to provide the processor with instructions.
17 Claims
1. A system for managing a storage system, comprising:
- a processor configured to:
receive storage system information from a deduplicating storage system, wherein the deduplicating storage system:
breaks each of at least a subset of files in an input data stream into a plurality of segments;
determines, for each of at least a subset of the plurality of segments, whether the segment has been stored previously; and
stores a segment in the event that the segment is determined to have not been previously stored, wherein at least a subset of stored segments is used to reconstruct more than one of the subset of files in the input data stream;
determine a capacity forecast for the deduplicating storage system based at least in part on the storage system information, wherein the storage system information comprises a cumulative compressed size, wherein the cumulative compressed size comprises a physical used space after deduplicating; and
provide the capacity forecast for the deduplicating storage system, wherein providing the capacity forecast is based at least in part on a capacity forecast model validation criteria, wherein the capacity forecast model validation criteria comprise one or more of:
a threshold R squared value, a threshold number of standard deviations of data points from an expected value of a set of data points, and a threshold slope; and
- a memory coupled to the processor and configured to provide the processor with instructions.
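The deduplicating behavior recited in the claim (break files into segments, check whether each segment was stored previously, store only new segments, and reuse stored segments across files) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `DedupStore`, the fixed 4-byte segment size, and the use of SHA-256 fingerprints are hypothetical and are not taken from the patent. Local compression is not modeled, so `physical_used()` only approximates the "physical used space after deduplicating".

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: fixed-size segments keyed by a SHA-256 fingerprint."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size  # hypothetical; chosen small for the example
        self.segments = {}                # fingerprint -> segment bytes, stored once
        self.recipes = {}                 # file name -> ordered list of fingerprints
        self.logical_bytes = 0            # bytes written before deduplication

    def write(self, name, data):
        recipe = []
        for i in range(0, len(data), self.segment_size):
            seg = data[i:i + self.segment_size]
            fp = hashlib.sha256(seg).hexdigest()
            if fp not in self.segments:   # store only segments not seen before
                self.segments[fp] = seg
            recipe.append(fp)
        self.recipes[name] = recipe
        self.logical_bytes += len(data)

    def read(self, name):
        # Reconstruct a file from its stored segments.
        return b"".join(self.segments[fp] for fp in self.recipes[name])

    def physical_used(self):
        # Space occupied by unique segments (compression not modeled).
        return sum(len(s) for s in self.segments.values())


store = DedupStore(segment_size=4)
store.write("a.txt", b"AAAABBBBCCCC")
store.write("b.txt", b"AAAABBBBDDDD")  # shares two segments with a.txt
```

In this example the two files share the segments `AAAA` and `BBBB`, so the physical used space is smaller than the logical bytes written. The physical figure is the kind of quantity a forecaster would track over time.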
16. A method for managing a storage system, comprising:
- receiving storage system information from a deduplicating storage system, wherein the deduplicating storage system:
breaks each of at least a subset of files in an input data stream into a plurality of segments;
determines, for each of at least a subset of the plurality of segments, whether the segment has been stored previously; and
stores a segment in the event that the segment is determined to have not been previously stored, wherein at least a subset of stored segments is used to reconstruct more than one of the subset of files in the input data stream;
- determining a capacity forecast for the deduplicating storage system based at least in part on the storage system information, wherein the storage system information comprises a cumulative compressed size, wherein the cumulative compressed size comprises a physical used space after deduplicating; and
- providing the capacity forecast for the deduplicating storage system, wherein providing a capacity forecast is based at least in part on a capacity forecast model validation criteria, wherein the capacity forecast model validation criteria comprise one or more of:
a threshold R squared value, a threshold number of standard deviations of data points from an expected value of a set of data points, and a threshold slope.
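One way to read the determining and providing steps is as a fitted trend over the physical used space, with the forecast withheld when the fit fails validation. The sketch below assumes a simple least-squares line. The claims do not name a particular model, so the linear fit, the function names `fit_line` and `forecast_full_day`, and the default thresholds are illustrative assumptions only.

```python
def fit_line(days, used):
    """Least-squares fit of used = slope * day + intercept; also returns R squared."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(used) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(days, used))
    ss_tot = sum((y - mean_y) ** 2 for y in used)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


def forecast_full_day(days, used, capacity, min_r_squared=0.9, min_slope=0.0):
    """Return the day the fitted line reaches capacity, or None if validation fails.

    min_r_squared and min_slope are hypothetical threshold values.
    """
    slope, intercept, r_squared = fit_line(days, used)
    if r_squared < min_r_squared:  # threshold R squared: fit is too poor to trust
        return None
    if slope <= min_slope:         # threshold slope: usage is flat or shrinking
        return None
    return (capacity - intercept) / slope


days = [0, 1, 2, 3, 4]
used_gib = [10.0, 12.0, 14.0, 16.0, 18.0]  # physical used space after deduplicating
full_day = forecast_full_day(days, used_gib, capacity=100.0)
noisy = forecast_full_day(days, [10.0, 50.0, 5.0, 40.0, 12.0], capacity=100.0)
```

With the steadily growing series the function returns a projected day on which capacity is reached. With the erratic series it returns `None`, so no forecast is provided.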
17. A computer program product for managing a storage system, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for:
- receiving storage system information from a deduplicating storage system, wherein the deduplicating storage system:
breaks each of at least a subset of files in an input data stream into a plurality of segments;
determines, for each of at least a subset of the plurality of segments, whether the segment has been stored previously; and
stores a segment in the event that the segment is determined to have not been previously stored, wherein at least a subset of stored segments is used to reconstruct more than one of the subset of files in the input data stream;
- determining a capacity forecast for the deduplicating storage system based at least in part on the storage system information, wherein the storage system information comprises a cumulative compressed size, wherein the cumulative compressed size comprises a physical used space after deduplicating; and
- providing the capacity forecast for the deduplicating storage system, wherein providing a capacity forecast is based at least in part on a capacity forecast model validation criteria, wherein the capacity forecast model validation criteria comprise one or more of:
a threshold R squared value, a threshold number of standard deviations of data points from an expected value of a set of data points, and a threshold slope.
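The standard-deviation criterion is worded broadly. One possible reading, assumed here purely for illustration, is that a data point is checked against the mean of a set of data points and rejected when it lies more than a threshold number of sample standard deviations away. The function name `within_k_sigma` and the default `k=3.0` are hypothetical.

```python
import statistics


def within_k_sigma(history, candidate, k=3.0):
    """True if candidate lies within k sample standard deviations of the mean of history."""
    expected = statistics.mean(history)  # expected value of the set of data points
    sigma = statistics.stdev(history)    # sample standard deviation; needs >= 2 points
    if sigma == 0:
        return candidate == expected
    return abs(candidate - expected) <= k * sigma


daily_growth_gib = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8]
ok = within_k_sigma(daily_growth_gib, 2.05)   # close to the typical daily growth
spike = within_k_sigma(daily_growth_gib, 9.0) # far outside the typical range
```

A forecaster using this reading might decline to provide a forecast, or refit the model, when recent points fail the check.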
Specification