Computer storage capacity forecasting system using cluster-based seasonality analysis
Abstract
A methodology for automatic a priori data pattern analysis is provided. The described methods allow consistent and objective determination of outliers, trend, seasonality, and level shifts, and the production of better models and more accurate forecasts. In addition, a two-step method to automatically determine seasonality and locate possible events in the data set is described. Also described are: decomposition of data into seasonal, trend, and level components; detection of outliers and level-shift events in the time series based on statistical analysis of the time series; detection of seasonality based on statistical analysis of clusters of data, known as cluster-based seasonality analysis (CBSA); evaluation of the goodness of fit of a model to data using the existing goodness-of-fit indicator, R²; and seasonality analysis using a sequence of CBSA and Fourier analysis.
20 Claims
1. A system for generating a forecasting model for storage capacity planning, the system comprising:
a computer system comprising computer hardware, the computer system programmed to implement:
a data pattern analysis engine configured to:
receive time series data reflecting computer storage capacity;
identify a trend in the received time series data, using the received data, said identifying the trend comprising performing a transform of the time series data to derive a parameter, using the parameter to calculate a goodness-of-fit for a selected trend, and selecting the trend based at least partly on the calculated goodness-of-fit;
perform cluster-based seasonality analysis to identify seasonality in the received time series data, said performing comprising analyzing intervals between consecutive events to detect the seasonality; and
a forecast engine configured to select a forecast model adapted to predict an amount of additional computer storage capacity to obtain in the future and when to obtain the additional computer storage capacity, the forecast model operative to select the forecast model based at least partly on the identified trend and seasonality.
Dependent claims: 2, 3, 4, 5, 6
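The trend-identification element of claim 1 (transform the data, derive a parameter, score a candidate trend by goodness-of-fit, select on that score) can be illustrated with a minimal sketch. It assumes two candidate trends, linear and exponential, a log transform for the exponential candidate, and R² as the goodness-of-fit measure. The candidate set, the transform, and the names `fit_line`, `r_squared`, and `select_trend` are assumptions made for illustration; the claim does not specify them.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(ys, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def select_trend(ys):
    """Return (name, r2) for the candidate trend with the highest R^2."""
    xs = list(range(len(ys)))
    scores = {}
    # Candidate 1 (assumed): linear trend fitted to the raw values.
    a, b = fit_line(xs, ys)
    scores["linear"] = r_squared(ys, [a + b * x for x in xs])
    # Candidate 2 (assumed): exponential trend. The log transform turns it
    # into a linear fit; it only applies to strictly positive values.
    if all(y > 0 for y in ys):
        la, lb = fit_line(xs, [math.log(y) for y in ys])
        scores["exponential"] = r_squared(ys, [math.exp(la + lb * x) for x in xs])
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Scoring both candidates against the untransformed values keeps the two R² figures comparable; scoring the exponential candidate in log space would be a different, equally defensible choice.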
7. A method for detecting seasonality in time series data, the method comprising:
by a computer system comprising computer hardware:
receiving time series data reflecting computer storage capacity;
identifying a plurality of events in the time series data, the events comprising one or more of outlier points and level shifts;
performing an interpolation of the outlier points by at least substituting a value of each outlier point with a value calculated using linear interpolation;
identifying a trend in the received time series data;
performing cluster-based seasonality analysis to identify a first seasonality in the received time series data, said performing comprising analyzing intervals between consecutive events to detect clusters of data points and analyzing the clusters to identify the first seasonality; and
generating a forecast model based at least in part on the identified trend and first seasonality, the forecast model adapted to predict an amount of additional computer storage capacity to obtain in the future and when to obtain the additional computer storage capacity.
Dependent claims: 8, 9, 10, 11, 12
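The idea of analyzing intervals between consecutive events can be illustrated with a simplified stand-in. The sketch below is not the CBSA procedure itself, which this section does not spell out. It measures the gaps between consecutive event times, groups the gaps that fall close to the most frequent gap, and reports their mean as a candidate period when that group accounts for most of the gaps. The function names and the `tolerance` and `min_share` parameters are hypothetical.

```python
from collections import Counter

def event_intervals(event_times):
    """Gaps between consecutive events, after sorting the event times."""
    ordered = sorted(event_times)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def candidate_period(event_times, tolerance=1, min_share=0.6):
    """Return a candidate seasonal period, or None if no gap length dominates.

    tolerance and min_share are hypothetical tuning parameters.
    """
    gaps = event_intervals(event_times)
    if len(gaps) < 2:
        return None
    # The most frequent gap seeds the cluster of similar gaps.
    seed, _ = Counter(gaps).most_common(1)[0]
    cluster = [g for g in gaps if abs(g - seed) <= tolerance]
    # Require the cluster to cover most gaps before calling it seasonal.
    if len(cluster) / len(gaps) < min_share:
        return None
    return sum(cluster) / len(cluster)

# Events recurring every 7 time steps.
print(candidate_period([3, 10, 17, 24, 31, 38]))  # prints 7.0
```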
13. A method for data forecasting, the method comprising:
by a computer system comprising computer hardware:
receiving time series data reflecting computer storage capacity; and
generating a forecasting model from the time series data, the forecasting model adapted to predict an amount of additional computer storage capacity to obtain in the future and when to obtain the additional computer storage capacity, said generating the forecasting model comprising:
identifying a plurality of events in the time series data, the events comprising outlier points, wherein identifying the outlier points comprises calculating a standard deviation and running mean of the time series data, determining whether any data point is within a number of standard deviations from the running mean, and considering each data point that is greater than the number of standard deviations of the running mean to be an outlier point;
performing an interpolation of the set of outlier points by at least substituting a value of each outlier point with a value calculated using linear interpolation;
identifying a trend in the received time series data, using the received data and the interpolation, said identifying the trend comprising performing a transform of the time series data to derive a parameter, using the parameter to calculate a goodness-of-fit for a selected trend, and selecting the trend based at least partly on the calculated goodness-of-fit;
performing cluster-based seasonality analysis to identify seasonality in the received time series data, said performing comprising analyzing intervals between consecutive events to detect the seasonality; and
creating the forecasting model based at least in part on the identified trend and seasonality.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
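The outlier and interpolation steps recited in claim 13 can be sketched as follows. This is a minimal illustration under stated assumptions: the running mean is taken over up to `window` preceding points, the standard deviation is computed over the whole series, and the threshold is `k` standard deviations. The claim fixes none of these choices, and the function names are hypothetical.

```python
import statistics

def find_outliers(values, window=5, k=2.0):
    """Indices lying more than k standard deviations from the running mean.

    Assumptions: the running mean uses up to `window` preceding points, and
    the standard deviation is taken over the whole series. The first point
    has no history and is never flagged.
    """
    sigma = statistics.pstdev(values)
    outliers = []
    for i in range(1, len(values)):
        history = values[max(0, i - window):i]
        running_mean = sum(history) / len(history)
        if abs(values[i] - running_mean) > k * sigma:
            outliers.append(i)
    return outliers

def interpolate_outliers(values, outliers):
    """Replace each outlier by linear interpolation between its nearest
    non-outlier neighbours; at the edges, copy the nearest good value."""
    flagged = set(outliers)
    good = [i for i in range(len(values)) if i not in flagged]
    cleaned = list(values)
    for i in flagged:
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None and right is None:
            continue  # no good points to interpolate from
        if left is None:
            cleaned[i] = values[right]
        elif right is None:
            cleaned[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            cleaned[i] = values[left] + frac * (values[right] - values[left])
    return cleaned

# Made-up capacity readings with one spike at index 5.
usage = [100, 102, 101, 103, 104, 150, 106, 107, 108, 109, 110, 111]
spikes = find_outliers(usage)                   # [5]
cleaned = interpolate_outliers(usage, spikes)   # cleaned[5] == 105.0
```

Because a spike stays in the history of the points that follow it, a production version would likely exclude flagged points from later running means; the sketch leaves that out for brevity.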
Specification