METHOD AND SYSTEM FOR VALIDATING ENSEMBLE DEMAND FORECASTS
1. A system for assessing performance of demand forecasting models, the system comprising:
a validation tool configured to:
receive validation sets and selections of configuration options; and
send submission packets to a validation server;
a validation server configured to:
query a demand forecast data store;
store validation sets in a data repository;
calibrate a forecasting model with historical training data from a test data repository;
test the forecasting model by calculating predictions for each of a plurality of sets of forecast coordinates within the validation sets;
save the calculated predictions in the test data repository;
calculate forecast validation results;
store validation results in a validation data repository; and
generate a validation user interface including visualizations of forecast performance.
Methods and systems for forecasting demand for a plurality of items are provided. In particular, the demand forecasting system and methods described herein are useful for predicting demand of products in a retail context. Forecast models are built and used to score incoming sales data to predict future demand for items. Forecast models are validated by evaluating actual demand against predicted demand and using that information to inform how future ensemble forecasts will be generated. Forecasts may be broken down into smaller components to satisfy a variety of requests for data from client applications.
- 9. A method of validating and visualizing demand forecast models, the method comprising:
receiving a selection of a demand forecasting model to be validated, a validation set, and configuration options; and querying a demand forecast data store to retrieve a demand forecast corresponding to the selections; calibrating the demand forecasting model with historical training data; testing the forecasting model by calculating predictions for each of a plurality of sets of forecast coordinates within the validation set; calculating forecast validation results; and outputting visualizations of performance of the demand forecasting model for the selected configuration options on a validation user interface.
- 17. A graphical user interface usable to view and analyze results of demand forecast model validation, the user interface comprising:
a plurality of collapsible graphical sections comprising: a filter section configured to receive input to upload a validation, receive indications of tags, receive input of a date range, and receive selections at a drop-down list; a sets section configured to display a summary table including information about filtered validation sets, receive selection of one or two validation sets for inspection, and receive input to provide a visualization; an options section configured to receive selections of portions of the validation sets to inspect in the visualization, receive selections of metrics, and receive a selection of a type of visualization; a plot section configured to display the selected visualization for the selected portions of the validation sets; and a details section configured to display data used in the visualization in a table and receive input to export the table.
The present disclosure relates generally to methods and systems for forecasting demand for items. More specifically, methods and systems are provided for generating demand forecasts for items based on past sales data and past demand forecasts.
Demand forecasting involves predicting future demand for products or services of a business or organization. Demand forecasting produces valuable information for businesses to use in production planning, inventory management, staff scheduling, and supply chain management. It is important to know how much inventory is needed to order and stock at various locations of a retail chain. Demand forecasting information can be useful not only for inventory management, but for scheduling personnel, planning marketing events, and budgetary planning.
Techniques for forecasting demand range from simply estimating demand based on past experience, which may be effective for smaller or more predictable businesses, to calculating demand using a variety of statistical models and algorithms. Such models and algorithms typically rely on past data to predict future demand.
It can be difficult to accurately predict future demand for products, especially when taking into account seasonal changes in demand for particular products. This is further complicated for retailers offering a multitude of products, e.g. millions. There is a need for improved methods of forecasting demand for a large number of products taking into account seasonal changes in demand.
In summary, the present disclosure relates to methods and systems for forecasting item demand in a retail context. Various aspects are described in this disclosure, which include, but are not limited to, the following aspects.
In a first aspect, a system for assessing performance of demand forecasting models is disclosed. The system includes a validation tool configured to receive validation sets and selections of configuration options; and send submission packets to a validation server. The system also includes a validation server configured to: query a demand forecast data store; store validation sets in a data repository; calibrate a forecasting model with historical training data from a test data repository; test the forecasting model by calculating predictions for each of a plurality of sets of forecast coordinates within the validation sets; save the calculated predictions in the test data repository; calculate forecast validation results; store validation results in a validation data repository; and generate a validation user interface including visualizations of forecast performance.
In a second aspect, a method of validating and visualizing demand forecast models is disclosed. The method includes receiving a selection of a demand forecasting model to be validated, a validation set, and configuration options, and querying a demand forecast data store to retrieve a demand forecast corresponding to the selections. The method also includes calibrating the demand forecasting model with historical training data, testing the forecasting model by calculating predictions for each of a plurality of sets of forecast coordinates within the validation set, calculating forecast validation results, and outputting visualizations of performance of the demand forecasting model for the selected configuration options on a validation user interface.
In a further aspect, a graphical user interface usable to view and analyze results of demand forecast model validation is disclosed. The graphical user interface includes a plurality of collapsible graphical sections. These include: a filter section configured to receive input to upload a validation, receive indications of tags, receive input of a date range, and receive selections at a drop-down list; a sets section configured to display a summary table including information about filtered validation sets, receive selection of one or two validation sets for inspection, and receive input to provide a visualization; an options section configured to receive selections of portions of the validation sets to inspect in the visualization, receive selections of metrics, and receive a selection of a type of visualization; a plot section configured to display the selected visualization for the selected portions of the validation sets; and a details section configured to display data used in the visualization in a table and receive input to export the table.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Various embodiments will be described in detail with reference to the drawings, wherein like reference numerals represent like parts and assemblies throughout the several views. Reference to various embodiments does not limit the scope of the claims attached hereto. Additionally, any examples set forth in this specification are not intended to be limiting and merely set forth some of the many possible embodiments for the appended claims.
In general, the present disclosure relates to methods and systems for forecasting demand for a plurality of items. In particular, the demand forecasting system and methods described herein are useful for predicting demand of products in a retail context. An ensemble forecast is compiled using component models that are combined and weighted to produce a consensus forecast for predicting item demand. Past forecasting performance is evaluated to determine which models are best used for particular sets of items. Models having superior performance for predicting item demand are weighted more heavily in the overall consensus forecast. Forecast models are validated by evaluating actual demand vs. predicted demand and using that information to inform how a future ensemble forecast will be generated. The present methods have an improved ability to capture seasonal effects on demand such as holiday sales or back to school sales. The systems and methods are scalable and customizable to different applications of the forecast data. For example, a forecast may be generated on a weekly basis at the chain level for a group of retail stores. However, the forecast can be broken down to an individual store or an individual date.
Referring now to
The mass storage device 214 is connected to the CPU 202 through a mass storage controller (not shown) connected to the system bus 222. The mass storage device 214 and its associated computer-readable storage media provide non-volatile, non-transitory data storage for the computing system 106. Although the description of computer-readable storage media contained herein refers to a mass storage device, such as a hard disk or solid state disk, it should be appreciated by those skilled in the art that computer-readable data storage media can include any available tangible, physical device or article of manufacture from which the CPU 202 can read data and/or instructions. In certain embodiments, the computer-readable storage media comprises entirely non-transitory media.
Computer-readable storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable software instructions, data structures, program modules or other data. Example types of computer-readable data storage media include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROMs, digital versatile discs (“DVDs”), other optical storage media, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computing system 106.
According to various embodiments of the invention, the computing system 106 may operate in a networked environment using logical connections to remote network devices through a network 222, such as a wireless network, the Internet, or another type of network. The computing system 106 may connect to the network 222 through a network interface unit 404 connected to the system bus 422. It should be appreciated that the network interface unit 404 may also be utilized to connect to other types of networks and remote computing systems. The computing system 200 also includes an input/output controller 206 for receiving and processing input from a number of other devices, including a touch user interface display screen, or another type of input device. Similarly, the input/output controller 206 may provide output to a touch user interface display screen or other type of output device.
As mentioned briefly above, the mass storage device 214 and the RAM 210 of the computing system 106 can store software instructions and data. The software instructions include an operating system 418 suitable for controlling the operation of the computing system 106. The mass storage device 214 and/or the RAM 210 also store software instructions that, when executed by the CPU 202, cause the computing system 106 to provide the functionality discussed in this document. For example, the mass storage device 214 and/or the RAM 210 can store software instructions that, when executed by the CPU 202, cause the computing system 106 to receive and analyze inventory and demand data.
At operation 402, the common data preparation engine 302 receives and prepares past sales data and past demand forecasts. The common data preparation engine 302 receives and prepares both past data and incoming current data. The data can include sales activity as well as other data regarding attributes of items, stores, and locations.
The data is processed into a common format for use by the enterprise forecast generator 304. A more detailed view of a schematic of the common data preparation engine 302 is depicted in
At operation 404, a demand forecasting model is built. In some embodiments, forecasts are generated using a single model, such as a recurrent neural network (RNN) model. In other embodiments, an enterprise “consensus” model is built from combining two or more component models. Methods of building forecasting models are further described in
In some embodiments, the forecasting models are built using the Python or R programming languages. The component models are generally selected from time series forecasting models such as recurrent neural network (RNN) models, autoregressive integrated moving average (ARIMA) models, and seasonal-trend decomposition using LOESS (STL). The component models are fed into a meta-forecaster to produce a compounded or consensus forecast. In some embodiments, the component models are weighted using an affine function.
In example embodiments, the past performance of the component models is assessed for accuracy in forecasting. Accuracy is not the only consideration; capturing seasonal effects can also be important. The models having the best performance for predicting demand are weighted more heavily and used in combination to predict the latest demand forecast. The weighted combination provides a more accurate representation of demand than any individual component model. The weighting of the models can change throughout the year based on, e.g., promotions or seasonality.
Returning to operation 406 of
At operation 408, an aggregate demand forecast is generated. The enterprise forecast engine 304 generates demand forecasts in batches by default. In one embodiment, forecasts are generated for each item, across all stores in a retail chain, every week.
At operation 410, the aggregate forecasts are stored in a data store. In the embodiment shown in
At operation 412, a request for a demand forecast is received from a client. The client request is received at a cloud platform 322 including one or more load balancers 324, as shown in
In the embodiment of
In the embodiment shown, each server 330 includes an API (application programming interface) 332, a CDF (cumulative distribution function) 334, and a disaggregation service 336. The API 332 operates to break down requests received from clients and communicate requests for data to a data source such as the forecasts data store 312. The API 332 is accessible by one or more client applications for use in various analyses for the retailer. In some embodiments, the API may be accessed through a user interface by an administrator computing device. An administrator user interacts with the user interface using inputs and outputs of the administrator computing device.
A query is received from a client at the API, e.g., as routed from the load balancer 324. The query includes one or more of an item or set of items, a location, a starting time, and a time period; other parameters could be included as well, to the extent exposed by the API. For example, the query could be for a weekly forecast in September for all stores in Minnesota. Optionally, the API can also receive a selection of a particular model or model collection with which to forecast demand. Once the API receives the client request, the request is broken down to determine which data is needed to satisfy the query. The query service 318 is then invoked and accesses the forecasts data store 312 to retrieve the chain level forecast or “aggregate demand forecast.” The CDF 334 converts the forecast to a distribution. The API 332 then communicates the requested demand forecast back to the client application. In some embodiments, this involves presenting the requested demand forecast data on a user interface.
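The request handling described above can be illustrated with a minimal Python sketch. The class name, field names, and the `needs_disaggregation` helper are hypothetical and are not taken from the disclosure; the sketch only assumes that aggregate forecasts are stored at the chain level on a weekly basis, as described for the default batch forecasts.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class ForecastQuery:
    """Hypothetical shape of a client request; field names are illustrative."""
    item_ids: Tuple[str, ...]      # an item or set of items
    location: str                  # e.g. a store, a state, or "chain"
    start: date                    # starting time of the forecast window
    periods: int                   # window length, in units of `granularity`
    granularity: str = "week"      # "week" or "day"
    model: Optional[str] = None    # optional model or model collection


def needs_disaggregation(query: ForecastQuery,
                         stored_location: str = "chain",
                         stored_granularity: str = "week") -> bool:
    """Return True when the request is finer than the stored aggregate."""
    return (query.location != stored_location
            or query.granularity != stored_granularity)


# A weekly forecast in September for all stores in Minnesota.
minnesota = ForecastQuery(("item-1",), "MN", date(2024, 9, 2), 4)
chain_wide = ForecastQuery(("item-1",), "chain", date(2024, 9, 2), 4)
```

In this sketch the Minnesota request would be routed to a disaggregation step, while the chain-wide weekly request could be answered directly from the stored aggregate forecast.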
The API (application programming interface) 332 operates to receive client requests in real-time. Each API 332 responds to clients 326 on a per-request basis. In some embodiments, the API 332 communicates with the forecasts data store 312 through a resource manager 314. Each resource manager 314 includes an application master 316 and a query service 318. The query service 318 retrieves the requested data from the forecasts data store 312.
In some embodiments, the resource manager 314 may be built using open-source software such as Apache Slider. The cloud platform 322 may be built using OpenStack. Returning to the server 330, each application or service within the server 330 may be packaged into a Docker container using an Apache Tomcat service.
At operation 416, the aggregate demand forecast is disaggregated, if needed. This can be accomplished via a disaggregation service 336 available at each server 330. The disaggregation service 336 operates to break down aggregate demand forecasts retrieved from the forecasts data store 312 into smaller units of location or time, depending on the client request. Not every request will require disaggregation, but the API processes the request to determine if disaggregation is needed to properly respond to the request. The disaggregation service 336 is further described in relation to
At operation 418, the forecast is converted to a distribution by the CDF 334 if needed.
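As an illustration of converting a point forecast to a distribution, the following Python sketch computes a cumulative distribution function under a Poisson assumption. The disclosure does not state which distribution the CDF 334 uses; the Poisson choice and the function name `poisson_cdf` are assumptions made only for this example.

```python
import math


def poisson_cdf(mean: float, max_units: int) -> list:
    """Return P(demand <= k) for k = 0..max_units, assuming Poisson demand."""
    if mean < 0:
        raise ValueError("mean must be non-negative")
    pmf = math.exp(-mean)          # P(demand == 0)
    total = 0.0
    cdf = []
    for k in range(max_units + 1):
        total += pmf
        cdf.append(min(total, 1.0))
        pmf *= mean / (k + 1)      # recurrence: P(k + 1) = P(k) * mean / (k + 1)
    return cdf


# Point forecast of 3.2 units, evaluated for 0 through 10 units of demand.
cdf = poisson_cdf(3.2, 10)
```

A client could read a value such as `cdf[5]` as the estimated probability that demand does not exceed five units, which is the kind of question a distribution answers and a point forecast does not.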
Accordingly, as seen in
At operation 420, the demand forecast is output to the client in response to the client's request. In some embodiments, the client is an administrator user interface and the demand forecast is visualized for viewing, for example, by an administrator user.
The common data preparation engine 302 includes a memory 502 in communication with a processor 504. The memory 502 includes software applications including a data preparation application 512. The memory also includes data stores or databases including a standard data store 514.
The data preparation application 512 operates to receive data from a retailer that may include catalog data, location data, inventory data, promotion data, planogram data, web sales data, and store sales data. In some embodiments, this data may first be gathered at a server system such as the retailer server system 102 of
The data preparation application 512 may receive data in a variety of formats. Before being accessed by the forecast generator 304, the data received at the common data preparation engine 302 may need to be reformatted for use by the forecast generator 304. The data preparation application 512 operates to standardize the data received so that it may be used by a variety of forecasting models at the forecast generator 304. Preferably, the data signals are converted to a format that has the right balance between standardization and flexibility, as different models utilize data in different ways. The data preparation application 512 overcomes challenges of both scale and flexibility. For retailers having a multitude of retail locations and items, it can be difficult to manage massive amounts of data in different formats. The data preparation application 512 also overcomes challenges relating to providing a common view of data for all forecasting models that may access the data for processing.
In some embodiments, the common data preparation engine 302 saves the processed data from the data preparation application 512 in a standard data store 514. In some embodiments, the standard data store 514 stores the standardized data utilizing data warehouse software that provides capabilities for data summarization, query, and analysis. One example of such data warehouse software is Apache Hive.
The common data preparation engine 302 processes both building and scoring data. Building data includes past data that is used for building demand forecasting models. Store history and web history are processed at the common data preparation engine 302 for later access by the forecast generator 304 for training various types of demand forecasting models.
Data that is currently being received from store and web sales is processed at the common data preparation engine 302 and forwarded to the enterprise forecast generator 304, where that data is used to generate store forecasts and web forecasts. This data is used for scoring, i.e., predicting future demand. The common data preparation engine 302 processes incoming data incrementally. Data stored in the standard data store 514 is updated as new data is received, instead of all data being reprocessed each time new data is received. The standard data store 514 compiles received and processed data into Hive data tables for later access by the forecast generator. Other examples of data that may be received from the internal systems of the retailer include price data, characteristics of retail stores, and calendar events such as holidays.
The enterprise forecast engine 304 includes a models database, a model selection and validation engine, a day forecast generator, a memory, and a processor.
The enterprise forecast engine 304 utilizes one or more models to analyze data received from the common data preparation engine 302. In one embodiment, the enterprise forecast engine 304 utilizes one main model based on a recurrent neural network (RNN). In some cases, supplemental models such as an ARIMA or LOESS model can be used in conjunction with the RNN model to account for changes in demand caused by seasonality, holidays, or other factors.
In other embodiments, a meta forecaster makes consensus forecasts for item demands based on an ensemble of component models that are weighted to produce an ensemble model. An ensemble model is constructed around a linear regression of actual demand on demand forecasts produced by component models used to build the ensemble model. Linear regression has the advantages of simplicity, diagnosability, and familiarity. It may be adapted to emulate other approaches, such as variance-weighted combinations or stacked regressions.
In some embodiments, the demand forecasting system utilizes an enterprise forecast engine that utilizes an ensemble of component models to predict future demand for items. The enterprise forecast engine calculates a linear regression of actual demand on demand forecasts produced by the component models in the ensemble. Linear regression has the advantages of simplicity, diagnosability and familiarity. It may be adapted to emulate other approaches, such as variance-weighted combinations or stacked regressions. Linear regression can be extended to implement more elaborate ensembling approaches.
The ensemble model utilizes a weighted combination of two or more component models producing a more accurate representation of demand than any individual component model. The weighted combinations of models are adjusted over time, increasing accuracy as more data is analyzed.
At operation 602, parameters for a demand forecast are received at a forecast generator. Parameters can include one or more of a group of items, a time period, a location, and other attributes. For example, parameters may dictate whether the items to be analyzed are on promotion or off promotion. In another example, the time of year could be during a particular season or not.
At operation 604, component models are selected based on past performance. At operation 606, the component models are weighted based on their past performance for the selected parameters. A forecast validation engine 306 evaluates models for their ability to predict demand for items for given time periods and locations. The demand forecasting models are evaluated on a continuous basis using new data from the retailer as it is generated, and are updated accordingly. As new data is ingested and analyzed, the overall ensemble model may be updated by choosing different component models or modifying the weighting of the different component models.
In some embodiments, the demand forecaster utilizes a single model. When a single model is implemented by the demand forecaster, a recurrent neural network (RNN) is utilized. The single model may be supplemented by one or more component models to account for seasonality or promotions. Supplemental models can include ARIMA, LOESS, or STL (seasonal-trend decomposition) models.
Seasonal demand can be difficult to predict. One example of a component model that is useful for computing seasonal item demand is a wavelet decomposition (WD) model. The wavelet decomposition model uses wavelet functions (e.g., Haar, Symmlet, Daubechies) to decompose time series data into approximation and detail coefficients. Multi-level wavelet decomposition is performed on the time series data to find the approximation and detail coefficients; the number of decomposition levels depends on the length of the time series. The approximation coefficients at the maximum decomposition level are used to reconstruct the time series, and the reconstructed time series is used as the trend. The actual time series data is then detrended using the trend values found. Seasonal indices are calculated as a variance-weighted average of the respective week indices from the detrended data.
Another example of a component model useful for computing seasonal item demand uses a combination of spline and gradient boosting machine (GBM) decomposition methods. In some instances, trend estimates can be found by fitting a smoothing spline to the time series. In such instances, a smoothing parameter is set to a lower bound in order to avoid overfitting and is determined by cross validation. The time series is detrended using the trend values found. The seasonal indices are found by fitting a GBM to the detrended data.
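The trend extraction and detrending steps of the wavelet approach can be sketched in Python as follows. This is a minimal illustration using the Haar wavelet only, assuming a series whose length is divisible by two raised to the number of levels; the function names are hypothetical, and the variance-weighted seasonal index calculation is not shown.

```python
import math


def haar_step(values):
    """One Haar level: scaled pairwise sums (approximation) and differences (detail)."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_trend(series, levels):
    """Reconstruct the series from the deepest approximation coefficients only."""
    if levels < 1 or len(series) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels in this sketch")
    approx = list(series)
    for _level in range(levels):
        approx, _detail = haar_step(approx)    # detail coefficients are dropped
    # Inverse transform with every detail coefficient set to zero.
    s = math.sqrt(2.0)
    for _level in range(levels):
        approx = [a / s for a in approx for _copy in range(2)]
    return approx


# Eight weeks of sales with a level shift halfway through.
series = [10, 12, 11, 13, 20, 22, 21, 23]
trend = haar_trend(series, levels=2)
detrended = [y - t for y, t in zip(series, trend)]
```

With two Haar levels the trend is the mean of each block of four weeks (about 11.5 and 21.5 here), and the detrended series carries the remaining within-block variation from which seasonal indices would be computed.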
Algorithms for Ensemble Modeling
The following discusses methods and systems used to generate linear regression-based ensembles that may be implemented by the enterprise forecast engine 304. The methods and systems as implemented herein can use one or more algorithms, discussed below, for implementing such ensemble models. Table 1 lists definitions of terms used in the calculations.
Let yt be the actual sales of a given item in week t. Assume that there are I models that provide forecasts for yt; the forecast with horizon h (i.e. for period t+h) made by model i in period t is denoted ŷth(i). The ensemble forecast, ŷth, is an affine function of the model forecasts:
Here the coefficients βiht are in turn affine functions of features of the forecast period. The value of feature j in week t, xjt, might represent a promotion in period t, for example. Having the coefficients in the ensemble depend on such features accommodates a component model that is more or less accurate in periods of promotion, say. Therefore:
The coefficients in equation (1) and equation (2) can be estimated by means of a hierarchical regression:
Substituting equation (4) into equation (3):
Combining the noise terms and relabeling the coefficients produces an equivalent estimator for yt+h:
Similarly, the ensemble forecast becomes:
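The displayed equations are not reproduced in this text. One reconstruction that is consistent with the surrounding definitions is sketched below; it is an assumption rather than the original notation. In particular, the symbols γ, ε and η are introduced here for illustration, the intercept is treated as constant for a given horizon, and the features are indexed by the forecast period t+h.

```latex
% (1) Ensemble forecast as an affine function of the component forecasts
\hat{y}_{th} = \beta_{0h} + \sum_{i=1}^{I} \beta_{iht}\, \hat{y}_{th}^{(i)}

% (2) Coefficients as affine functions of features of the forecast period
\beta_{iht} = \gamma_{hi0} + \sum_{j} \gamma_{hij}\, x_{j,t+h}

% (3)-(4) Hierarchical regression with noise terms at both levels
y_{t+h} = \beta_{0h} + \sum_{i=1}^{I} \beta_{iht}\, \hat{y}_{th}^{(i)} + \epsilon_{t+h}
\beta_{iht} = \gamma_{hi0} + \sum_{j} \gamma_{hij}\, x_{j,t+h} + \eta_{iht}

% Substituting (4) into (3)
y_{t+h} = \beta_{0h}
        + \sum_{i=1}^{I} \Bigl( \gamma_{hi0} + \sum_{j} \gamma_{hij}\, x_{j,t+h} + \eta_{iht} \Bigr) \hat{y}_{th}^{(i)}
        + \epsilon_{t+h}

% (5) Combined noise term and relabeled coefficients
y_{t+h} = \lambda_{h}
        + \sum_{i=1}^{I} \lambda_{hi}\, \hat{y}_{th}^{(i)}
        + \sum_{i=1}^{I} \sum_{j} \lambda_{hij}\, x_{j,t+h}\, \hat{y}_{th}^{(i)}
        + \epsilon'_{t+h},
\qquad \epsilon'_{t+h} = \epsilon_{t+h} + \sum_{i=1}^{I} \eta_{iht}\, \hat{y}_{th}^{(i)}

% (6) Ensemble forecast in terms of the relabeled coefficients
\hat{y}_{th} = \lambda_{h}
             + \sum_{i=1}^{I} \lambda_{hi}\, \hat{y}_{th}^{(i)}
             + \sum_{i=1}^{I} \sum_{j} \lambda_{hij}\, x_{j,t+h}\, \hat{y}_{th}^{(i)}
```

Under this reading, the relabeling is λh = β0h, λhi = γhi0 and λhij = γhij, so the regression contains an intercept, one term per component forecast, and one interaction term per pairing of a feature with a component forecast.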
To estimate the coefficients in equation (5) for a given item and horizon h:
1. For all t such that we have a record of actual sales in period t+h, gather the available model forecasts ŷth(i), where i indexes those models that made a forecast for the stipulated item with horizon h in period t.
2. For each t in the previous step, locate actual sales yt+h and all features xj,t+h.
3. Run the regression in equation (5), to yield estimates for the coefficients λh, λi and λhij.
With coefficient estimates from the procedure above, an ensemble forecast for a particular item is produced in week t for week t+h (i.e. with horizon h) by loading all available model forecasts for the item made in the current week for week t+h, i.e. ŷth(i). The ensemble forecast is then calculated using the affine function defined in equation (6), above.
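The estimation and forecasting procedure can be sketched in Python as follows. The sketch assumes that the regression in equation (5) regresses actual sales on an intercept, each component forecast, and each component forecast multiplied by each feature. The function names and the synthetic data are hypothetical, and ordinary least squares is solved through the normal equations purely for illustration.

```python
def design_row(forecasts, features):
    """Build [1, f_1..f_I, then x_j * f_i for each model i and feature j]."""
    row = [1.0] + [float(f) for f in forecasts]
    for f in forecasts:
        for x in features:
            row.append(float(x) * float(f))
    return row


def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(matrix[r]) + [rhs[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def fit_ensemble(rows, actuals):
    """Ordinary least squares via the normal equations X'X z = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, actuals)) for i in range(k)]
    return solve(xtx, xty)


def ensemble_forecast(coeffs, forecasts, features):
    """Apply the fitted affine function to one week's component forecasts."""
    return sum(c * v for c, v in zip(coeffs, design_row(forecasts, features)))


# Synthetic history: two component forecasts and one promotion flag per week.
history = [((10, 12), (0,)), ((14, 13), (1,)), ((9, 11), (0,)), ((20, 18), (1,)),
           ((16, 15), (0,)), ((25, 22), (1,)), ((12, 14), (1,)), ((18, 20), (0,))]
true_coeffs = [2.0, 0.5, 0.3, 0.2, 0.0]   # made-up values for the demonstration
rows = [design_row(f, x) for f, x in history]
actuals = [ensemble_forecast(true_coeffs, f, x) for f, x in history]

coeffs = fit_ensemble(rows, actuals)
next_week = ensemble_forecast(coeffs, (15, 16), (1,))
```

Because the synthetic sales are generated without noise, the fitted coefficients recover the made-up values. With real sales data the fit would be approximate, and a numerically more robust solver than the normal equations would normally be preferred.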
Forecast Validation Engine
The forecast validation engine 306 operates to evaluate and validate models built to forecast demand. The forecast validation engine 306 runs multiple forecasts for the same set of sales data and stores the results in a table. Metrics are run on the table to cross-validate distributions. The metrics may be entered by an administrator user or selected from a menu.
The validation server 702 includes a data repository 710 and a validation user interface 712. The data repository 710 stores forecast validation results obtained from forecasting models used in the demand forecasting system 108. The validation user interface 712 is accessible from an administrator computing device such as the administrator computing device 106 of
The command line tool 704 operates to receive uploaded validation sets and prepare them for the validation server 702.
At operation 802, a validation set and accompanying configuration options are received at the command line tool 704. An administrator user may upload the validation set through interactions with the validation user interface 712. The validation set is a data set of forecasted values calculated by the forecasting model to be validated. Configuration options include identifying tags, and selecting metrics to be examined.
At operation 804, a submission packet including the validation set, selected configuration options, and identifying information for the data set is sent to the validation server 702. Upon receipt of the submission packet, the validation server 702 carries out a series of checks to ensure that the submission packet has the information required for the validation server 702 to perform its analysis.
At operation 806, the validation server 702 queries the forecasts data store 312 to retrieve the forecast data generated by the forecasting model being validated. The validation set is stored in the data repository 710.
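The kind of checks carried out on a submission packet can be sketched in Python as follows. The disclosure does not enumerate the checks or define a packet format; the field names and the specific rules below are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical field names; the disclosure does not define a packet schema.
REQUIRED_FIELDS = ("validation_set", "tags", "metrics", "dataset_id")


def check_submission_packet(packet: dict) -> list:
    """Return a list of problems; an empty list means the packet passed."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in packet]
    if problems:
        return problems
    if not packet["validation_set"]:
        problems.append("validation set is empty")
    for index, row in enumerate(packet["validation_set"]):
        if "coordinates" not in row or "forecast" not in row:
            problems.append(f"row {index} lacks coordinates or forecast")
    if not packet["metrics"]:
        problems.append("no metrics selected")
    return problems


good_packet = {
    "dataset_id": "example-run",
    "tags": ["seasonal"],
    "metrics": ["mae"],
    "validation_set": [
        {"coordinates": ("item-1", "store-7", "2024-09-02"), "forecast": 4.0},
    ],
}
bad_packet = {"dataset_id": "example-run", "tags": [], "metrics": []}
```

Returning a list of problems instead of raising on the first failure lets a tool report every issue with a packet in a single response.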
At operation 808, the model is calibrated using historical training data 716 from the test directory 706. The historical training data 716 includes values of the forecast quantity. In some embodiments, the test directory 706 also stores ancillary data that may be relevant to the model such as item features, holiday information, seasonal information, etc.
At operation 810, the model is tested by calculating predictions for each set of forecast coordinates. The calculated predictions for the set of forecast coordinates can be compared to known data or to other forecasts to determine differences between such results (e.g., to determine outliers or variance outside a threshold).
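One possible form of the comparison at operation 810 is sketched below. It assumes that predictions and known values are each keyed by a forecast coordinate tuple such as (item, store, day), and that mean absolute error is the metric of interest. The function name `compare_predictions`, the choice of metric, and the threshold are illustrative assumptions, not details taken from this description.

```python
def compare_predictions(predicted, actual, threshold):
    """Compare predictions to known values keyed by forecast coordinates.

    Returns (mean_absolute_error, flagged), where `flagged` lists the
    coordinates whose absolute error exceeds `threshold`.
    """
    # Only coordinates present in both data sets can be compared.
    shared = sorted(set(predicted) & set(actual))
    if not shared:
        raise ValueError("no forecast coordinates in common")
    errors = {coord: predicted[coord] - actual[coord] for coord in shared}
    mae = sum(abs(e) for e in errors.values()) / len(errors)
    flagged = [coord for coord in shared if abs(errors[coord]) > threshold]
    return mae, flagged
```

The same function could compare two forecasts against each other by passing the second forecast in place of the known values.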
At operation 812, the predicted values 718 are saved in the test directory 706. The predicted values can be saved in various forms, including in a database or in any other convenient file format usable for analysis.
At operation 814, the forecast validation results are calculated at the validation server 702 and are stored in the data repository 710. Forecast validation results can be, as noted above,
At operation 816, visualizations of forecast performance are displayed on the validation user interface 712. The validation user interface 712 allows an administrator user to select various options for viewing and comparing validation results. Examples of visualizations include box plots, Q-Q plots, histograms, and CDF plots.
The forecast validation engine 306 includes a validation user interface 712.
A more detailed view of the options section 906 is displayed in
Sales data for individual stores on individual days can be very noisy because for any given item, the sales may be very low. If the number of sales over a particular time interval is too small, estimating the underlying rate of sales is extremely difficult, and the noise properties do not satisfy the requirements of many regression and machine learning techniques used for forecasting. For most items, aggregation at some level is necessary, especially for slow-selling items, in order to ensure sufficient counts within each location/time/item to satisfy the underlying model assumptions.
Another problem with forecasting for each individual item, at each individual store, on each individual day is that storing the demand forecasts generated at that level of granularity would require much more storage space. When the number of items offered by the retailer is very high, it is advantageous to aggregate the data so that data storage space can be conserved.
Sales can be aggregated based on location (useful to determine high precision in time), time (useful to determine high location-level precision), and collection of items (useful as a prior for new items). In one embodiment, sales are aggregated to generate forecasts for each item for a week across all stores in the retail chain.
Given an aggregate chain-level forecast, the disaggregation service estimates contributions from individual stores. To estimate the count rates at a high granularity among the dimensions of item, location, and day, the behavior over multiple aggregation dimensions is measured and interpolated. For example, the relative sales rates of an item for each store can be determined by aggregating the sales over several months, which can then be combined with item/day/chain level forecasts (aggregated over all stores) to estimate the forecast for each individual store.
The disaggregation service can also measure how noise properties of counting discrete occurrences (sales) vary with the aggregate signal. Contributions of individual entities can be estimated, even if that contribution is too small to be detected directly.
At operation 1502, a client request is received and processed. In some embodiments, the server 330 receives the client request from the load balancer 324. The API 332 then processes the request to determine which data is needed to satisfy the request, and whether that data needs to be transformed in any way. The API 332 can determine which aggregate demand forecast to request. The API 332 also determines if the forecast requires disaggregation. In the example method of
At operation 1504, the API 332 submits a query for the appropriate aggregate demand forecast. In the example of
At operation 1506, the forecasted ensemble mean is calculated by the disaggregation service 340.
At operation 1508, the disaggregation service 340 determines the ensemble variance.
At operation 1510, the sales intensities per store (SIF) are estimated by the disaggregation service 336.
At operation 1512, the disaggregation service 336 determines the relative sales efficiency of the store or stores at issue compared to all other stores in the retail chain, for a given item.
At operation 1514, the disaggregated demand forecast is output to the client. The API communicates the demand forecast to the client. In some embodiments, the client accesses the forecast through a user interface.
The store Sales Intensity Function (SIF) provides a distribution of the expected values or average sales per unit time amongst all stores selling an item. The Sales Count Distribution (SCD) provides a distribution of the discrete sales counts per unit time across all stores selling an item. The Ensemble Variance Function (EVF) provides an empirical relation between the ensemble (aggregated over stores) mean and the ensemble variance.
The disaggregation methods operate on the assumption that the sales counts are Poisson distributed at the item-store-day level (i.e., the sales per unit time for an individual store, SPF, follows a Poisson distribution). The store sales intensity function (SIF) is assumed to be a gamma distribution and the SCD is assumed to be a negative binomial distribution.
The store sales intensity function (SIF) provides a distribution of the expected values or average sales per unit time amongst all stores selling an item. This calculation gives the number of stores per sales intensity interval for a given item. The SIF is used to determine the relative sales performance of each store for an item. The SIF is well-fit by a gamma distribution if a sufficient number of stores are selling the item. SIF shows the spread in sales performance across stores and can be used to compare individual store sales performance to the rest of the retail chain.
The discrete aggregated store sales count distribution (SCD) provides a distribution of the discrete sales counts at a given time across all stores selling an item. The SCD gives the number of stores per sales count number for a given item. The SCD is well-fit by a negative binomial distribution. SCD can be directly measured and fit on historical data, or estimated from aggregate or ensemble parameters.
In large retail chains, the number of stores selling an item at any time can change, so aggregating the sum of sales will not be sufficient to provide an accurate forecast. To preserve information about how sales of each item are distributed across stores even after aggregating, the value counts of unit sales for each item-day are measured (i.e., how many stores sold N units of each item on each day). This probability mass function is the sales count distribution (SCD), which gives the number of stores per sales count number for a given item and time range. Beyond simply measuring a single aggregate statistic (like mean sales across all stores), the SCD provides a much fuller picture of how the sales are distributed amongst the stores.
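The value-count measurement described above can be sketched as follows, assuming the unit sales of one item on one day are available as a mapping from store identifier to units sold, with stores that carried the item but sold nothing recorded as zero. The function names and the input layout are illustrative assumptions.

```python
from collections import Counter


def sales_count_distribution(store_sales):
    """Map each sales count N to the number of stores that sold N units.

    `store_sales` maps a store identifier to the unit sales of one item on
    one day. Stores carrying the item with no sales should appear with 0.
    """
    return dict(sorted(Counter(store_sales.values()).items()))


def scd_mean_and_variance(scd):
    """Ensemble mean and population variance implied by an SCD."""
    n_stores = sum(scd.values())
    mean = sum(count * n for count, n in scd.items()) / n_stores
    var = sum(n * (count - mean) ** 2 for count, n in scd.items()) / n_stores
    return mean, var
```

Dividing each value of the returned mapping by the total number of stores yields the normalized probability mass function.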
Across both time and item dimensions, the SCD at day-level and higher is well-fit by a negative binomial (NB) distribution.
Ensemble Variance Functions (EVF) provide an empirical relation between the ensemble (aggregated over stores) mean and the ensemble variance. If a direct measurement or forecast of the ensemble variance is not available, an empirical relation between the ensemble mean and ensemble variance can be used to estimate it. The ensemble variance across stores is highly correlated with the ensemble mean and is well-fit by a power law.
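A power-law EVF can be fit from historical pairs of ensemble mean and ensemble variance. The sketch below assumes the relation has the form variance = a * mean^b, which is a common power-law parameterization but is not stated explicitly in this description, and fits it by ordinary least squares in log-log space. The function names are hypothetical.

```python
import math


def fit_power_law(means, variances):
    """Least-squares fit of variance = a * mean**b in log-log space.

    All means and variances must be strictly positive.
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                    # slope in log-log space is the exponent
    a = math.exp(y_bar - b * x_bar)  # intercept gives the prefactor
    return a, b


def estimate_variance(mean, a, b):
    """Estimate the ensemble variance from a forecasted ensemble mean."""
    return a * mean ** b
```

Once `a` and `b` are fit on historical data, `estimate_variance` supplies the ensemble variance for any forecasted ensemble mean when no direct variance forecast is available.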
The simplest disaggregation method is to just equally allocate the sales amongst the stores. This is the equivalent of each store having an instantaneous sales intensity equal to the average chain-level aggregate sales. Equal allocation primarily serves as a good baseline of comparison for other disaggregation methods.
A slightly more complex method is to use the fractional contribution of each individual location aggregated over time to provide a simple disaggregation mechanism. This method assumes that the relative contribution of each location is constant over time.
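The two simple methods above, equal allocation and time-aggregated fractional contribution, can be sketched as follows. The function names and the input layout (a mapping from store identifier to total historical unit sales over a lookback window) are illustrative assumptions. Falling back to equal allocation when there is no sales history is a choice made for this sketch, not something specified in this description.

```python
def equal_allocation(chain_forecast, stores):
    """Baseline: split the chain-level forecast evenly across stores."""
    share = chain_forecast / len(stores)
    return {store: share for store in stores}


def fractional_allocation(chain_forecast, historical_sales):
    """Split the chain-level forecast in proportion to historical sales.

    `historical_sales` maps a store identifier to its total unit sales of
    the item aggregated over a lookback window (e.g., several months).
    """
    total = sum(historical_sales.values())
    if total <= 0:
        # No usable history: fall back to the equal-allocation baseline.
        return equal_allocation(chain_forecast, list(historical_sales))
    return {
        store: chain_forecast * units / total
        for store, units in historical_sales.items()
    }
```

For example, with historical sales of 60, 30, and 10 units at three stores, a chain-level forecast of 200 units is allocated as 120, 60, and 20 units respectively.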
The negative binomial/gamma-Poisson mixture is motivated by the empirical observations that the time-averaged SIF is well-fit by a gamma distribution and the SCD is well-fit by a negative binomial distribution. The negative binomial distribution can arise from a continuous mixture of Poisson distributions (i.e., a compound probability distribution) where the mixing distribution of the Poisson rate is a gamma distribution. This method assumes that the aggregate sales counts across all stores per unit time (SCD) are negative binomial distributed, the sales intensities per store (SIF) are gamma distributed, and the sales per unit time for an individual store (SPF) are Poisson distributed. Overall, the assumption is that the relative sales rates between stores for a particular item are stable over time.
Overall, instantaneous sales intensities for each location can be inferred from a combination of the instantaneous SIF and a known rank-order of each store's performance. While the SIF cannot be directly measured from the sales counts themselves due to too few counts, the relationship between the aggregate probability distributions can be leveraged to estimate the SIF parameters from the SCD parameters. Further, if the SCD is a negative binomial, then its parameters can be estimated directly from the ensemble mean and ensemble variance. Thus, a chain-level forecast can be disaggregated to the store level given the forecast's ensemble mean, the ensemble variance, and the relative sales efficiency of each store.
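A minimal sketch of this parameter chain is given below, under stated assumptions. The negative binomial and gamma parameters are obtained by the method of moments: for ensemble mean m and ensemble variance v with v > m, the negative binomial size is r = m^2 / (v - m) and its success probability is p = m / v, while the gamma mixing distribution has shape r and scale (v - m) / m. The rank-based assignment approximates expected gamma order statistics by averaging sorted random draws. That step is an illustrative stand-in chosen to keep the example dependency-free, and an inverse gamma CDF evaluated at rank-based quantiles would be a more direct alternative. All function names are hypothetical.

```python
import random


def nb_gamma_parameters(ensemble_mean, ensemble_variance):
    """Method-of-moments parameters for the NB-distributed SCD and gamma SIF.

    Returns (r, p, scale): NB size r (also the gamma shape), NB success
    probability p, and gamma scale. Requires variance > mean.
    """
    if ensemble_variance <= ensemble_mean:
        raise ValueError("negative binomial requires variance > mean")
    r = ensemble_mean ** 2 / (ensemble_variance - ensemble_mean)
    p = ensemble_mean / ensemble_variance
    scale = (ensemble_variance - ensemble_mean) / ensemble_mean
    return r, p, scale


def store_intensities(ranked_stores, ensemble_mean, ensemble_variance,
                      n_draws=2000, seed=0):
    """Assign a sales intensity to each store from its rank.

    `ranked_stores` is ordered from the weakest to the strongest seller.
    Expected gamma order statistics are approximated by averaging sorted
    random draws, then rescaled so the intensities average to the mean.
    """
    shape, _, scale = nb_gamma_parameters(ensemble_mean, ensemble_variance)
    rng = random.Random(seed)
    n = len(ranked_stores)
    totals = [0.0] * n
    for _ in range(n_draws):
        draws = sorted(rng.gammavariate(shape, scale) for _ in range(n))
        for i, value in enumerate(draws):
            totals[i] += value
    order_stats = [t / n_draws for t in totals]
    # Rescale so the store intensities sum to the chain-level forecast.
    factor = ensemble_mean * n / sum(order_stats)
    return {store: factor * v for store, v in zip(ranked_stores, order_stats)}
```

Under the Poisson assumption for individual stores, each returned intensity serves as both the expected sales and the variance of sales for that store over the unit time interval.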
Methods and systems of the present disclosure provide advantages over prior systems and methods for predicting item demand in a retail environment. One such advantage is the ability to provide real time updates to demand forecasting models and the demand forecasts that are produced by those models. The models and forecasts are updated as new data is received from the retailer, such as new sales data from both retail store locations and web sales. Another advantage of the current system and methods is that the demand forecasts are scalable to accommodate various uses for the demand forecast data. For example, demand forecast data can be used to predict demand for items in a particular retail store on a particular date, or demand can be calculated for an entire chain of retail stores for a given month. The system described herein can handle a massive number of items being offered for sale (e.g., millions). The demand forecasts can also be customized for various levels of granularity depending on the client application requesting the demand forecast and the needs of that client application.
The present systems and methods provide a novel approach to solving the problem of accurately predicting demand for particular items within a retail context. The use of weighted component models to generate an ensemble demand forecasting model is novel and advantageous over prior art methods because it allows for flexibility throughout a long time period such as a year, to accommodate changes in demand that occur due to seasonality, holidays, and promotions.
Overall, the day-to-day accuracy of an ensemble forecasting model is not as important as predicting seasonal demand for items. For example, with school supplies, it is more important to predict general trends in item demand for the back-to-school season than it is to accurately predict demand for school supplies on a day-to-day basis throughout the entire year. This is because, for many retailers, school supplies are sold in the greatest quantities during the “back to school” season, or the months of August and September.
Due to the flexibility provided by the ensemble forecasting model approach, the demand forecasting systems and methods of the present disclosure are able to more accurately predict demand for items over the course of a year or a longer time period, taking into account changes in demand for seasonal items. Because so many items within a retail context have changing demand based on seasonality, whether the items are on promotion, or whether the items are relevant to a particular holiday, it is important to be able to take into account seasonal effects. The ensemble model comprising weighted component models is advantageous in that the weighting of the various models can be modified to take into account changes in demand throughout the year based on seasonal effects.
The presently disclosed methods and systems go beyond merely predicting how many units of each item customers are likely to buy. In the context of retailers having multiple retail store locations and a web based business, there are millions of opportunities to make sales to customers. As a result, computational methods are required to optimize item offerings to customers and to ensure that the items most likely to be in demand by customers are stocked both in warehouses and at retail stores. The present disclosure describes computational methods for analyzing both past sales data and currently occurring sales data to determine which items are going to be in the greatest demand for a given time period and a given location so that an overall retail system can position items and personnel such that customer demand will be met with the proper resources.
The description and illustration of one or more embodiments provided in this application are not intended to limit or restrict the scope of the invention as claimed in any way. The embodiments, examples, and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode of claimed invention. The claimed invention should not be construed as being limited to any embodiment, example, or detail provided in this application. Regardless of whether shown and described in combination or separately, the various features (both structural and methodological) are intended to be selectively included or omitted to produce an embodiment with a particular set of features. Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate embodiments falling within the spirit of the broader aspects of the general inventive concept embodied in this application that do not depart from the broader scope of the claimed invention.