Strategy, Technical | 06/16/2021

Modern Machine Learning-based Approaches to Inventory Forecasting

Jacob Dink

Optimizing inventory is a key challenge for organizations. Under-prepare for the future, and you can wind up with irate customers who frustratedly move on to a different supplier. Over-prepare, and you’ve got a surfeit of stuff and a deficit of resources. Inventory optimization is, in short, the art of avoiding either situation.

But just as there’s no single type of inventory — whether it’s shoes, medical supplies, vehicles on the road, or human resources — forecasting to optimize that inventory doesn’t involve a one-size-fits-all approach. At Strong we often find the need to go beyond standard approaches to time-series forecasting. Let’s take a look at three different approaches we’ve used: hierarchical forecasting, multivariate forecasting, and hybrid forecasting.

Hierarchical Forecasting

Hierarchical forecasting can be used in scenarios with nested time series that together add up to a coherent whole — that is, when your separate time series are hierarchical. An entire country’s tourism inflow, for instance, is the aggregate of many sub-national levels: the country can be broken into provinces, the provinces into territories, the territories into municipalities, and so on, and each level can be aggregated or disaggregated into the level above or below it, respectively. Of course, this sort of hierarchical structure is common to organizations of all kinds, from human resources (companies are broken into regions, regions into departments, departments into teams) to product inventories (an entire company inventory can be refined to regions, stores within those regions, product categories within those stores, etc.).

The advantage of hierarchical forecasting is that each level of the hierarchy can lend information to the others, so an individual time series can be forecast more accurately than if it were modeled on its own. For example, we used hierarchical forecasting to predict admissions and discharges for a large hospital system made up of multiple facilities, departments, and service lines. Trying to forecast at a single node in the lowest level of the hierarchy (say, the physical therapy department at a particular facility) is difficult to do reliably because data for that node is scant: the department may only see a few dozen patients a year. There’s a lot of noise, and not a lot of signal.

But looking at the upper levels of the hierarchy, such as the hospital as a whole, can clarify what’s happening in the physical therapy department: if there’s a seasonal uptick in the facility that the department is housed in as well as in the system at large, those trends with higher signal-to-noise ratios can be extrapolated down to the specific department for a more reliable, less noisy forecast.
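
To make the idea of passing information down the hierarchy concrete, here is a minimal sketch of one simple scheme, top-down disaggregation by historical share. It is an illustration only, not the method used in the hospital project; the department names, the admission counts, and the facility-level forecast of 360 are all made up.

```python
import numpy as np

# Hypothetical monthly admissions for three departments in one facility
# (rows = months, columns = departments). The numbers are invented.
departments = ["physical_therapy", "cardiology", "emergency"]
history = np.array([
    [2, 40, 310],
    [4, 38, 290],
    [3, 45, 330],
    [1, 42, 300],
], dtype=float)

# Bottom-up aggregation: the facility total is the sum of its departments.
facility_total = history.sum(axis=1)

# Each department's historical share of all facility admissions.
shares = history.sum(axis=0) / facility_total.sum()


def top_down_forecast(total_forecast: float, shares: np.ndarray) -> np.ndarray:
    """Split a forecast for the parent node across its children by share."""
    return total_forecast * shares


# Assume some model forecasts 360 admissions for the facility next month.
dept_forecast = top_down_forecast(360.0, shares)
for name, value in zip(departments, dept_forecast):
    print(f"{name}: {value:.1f}")
```

The department forecasts add back up to the facility forecast by construction, which is the coherence property that hierarchical methods rely on. More sophisticated reconciliation schemes blend forecasts made at every level rather than trusting only the top.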

Hierarchical forecasting is less computationally demanding than other methods, as the cost of computation doesn’t explode as levels and nodes within the system proliferate: computational load increases linearly, not exponentially. This makes it useful for organizations whose hierarchies contain hundreds or thousands of nodes across many levels, which is not unusual in inventory optimization.

Multivariate Forecasting

In other scenarios, the relationships between different time series don’t form a vertical hierarchy, but rather something more horizontally egalitarian. In such cases, multivariate forecasting allows each time series to add information to the forecast of all the others.

We used multivariate forecasting to help an organization that supplies blood to hospitals and blood banks predict the availability of each of the eight blood types throughout the year. Unlike in our hospital admissions forecast, the relationships between blood types are not hierarchical: different blood types cannot be aggregated to forecast the supply of another blood type (types A and B don’t add up to type O-). However, as in the hospital project, the signal-to-noise ratio for any individual time series is low enough that information is needed from other time series to build a reliable model.

That means that accurately forecasting one blood type is aided by forecasting all the other blood types at the same time: predicting the supply of type A blood helps to more accurately forecast type B, and in turn, predicting type B boosts the accuracy of our forecast for type A, and so on across all eight types. The idea of multivariate forecasting is that by modeling separate noisy time series jointly, the noise in any individual series is reduced.
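
One simple way to let each series inform the others is a first-order vector autoregression, in which every series is predicted from the previous values of all series. The sketch below fits one by least squares on simulated data. It is a stand-in chosen for brevity, not the model used for the blood supplier, and the coefficient matrix `A_true` and the three-series setup are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-series system: each series depends on its own past value
# and, more weakly, on the past values of the others (off-diagonal entries).
A_true = np.array([
    [0.5, 0.2, 0.0],
    [0.1, 0.4, 0.2],
    [0.0, 0.3, 0.5],
])

# Simulate y_t = A_true @ y_{t-1} + noise.
n_steps, n_series = 5000, 3
y = np.zeros((n_steps, n_series))
for t in range(1, n_steps):
    y[t] = A_true @ y[t - 1] + rng.normal(scale=1.0, size=n_series)

# Fit all series jointly by least squares: regress y_t on y_{t-1}.
X, Y = y[:-1], y[1:]
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
A_hat = coef.T  # row i holds the weights used to predict series i

# One-step-ahead forecast for every series at once.
next_step = A_hat @ y[-1]
print(np.round(A_hat, 2))
```

The off-diagonal entries of `A_hat` are where one series contributes to the forecast of another. A model that forecast each series separately would be forcing those entries to zero.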

Multivariate forecasting, however, can come at a steep computational cost if you’re crunching numerous time series. That’s because each additional time series adds a pairwise relationship with every series already in the model, so the number of pairs grows roughly with the square of the number of series: a multivariate model to forecast demand from demographics A and B involves one pair, but doing so for A, B, C, and D involves six. This means that while a multivariate approach can be ideal for scenarios with a manageable number of variables, it can become prohibitively costly or slow when applied to high-dimensional data. While our 8-dimensional model for blood supply is perfectly feasible, applying the same approach to 160 hospital departments can leave the model out of breath.
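
The pair counts above follow from the standard formula for choosing two items out of n, which is n(n - 1)/2. A short sketch makes the growth easy to check; the function name is just for illustration.

```python
from math import comb


def pairwise_relationships(n_series: int) -> int:
    """Number of distinct pairs among n_series time series: n * (n - 1) / 2."""
    return comb(n_series, 2)


for n in (2, 4, 8, 160):
    print(f"{n} series -> {pairwise_relationships(n)} pairs")
```

Eight series give 28 pairs, while 160 series give 12,720, which is why the same approach that is comfortable for blood types becomes expensive for hospital departments.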

Hybrid Forecasting

Traditional forecasting methods, like exponential smoothing, are tried and tested in a variety of applications, often beating out more sophisticated deep-learning approaches. However, in recent years, hybrid approaches to forecasting, in which parts of the model use more traditional forecasting approaches and parts of the model use black-box neural networks, have proven highly successful.

Hybrid forecasting can be useful for many real-world time series involving complex sub-daily patterns. These data can be challenging for traditional forecasting models because such models treat each seasonal pattern (hour-of-day, day-of-week, season-of-year) as a siloed process, with the separate processes simply adding up to the overall behavior of the series. That is, the difference between 5am and 5pm on Wednesday is treated as the same as the difference between 5am and 5pm on Saturday. However, in many settings, these processes all interact: the impact of hour-in-day depends on the day-of-week, the impact of the day-of-week depends on the season of the year, and so on.
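
The limitation of purely additive seasonal effects can be shown with a small synthetic example. The load profile below is invented (a morning peak on weekdays, an early-afternoon peak on weekends), and the additive fit is a plain day-effect-plus-hour-effect decomposition standing in for a traditional seasonal model.

```python
import numpy as np

# Hypothetical load profile: rows = day of week (Mon..Sun), columns = hour.
hours = np.arange(24)
weekday_shape = np.exp(-0.5 * ((hours - 9) / 2.0) ** 2)   # morning peak
weekend_shape = np.exp(-0.5 * ((hours - 13) / 3.0) ** 2)  # afternoon peak
load = np.vstack([weekday_shape] * 5 + [weekend_shape] * 2)

# Additive ("siloed") model: one effect per day plus one effect per hour.
grand_mean = load.mean()
day_effect = load.mean(axis=1, keepdims=True) - grand_mean
hour_effect = load.mean(axis=0, keepdims=True) - grand_mean
additive_fit = grand_mean + day_effect + hour_effect

wed, sat = 2, 5   # row indices for Wednesday and Saturday
am, pm = 5, 17    # column indices for 5am and 5pm

# In the data, the 5am-vs-5pm gap differs between Wednesday and Saturday.
true_gap_wed = load[wed, am] - load[wed, pm]
true_gap_sat = load[sat, am] - load[sat, pm]

# In the additive fit, the gap is forced to be identical on every day.
fit_gap_wed = additive_fit[wed, am] - additive_fit[wed, pm]
fit_gap_sat = additive_fit[sat, am] - additive_fit[sat, pm]

print(f"data gaps: Wed {true_gap_wed:.3f}, Sat {true_gap_sat:.3f}")
print(f"fit gaps:  Wed {fit_gap_wed:.3f}, Sat {fit_gap_sat:.3f}")
```

Because the day effect cancels when two hours on the same day are compared, no choice of additive effects can reproduce a gap that changes from day to day. Capturing that requires a model in which hour and day interact.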

For example, we used a hybrid approach to forecast energy usage across a diverse range of resources in a metropolitan area. In our case, a neural net was joined with a state-space model. By processing the time series in the deep learning model first, we were able to more accurately represent the interactive relationships that traditional forecasting models struggle with, such as how hours, days, and weeks relate to one another rather than treating each in isolation. Because the model was trained on a large, diverse set of buildings, the neural network was able to efficiently capture patterns that were common to all of them. And by running the results through the state-space model, we were able to capture deviations from the norm and adjust our forecast accordingly: for example, by predicting that if energy usage is unusually high on day 102, it might also be higher than normal on day 103, day 104, and so on.
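
The "unusually high today, probably still high tomorrow" behavior can be sketched with the simplest state-space model, a first-order autoregression on the residuals. This is an illustration under stated assumptions, not the model from the energy project: the baseline values stand in for the neural network's output, and the persistence parameter `phi`, the function name, and all numbers are hypothetical.

```python
import numpy as np


def hybrid_forecast(baseline, last_residual, phi=0.8):
    """Add a decaying AR(1) correction to a baseline forecast.

    baseline:      forecasts for the next days from the pattern model
                   (here a stand-in for a neural network's output).
    last_residual: most recent observed value minus its baseline prediction.
    phi:           assumed persistence of deviations, between 0 and 1.
    """
    baseline = np.asarray(baseline, dtype=float)
    steps = np.arange(1, len(baseline) + 1)
    correction = last_residual * phi ** steps  # deviation fades over time
    return baseline + correction


# Day 102: the baseline expected 100 units but 120 were observed.
last_residual = 120.0 - 100.0

# Hypothetical baseline predictions for days 103, 104, and 105.
baseline_next_days = [101.0, 99.0, 100.0]

forecast = hybrid_forecast(baseline_next_days, last_residual)
print(forecast)
```

With these numbers the surplus of 20 on day 102 carries forward as 16, then 12.8, then 10.24, so the forecast stays above the baseline while gradually returning to it.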

The drawbacks of hybrid forecasting are those that come with working with neural nets generally: complexity, cost, the need for lots of data and rigorous preprocessing, and a model that is harder to interpret than more transparent methods. In our experience, however, the benefits often outweigh these drawbacks, and hybrid models serve as an important component of the inventory optimization toolkit.

At Strong, we recently open-sourced a Python package, torchcast, that supports the approaches described in this post: the package has robust support for multivariate forecasting, and is built on top of PyTorch so that any aspect of your time-series model can be integrated with any arbitrary neural network that fits the task at hand. Contact us to learn more and find out how we can help you leverage these state-of-the-art techniques to improve inventory optimization, budgeting, and other applications where forecasting can play a central role.

