Only Happened Once: The Risks of Drawing on the Past to Forecast the Future

As forecasters, we believe that the past is far and away the best tool for predicting future events - but history doesn't always repeat itself. When extrapolating from past data, there is a high risk of misjudging whether an event will recur and how likely it is to do so.

Sustained Competitive Advantage

With the digital data revolution, our ability to retain massive volumes of data and then aggregate, analyze and program forecast models is unparalleled in history. Already big data is playing an integral role in the FP&A market, with global accounting and consulting firm KPMG International declaring that analytics can offer unique insight like never before.

"Companies now capture all available data, explore it, and disaggregate it down to the areas that allow them to understand and grow their business," wrote the authors of the 2013 executive summary. "This use of data analytics can provide a sustained competitive advantage."

But even this optimism must be tempered by reality. Drawing on historical data in a meaningful way requires taking the long view, with higher volumes typically revealing slower rates of change. And certain past events can throw a monkey wrench into a forecast if they are not recognized for what they likely were: one-time outliers.

Outliers Aren't Predictive

In their way, outliers stand in contrast to the kind of possible (but not yet realized) events we have previously spoken about. An outlier could result from a one-time economic, social, political, or personal event, and it can throw off a forecast if assumed to be predictive. Changes in system behavior, fraudulent or criminal actions, human or instrument error, or simply natural deviations in populations can lead forecasters to read more meaning into a clearly defined event than it deserves, as if it portended something that it doesn't.

Making things even trickier, these events can arise out of predictable and commonplace actions, which makes them seem more natural and portentous than they actually are. For a company like Wal-Mart - armed with years of data - credit card fraud on a broad scale can be predicted from past data and trends. Yet even when this data gives forecasters an estimated probability that fraud will occur, the forecaster cannot assume the past will replicate itself. So how do you identify an outlier?

Baselines and Larger Data Sets

Rather than assuming the past will repeat, the goal is to use this data to inform modeling assumptions as a baseline. From there, look at the baseline and identify the spike - say a sudden surge in credit card fraud - and assess what caused it, such as a hacker compromising payment systems, in order to recognize its context in larger operational trends. Note, however, that taking a very granular view of events may cause one to lose the broader context: in this example, hacking is a constant problem, even though the specific hacker or method is 'one-time'. Often, these outliers are easily identifiable by their scale and severity - and it's in this regard that they can throw off forecasts. A significant event, once added to a forecast, can have a major impact, skewing it in the wrong direction.
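
As a rough illustration of separating a spike from a baseline, the Python sketch below flags values that sit far from the median of a series and then compares a simple average forecast with and without them. The monthly fraud counts are invented for the example, and the modified z-score rule (the 0.6745 constant and the 3.5 cutoff are a common convention) is only one of many ways to flag a spike; it is an assumption here, not a method prescribed by any particular forecasting system.

```python
from statistics import median


def flag_spikes(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    The modified z-score measures distance from the median in units of the
    median absolute deviation (MAD), so one extreme value cannot distort
    the baseline it is being compared against.
    """
    base = median(values)
    mad = median([abs(v - base) for v in values])
    if mad == 0:
        return []  # no spread around the median, nothing to compare against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - base) / mad > threshold
    ]


# Hypothetical monthly fraud-incident counts; one month contains a breach.
monthly_fraud = [102, 98, 105, 99, 101, 97, 103, 480, 100, 104, 96, 102]

spikes = flag_spikes(monthly_fraud)
baseline = [v for i, v in enumerate(monthly_fraud) if i not in spikes]

# A naive forecast averages everything; the baseline forecast sets the
# flagged spike aside so it can be assessed separately.
naive_forecast = sum(monthly_fraud) / len(monthly_fraud)
baseline_forecast = sum(baseline) / len(baseline)

print("flagged months:", spikes)
print("naive forecast:", round(naive_forecast, 1))
print("baseline forecast:", round(baseline_forecast, 1))
```

Flagging a month is only the starting point. Whether to leave it out, keep it, or model it as a separate low-probability event still depends on the context behind the spike.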

Luckily, larger data sets combined with more sophisticated algorithms can identify these spikes more easily. A perfect example is the rise of medical databases worldwide. With the global scale and larger volume of data, identifying meaningful trends in disease outbreaks and treatment efficacy is now possible in a way that was inconceivable prior to the widespread use of centralized and accessible data repositories.

Driving Towards the Unknown

The final example of how the past may not repeat involves an outlier that will impact the future: driverless cars. Car manufacturers, insurance providers, and even regulators working in the transportation sphere have plenty of data on automotive transport with which to build forecasts. But with widely available driverless cars on the horizon, the entire forecast may be rendered moot. Adjusting for peaking demand becomes more difficult as the marketplace changes entirely.

Forecasts are not intended to be 100 percent accurate, but rather to estimate "what is reasonable" and enable the organization to prepare for, bear, and adjust to unforeseen events. We must be able to recognize statistically irrelevant outliers as well as the forerunners of real change, adjusting the forecast to incorporate these change components into the model. This spurs a questioning of our historical assumptions, with the forecaster asking: How are my assumptions flawed? Or: How could they be changed by trends, demand and environment?