Expectations vs. Realities: The Cost of MSE-Optimal Forecasting Under Conditional Uncertainty
2026-06-03 • Machine Learning
Machine Learning, Artificial Intelligence
AI summary
The authors studied how the accuracy of multi-step time series forecasts is judged. They found that relying only on average error metrics such as MSE can be misleading, because the MSE-optimal forecast (the conditional mean) stops resembling typical outcomes as uncertainty grows at longer horizons. They prove that, whenever this conditional uncertainty is present, no deterministic predictor can both minimize MSE and reproduce the range of outcomes that actually occur. Their tests on synthetic and real data show a clear trade-off: methods that are slightly less accurate by MSE often give more realistic predictions of future variability. They also found that different forecasting approaches naturally fall at different points along this accuracy-versus-realism spectrum.
Multi-step time series forecasting, Mean squared error (MSE), Conditional uncertainty, Forecast horizon, Deterministic predictor, Marginal distribution, Pareto front, Recursive forecasting, Multi-output predictors, Sample-based inference
Authors
Riku Green, Zahraa S. Abdallah, Telmo M Silva Filho
Abstract
Multi-step time series forecasting (MSF) is commonly evaluated using point-wise error metrics such as mean squared error (MSE), implicitly treating the conditional mean as a sufficient target. We show that this can be misleading under conditional uncertainty, where the conditional expectation becomes unrepresentative of typical realized values at longer horizons. We formalize this effect through a conditional uncertainty gap and prove that whenever this gap is nonzero, no deterministic predictor can simultaneously minimize MSE and match the marginal distribution of realized futures. This establishes a fundamental, model-agnostic trade-off between point accuracy and marginal realism in MSF evaluation. Using controlled stochastic dynamical systems and nine real-world forecasting benchmarks, we empirically characterize the resulting accuracy-realism frontier and quantify the practical cost of MSE-only model selection. As conditional uncertainty increases with forecast horizon, the attainable set expands into a pronounced Pareto front, separating MSE-optimal but under-dispersed predictors from methods that trade accuracy for realistic marginal variability. Across benchmarks, we find that small relaxations in MSE (≤ 5%) frequently unlock disproportionate gains in marginal realism, with median improvements of 17.3% and gains exceeding 30% in some datasets. We further show that common forecasting strategies systematically occupy different regions of this frontier: direct multi-output predictors concentrate near the accuracy-optimal extreme, while recursive strategies and sample-based inference favor marginal realism. Together, these results expose a structural failure mode of MSE-based evaluation in long-horizon forecasting and recast strategy and inference selection as navigation of an unavoidable accuracy-realism trade-off.
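The trade-off described in the abstract can be illustrated with a minimal sketch on a toy process. Everything below is an assumption made for illustration and is not taken from the paper: the stationary Gaussian AR(1) process, the function name `simulate_tradeoff`, its parameters, and the use of prediction variance relative to the variance of realized futures as a crude proxy for marginal realism. The paper's own definitions of the conditional uncertainty gap and of its realism metric are not given in the abstract and may differ.

```python
import numpy as np


def simulate_tradeoff(phi=0.9, sigma=1.0, horizon=10, n=200_000, seed=0):
    """Compare a conditional-mean forecast with a sampled forecast on a toy AR(1).

    Hypothetical illustration: y_{t+1} = phi * y_t + eps, eps ~ N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)

    # Stationary variance of the process, and the exact variance of
    # y_{t+h} given y_t (which grows with the horizon h).
    stat_var = sigma**2 / (1.0 - phi**2)
    cond_var = sigma**2 * (1.0 - phi ** (2 * horizon)) / (1.0 - phi**2)

    # Draw current states from the stationary law, then realized futures
    # from the exact h-step conditional distribution.
    y_now = rng.normal(0.0, np.sqrt(stat_var), size=n)
    cond_mean = phi**horizon * y_now
    y_future = cond_mean + rng.normal(0.0, np.sqrt(cond_var), size=n)

    # Deterministic, MSE-optimal forecast: the conditional mean.
    mean_pred = cond_mean
    # Sample-based forecast: one independent draw from the conditional law.
    sample_pred = cond_mean + rng.normal(0.0, np.sqrt(cond_var), size=n)

    return {
        "mse_mean": float(np.mean((y_future - mean_pred) ** 2)),
        "mse_sample": float(np.mean((y_future - sample_pred) ** 2)),
        "var_true": float(np.var(y_future)),
        "var_mean": float(np.var(mean_pred)),
        "var_sample": float(np.var(sample_pred)),
    }


if __name__ == "__main__":
    for h in (1, 5, 10, 20):
        r = simulate_tradeoff(horizon=h)
        print(
            f"h={h:2d}  MSE mean={r['mse_mean']:.2f}  MSE sample={r['mse_sample']:.2f}  "
            f"var ratio mean={r['var_mean'] / r['var_true']:.2f}  "
            f"var ratio sample={r['var_sample'] / r['var_true']:.2f}"
        )
```

In this toy setting the conditional-mean forecast has the lower MSE at every horizon, but its variance shrinks toward zero relative to the realized futures as the horizon grows. The sampled forecast pays roughly twice the MSE and keeps a variance ratio near one. This follows from the law of total variance, Var(Y) = E[Var(Y | X)] + Var(E[Y | X]): the conditional mean only carries the second term.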