Temporal Patch Shuffle (TPS): Leveraging Patch-Level Shuffling to Boost Generalization and Robustness in Time Series Forecasting

2026-04-10

Machine Learning
AI summary

The authors introduce Temporal Patch Shuffle (TPS), a data augmentation technique for time series forecasting intended to help models generalize when training data is limited. TPS splits a time series into overlapping patches, shuffles a subset of them selected by variance-based ordering so that local temporal patterns are preserved, and reconstructs the sequence by averaging the overlapping regions. The authors evaluate TPS on nine long-term forecasting datasets with five model families and on four short-term forecasting datasets with PatchTST, and report consistent improvements. Ablation studies examine the method's effectiveness, robustness, and design rationale.

Data augmentation, Time series forecasting, Temporal coherence, Patch shuffling, Model generalization, Long-term forecasting, Short-term forecasting, Temporal patches, Variance-based ordering, Deep learning models
Authors
Jafar Bakhshaliyev, Johannes Burchert, Niels Landwehr, Lars Schmidt-Thieme
Abstract
Data augmentation is a crucial technique for improving model generalization and robustness, particularly in deep learning models where training data is limited. Although many augmentation methods have been developed for time series classification, most are not directly applicable to time series forecasting due to the need to preserve temporal coherence. In this work, we propose Temporal Patch Shuffle (TPS), a simple and model-agnostic data augmentation method for forecasting that extracts overlapping temporal patches, selectively shuffles a subset of patches using variance-based ordering as a conservative heuristic, and reconstructs the sequence by averaging overlapping regions. This design increases sample diversity while preserving forecast-consistent local temporal structure. We extensively evaluate TPS across nine long-term forecasting datasets using five recent model families (TSMixer, DLinear, PatchTST, TiDE, and LightTS), and across four short-term forecasting datasets using PatchTST, observing consistent performance improvements. Comprehensive ablation studies further demonstrate the effectiveness, robustness, and design rationale of the proposed method.
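The three steps named in the abstract (extract overlapping patches, shuffle a variance-selected subset, reconstruct by averaging overlaps) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `temporal_patch_shuffle`, the parameters `patch_len`, `stride`, and `shuffle_rate`, their default values, and the choice to shuffle the lowest-variance patches are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def temporal_patch_shuffle(x, patch_len=16, stride=8, shuffle_rate=0.5, rng=None):
    """Hypothetical TPS-style augmentation for a series of shape (T,) or (T, C)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    squeeze = x.ndim == 1
    if squeeze:
        x = x[:, None]
    T = x.shape[0]
    if T < patch_len or not 0 < stride <= patch_len:
        raise ValueError("need T >= patch_len and 0 < stride <= patch_len")

    # 1) Extract overlapping patches; add a final patch so the tail is covered.
    starts = list(range(0, T - patch_len + 1, stride))
    if starts[-1] != T - patch_len:
        starts.append(T - patch_len)
    patches = np.stack([x[s:s + patch_len] for s in starts])  # (P, patch_len, C)

    # 2) Pick a subset by variance and permute only those patches.
    #    Assumption: the lowest-variance patches are the ones shuffled.
    variances = patches.var(axis=(1, 2))
    n_shuffle = int(round(shuffle_rate * len(starts)))
    chosen = np.argsort(variances)[:n_shuffle]
    order = np.arange(len(starts))
    order[chosen] = rng.permutation(chosen)

    # 3) Reconstruct: place patches back and average where they overlap.
    out = np.zeros_like(x)
    counts = np.zeros((T, 1))
    for slot, src in enumerate(order):
        s = starts[slot]
        out[s:s + patch_len] += patches[src]
        counts[s:s + patch_len] += 1
    out /= counts
    return out[:, 0] if squeeze else out


# Usage: augment one training window of 96 time steps.
window = np.sin(np.linspace(0, 20, 96))
augmented = temporal_patch_shuffle(window, rng=0)
```

With `shuffle_rate=0.0` the sketch returns the input unchanged, because averaging unshuffled overlapping patches reproduces the original series. How the augmented window is split into model input and forecast target is left to the caller here, as the abstract does not describe it.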