Catastrophic Forgetting Resilient One-Shot Incremental Federated Learning
2026-02-19 • Machine Learning
Machine Learning • Distributed, Parallel, and Cluster Computing
AI summary
The authors developed a new federated learning method called OSI-FL to train models on data that arrives incrementally from different places, without sending all the data to one spot. Each client shares small, privacy-preserving pieces of information about its data in a single communication round, and the server then uses these to generate synthetic data with a similar distribution for training the model. To keep the model from forgetting old information when learning new tasks, the authors add a mechanism that retains the most informative example data for future training. Their experiments show this method outperforms earlier approaches across different learning scenarios.
Federated Learning • Incremental Learning • Catastrophic Forgetting • Vision-Language Model • Diffusion Model • Selective Sample Retention • Synthetic Data • Communication Overhead • Class-Incremental Scenario • Domain-Incremental Scenario
Authors
Obaidullah Zaland, Zulfiqar Ahmad Khan, Monowar Bhuyan
Abstract
Modern big-data systems generate massive, heterogeneous, and geographically dispersed streams that are large-scale and privacy-sensitive, making centralization challenging. While federated learning (FL) provides a privacy-enhancing training mechanism, it assumes a static data flow and learns a collaborative model over multiple rounds, making learning with incremental data challenging in limited-communication scenarios. This paper presents One-Shot Incremental Federated Learning (OSI-FL), the first FL framework that addresses the dual challenges of communication overhead and catastrophic forgetting. OSI-FL communicates category-specific embeddings, derived by a frozen vision-language model (VLM), from each client in a single communication round; a pre-trained diffusion model at the server uses these embeddings to synthesize new data similar to the client's data distribution. The synthesized samples are then used for training on the server. However, two challenges persist: i) tasks arriving incrementally require retraining the global model, and ii) as future tasks arrive, retraining introduces catastrophic forgetting. To this end, we augment training with Selective Sample Retention (SSR), which identifies and retains the top-p most informative samples per category-task pair based on sample loss. SSR bounds forgetting by ensuring that representative retained samples are incorporated into subsequent training iterations. The experimental results indicate that OSI-FL outperforms baselines, including traditional and one-shot FL approaches, in both class-incremental and domain-incremental scenarios across three benchmark datasets.
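The two mechanisms the abstract describes — one-shot sharing of category-specific embeddings and loss-based Selective Sample Retention — can be sketched as follows. This is a minimal illustration, not the paper's implementation: mean pooling of frozen-encoder features per category, and treating higher-loss samples as more informative, are both assumptions, since the abstract does not specify the exact aggregation or ranking direction.

```python
import numpy as np

def category_embeddings(features, labels):
    """One-shot client payload: aggregate per-category embeddings.

    features: (N, D) array of frozen VLM encoder outputs.
    labels:   (N,) array of category ids.
    Returns {category: mean embedding}. Mean pooling is an assumption;
    the server would condition its diffusion model on these vectors.
    """
    return {int(c): features[labels == c].mean(axis=0) for c in np.unique(labels)}

def select_top_p(losses, labels, p):
    """Selective Sample Retention (sketch): keep the indices of the p
    highest-loss samples per category, following the abstract's
    "top-p most informative samples ... based on sample loss".
    """
    retained = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        # Sort this category's indices by descending loss.
        order = idx[np.argsort(losses[idx])[::-1]]
        retained[int(c)] = order[:p].tolist()
    return retained
```

The retained indices would be merged into the training pool for later tasks, which is how SSR bounds forgetting in the description above.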