FedZMG: Efficient Client-Side Optimization in Federated Learning
2026-02-20 • Machine Learning
Machine Learning • Artificial Intelligence
AI summary
The authors address a common problem in Federated Learning where data differences across devices slow down learning and reduce accuracy. They propose FedZMG, a new method that adjusts gradients on each device so biases from uneven data are removed, without extra communication or tuning. Their theory and tests on several datasets show that FedZMG speeds up training and improves model accuracy compared to existing methods, especially when data is very different across devices.
Federated Learning • non-IID data • client-drift • gradient centralization • optimization algorithm • FedAvg • FedAdam • convergence speed • edge devices • data privacy
Authors
Fotios Zantalis, Evangelos Zervas, Grigorios Koulouras
Abstract
Federated Learning (FL) enables distributed model training on edge devices while preserving data privacy. However, clients tend to hold non-Independent and Identically Distributed (non-IID) data, which often leads to client-drift, slowing convergence and degrading model performance. While adaptive optimizers have been proposed to mitigate these effects, they frequently introduce computational complexity or communication overhead unsuitable for resource-constrained IoT environments. This paper introduces Federated Zero Mean Gradients (FedZMG), a novel, parameter-free, client-side optimization algorithm designed to tackle client-drift by structurally regularizing the optimization space. Advancing the idea of Gradient Centralization, FedZMG projects local gradients onto a zero-mean hyperplane, effectively neutralizing the "intensity" or "bias" shifts inherent in heterogeneous data distributions, without requiring additional communication or hyperparameter tuning. A theoretical analysis proves that FedZMG reduces the effective gradient variance and guarantees tighter convergence bounds than standard FedAvg. Extensive empirical evaluations on the EMNIST, CIFAR100, and Shakespeare datasets demonstrate that FedZMG achieves faster convergence and higher final validation accuracy than the baseline FedAvg and the adaptive optimizer FedAdam, particularly in highly non-IID settings.
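For concreteness, the zero-mean projection described in the abstract can be sketched in a few lines of PyTorch. This is an illustrative reading of the method in the spirit of Gradient Centralization, not the authors' released code; the function name zero_mean_gradients and the per-output-unit centering choice are assumptions.

```python
import torch

def zero_mean_gradients(model: torch.nn.Module) -> None:
    """Project each parameter's gradient onto the zero-mean hyperplane.

    Minimal sketch of the client-side step the abstract describes
    (illustrative names, not the authors' implementation).
    """
    for param in model.parameters():
        grad = param.grad
        if grad is None or grad.dim() < 2:
            # Centralization is typically applied to weight tensors,
            # not to biases or other 1-D parameters.
            continue
        # Subtract the mean over all axes except the output-channel axis,
        # so each output unit's gradient satisfies g <- g - mean(g).
        dims = tuple(range(1, grad.dim()))
        grad.sub_(grad.mean(dim=dims, keepdim=True))
```

In a federated client update, such a hook would sit between the backward pass and the local optimizer step (loss.backward(); zero_mean_gradients(model); optimizer.step()), which is consistent with the abstract's claim of no extra communication or hyperparameters: the projection is purely local and parameter-free.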