FedSIR: Spectral Client Identification and Relabeling for Federated Learning with Noisy Labels
2026-04-22 • Machine Learning
Machine Learning · Artificial Intelligence · Computer Vision and Pattern Recognition · Distributed, Parallel, and Cluster Computing
AI summary
The authors present FedSIR, a method to improve federated learning when some clients have incorrect labels in their data. Instead of focusing only on loss functions, they analyze the shared features' structure to spot which clients have clean or noisy labels. Clean clients help noisy ones fix wrong labels by comparing feature patterns, and a special training strategy helps make the overall learning more stable. Their experiments show that FedSIR works better than existing methods on standard tests.
Federated Learning · Noisy Labels · Spectral Analysis · Feature Representations · Loss Functions · Knowledge Distillation · Class Subspaces · Logit-Adjusted Loss · Model Aggregation
Authors
Sina Gholami, Abdulmoneam Ali, Tania Haghighi, Ahmed Arafa, Minhaj Nur Alam
Abstract
Federated learning (FL) enables collaborative model training without sharing raw data; however, the presence of noisy labels across distributed clients can severely degrade learning performance. In this paper, we propose FedSIR, a multi-stage framework for robust FL under noisy labels. Unlike existing approaches that mainly rely on designing noise-tolerant loss functions or exploiting loss dynamics during training, our method leverages the spectral structure of client feature representations to identify and mitigate label noise. Our framework consists of three key components. First, we identify clean and noisy clients by analyzing the spectral consistency of class-wise feature subspaces with minimal communication overhead. Second, clean clients provide spectral references that enable noisy clients to relabel potentially corrupted samples using both dominant class directions and residual subspaces. Third, we employ a noise-aware training strategy that integrates logit-adjusted loss, knowledge distillation, and distance-aware aggregation to further stabilize federated optimization. Extensive experiments on standard FL benchmarks demonstrate that FedSIR consistently outperforms state-of-the-art methods for FL with noisy labels. The code is available at https://github.com/sinagh72/FedSIR.
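The abstract's first component compares class-wise feature subspaces across clients by their spectral consistency. The paper's exact procedure is not reproduced here; the following is a minimal sketch of that idea, assuming a simple instantiation: each client extracts a low-rank basis for its per-class features via SVD, and alignment between two clients' subspaces is measured by the cosines of their principal angles. The function names and the rank `k` are illustrative assumptions, not the authors' API.

```python
import numpy as np

def class_subspace(features, k=3):
    """Orthonormal basis (d, k) spanning the dominant directions of a
    (n_samples, d) class-wise feature matrix, via SVD of centered features."""
    _, _, vt = np.linalg.svd(features - features.mean(axis=0),
                             full_matrices=False)
    return vt[:k].T

def subspace_alignment(b1, b2):
    """Mean cosine of principal angles between two orthonormal bases.
    Near 1 for well-aligned subspaces, near 0 for nearly orthogonal ones."""
    s = np.linalg.svd(b1.T @ b2, compute_uv=False)
    return float(np.mean(np.clip(s, 0.0, 1.0)))

# Toy check: two "clean" clients drawing one class's features from the same
# 3-dim subspace align strongly; a mismatched client does not.
rng = np.random.default_rng(0)
d = 16
q = np.linalg.qr(rng.normal(size=(d, d)))[0]      # random orthonormal frame
clean_a = rng.normal(size=(200, 3)) @ q[:, :3].T  # shared class subspace
clean_b = rng.normal(size=(200, 3)) @ q[:, :3].T
noisy   = rng.normal(size=(200, 3)) @ q[:, 3:6].T # disjoint subspace

ref = class_subspace(clean_a)
align_clean = subspace_alignment(class_subspace(clean_b), ref)
align_noisy = subspace_alignment(class_subspace(noisy), ref)
```

In a federated setting, only the small `(d, k)` bases would be exchanged rather than raw features, which is consistent with the abstract's claim of minimal communication overhead; thresholding the alignment score is one plausible way to separate clean from noisy clients.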