IntraShuffler: A Privacy Preserving Framework for Heterogeneous DP Federated Learning

2026-06-01 · Machine Learning

Machine Learning; Cryptography and Security; Distributed, Parallel, and Cluster Computing
AI summary

The authors study federated learning in which clients choose different privacy levels, and show that combining updates according to those declared levels can leak information to the server. They demonstrate that an honest-but-curious server can infer private details about clients by denoising and analyzing their updates. Because existing shuffling methods that hide where updates come from do not work when privacy levels differ, the authors propose IntraShuffler, which groups clients with similar privacy levels and shuffles their updates within each group, so that privacy-aware combination of updates still works. Their experiments show that IntraShuffler substantially reduces privacy risks while keeping the quality of the learned model comparable.

Federated Learning; Differential Privacy; Privacy Budget; Non-IID Data; Gradient Aggregation; Privacy Inference Attack; Shuffle Model; Client Anonymization; Gradient Denoising; Surrogate Modeling
Authors
Farhin Farhad Riya, Olivera Kotevska, Jinyuan Stella Sun
Abstract
Heterogeneous Differential Privacy (HDP) in Federated Learning (FL) allows clients to select individual privacy budgets ($\varepsilon_i$) according to institutional policies and data sensitivity. In practice, many HDP-FL systems employ $\varepsilon$-aware server aggregation to improve model utility by re-weighting client updates according to their declared privacy budgets. However, gradient updates in FL retain structural patterns induced by non-independent and identically distributed (non-IID) data, and the additional signals exposed by $\varepsilon$-aware aggregation create new opportunities for inference by an honest-but-curious server. In this work, we first show that a server equipped with gradient denoising and surrogate modeling can mount a \emph{Privacy Inference Attack} that infers distributional attributes of clients and links updates from the same client across training rounds. We measure the attack via surrogate inference accuracy and linkage success, under realistic knowledge constraints. The shuffle model has been widely studied as a defense against such inference risks because it anonymizes update sources, but it is fundamentally incompatible with $\varepsilon$-aware aggregation in HDP-FL. To address this challenge, we propose \textbf{IntraShuffler}, a middleware defense framework designed for HDP-FL systems. IntraShuffler introduces a privacy-aware shuffling mechanism that groups clients into privacy-compatible buckets and performs parameter-level shuffling within each bucket to disrupt persistent gradient structure while preserving $\varepsilon$-aware aggregation. Experiments across four datasets show that IntraShuffler reduces gradient recoverability by over 60% and decreases surrogate inference accuracy from 0.78 to 0.33 while maintaining comparable model utility across multiple FL aggregation rules.
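
To make the bucket-and-shuffle idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that buckets are formed by thresholding $\varepsilon_i$ at fixed edges and that the aggregation weight is constant within a bucket; the abstract specifies neither. The function names (`bucket_by_epsilon`, `intra_bucket_shuffle`, `bucket_weighted_mean`), the edges, and the weights are all hypothetical. Under those assumptions, independently permuting each parameter coordinate across the clients of a bucket leaves the weighted aggregate unchanged, while no single shuffled vector still corresponds to one client's original update.

```python
import random
from collections import defaultdict


def bucket_by_epsilon(epsilons, edges):
    """Group client indices by privacy budget (hypothetical rule).

    A client's bucket id is the number of edges its epsilon meets or exceeds.
    """
    buckets = defaultdict(list)
    for client, eps in enumerate(epsilons):
        bucket_id = sum(eps >= edge for edge in edges)
        buckets[bucket_id].append(client)
    return dict(buckets)


def intra_bucket_shuffle(updates, buckets, rng):
    """Permute every coordinate independently across the clients of each bucket."""
    shuffled = [list(u) for u in updates]
    dim = len(updates[0])
    for members in buckets.values():
        for j in range(dim):
            column = [updates[c][j] for c in members]
            rng.shuffle(column)  # in-place permutation of this coordinate
            for c, value in zip(members, column):
                shuffled[c][j] = value
    return shuffled


def bucket_weighted_mean(updates, buckets, bucket_weight):
    """Weighted mean with one weight per bucket (stand-in for epsilon-aware weighting)."""
    dim = len(updates[0])
    total = sum(bucket_weight[b] * len(m) for b, m in buckets.items())
    agg = [0.0] * dim
    for b, members in buckets.items():
        for c in members:
            for j in range(dim):
                agg[j] += bucket_weight[b] * updates[c][j]
    return [a / total for a in agg]


# Illustrative values only: six clients, three parameters, two buckets.
epsilons = [0.5, 0.6, 0.7, 4.0, 5.0, 6.0]
updates = [[float(10 * c + j) for j in range(3)] for c in range(6)]
buckets = bucket_by_epsilon(epsilons, edges=[1.0])
weights = {0: 1.0, 1: 2.0}  # assumed: larger budget, larger weight

shuffled = intra_bucket_shuffle(updates, buckets, random.Random(0))
before = bucket_weighted_mean(updates, buckets, weights)
after = bucket_weighted_mean(shuffled, buckets, weights)
```

Here `before` and `after` agree because shuffling only reorders values that share the same weight. If weights varied per client inside a bucket, this invariance would no longer hold, which is one way to read the abstract's emphasis on privacy-compatible buckets.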