Split Federated Learning Architectures for High-Accuracy and Low-Delay Model Training
2026-03-09 • Machine Learning
Machine Learning • Artificial Intelligence
AI summary
The authors studied how to split machine learning models in a three-tier system called Hierarchical Split Federated Learning (HSFL) to improve training accuracy and speed. They found that where the model is divided and how clients are assigned to the middle tier affect accuracy, training delay, and communication load. They formulated a mathematical optimization problem to find the best splitting strategy but showed that solving it exactly is NP-hard. To tackle this, they designed a heuristic method that improves accuracy by 3%, makes training 20% faster, and cuts communication by half compared to earlier approaches.
Split Federated Learning • Hierarchical SFL • Model Partitioning • Training Loss • Client-Aggregator Assignment • Optimization Problem • Heuristic Algorithm • Training Delay • Communication Overhead • Machine Learning Accuracy
Authors
Yiannis Papageorgiou, Yannis Thomas, Ramin Khalili, Iordanis Koutsopoulos
Abstract
Can we find a network architecture for ML model training so as to optimize training loss (and thus, accuracy) in Split Federated Learning (SFL)? And can this architecture also reduce training delay and communication overhead? While accuracy is not influenced by how we split the model in ordinary, state-of-the-art SFL, in this work we answer the questions above in the affirmative. Recent Hierarchical SFL (HSFL) architectures adopt a three-tier training structure consisting of clients, (local) aggregators, and a central server. In this architecture, the model is partitioned at two partitioning layers into three sub-models, which are executed across the three tiers. Despite their merits, HSFL architectures overlook the impact of the partitioning layers and client-to-aggregator assignments on accuracy, delay, and overhead. This work explicitly captures the impact of the partitioning layers and client-to-aggregator assignments on accuracy, delay and overhead by formulating a joint optimization problem. We prove that the problem is NP-hard and propose the first accuracy-aware heuristic algorithm that explicitly accounts for model accuracy, while remaining delay-efficient. Simulation results on public datasets show that our approach can improve accuracy by 3%, while reducing delay by 20% and overhead by 50%, compared to state-of-the-art SFL and HSFL schemes.
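The abstract's core architectural idea, partitioning a model at two layers into three sub-models executed by clients, local aggregators, and the central server, can be sketched in a few lines. The following is a minimal illustrative example, not the paper's implementation; the layer names and the `split_model` helper are hypothetical.

```python
# Hypothetical sketch of three-tier HSFL model partitioning:
# two cut points divide an ordered stack of layers into a client
# sub-model, an aggregator sub-model, and a server sub-model.

def split_model(layers, cut1, cut2):
    """Partition an ordered list of layers at two partitioning layers.

    Layers [0, cut1) run on the client, [cut1, cut2) on the local
    aggregator, and [cut2, end) on the central server. The paper's
    contribution is choosing cut1, cut2, and the client-to-aggregator
    assignment jointly to optimize accuracy, delay, and overhead.
    """
    assert 0 < cut1 < cut2 < len(layers), "cut layers must be strictly ordered"
    return layers[:cut1], layers[cut1:cut2], layers[cut2:]

layers = ["conv1", "conv2", "conv3", "fc1", "fc2"]
client, aggregator, server = split_model(layers, cut1=2, cut2=4)
print(client)      # ['conv1', 'conv2']
print(aggregator)  # ['conv3', 'fc1']
print(server)      # ['fc2']
```

Moving either cut changes what each tier computes and transmits, which is why, unlike ordinary SFL, the choice of partitioning layers in HSFL affects accuracy as well as delay and communication overhead.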