A Constrained RL Approach for Cost-Efficient Delivery of Latency-Sensitive Applications
2026-03-04 • Networking and Internet Architecture
Networking and Internet Architecture, Machine Learning
AI summary
The authors address the challenge of delivering data packets on time in networks where delay matters, such as those carrying video calls or gaming traffic. They point out that previous methods typically control only average delay, not a strict deadline for each packet. To close this gap, they apply constrained deep reinforcement learning to find the cheapest way to send packets while still meeting per-packet deadlines at a target reliability level. In their results, the method delivers packets on time in cases where existing baselines fall short, and at lower cost than throughput-maximizing alternatives.
real-time networks, packet delivery, delay constraints, resource allocation, Markov decision process, deep reinforcement learning, network control, timely throughput, stochastic optimization
Authors
Ozan Aygün, Vincenzo Norman Vitale, Antonia M. Tulino, Hao Feng, Elza Erkip, Jaime Llorca
Abstract
Next-generation networks aim to provide performance guarantees to real-time interactive services that require timely and cost-efficient packet delivery. In this context, the goal is to reliably deliver packets with strict deadlines imposed by the application while minimizing overall resource allocation cost. A large body of work has leveraged stochastic optimization techniques to design efficient dynamic routing and scheduling solutions under average delay constraints; however, these methods fall short when faced with strict per-packet delay requirements. We formulate the minimum-cost delay-constrained network control problem as a constrained Markov decision process and utilize constrained deep reinforcement learning (CDRL) techniques to effectively minimize total resource allocation cost while maintaining timely throughput above a target reliability level. Results indicate that the proposed CDRL-based solution can ensure timely packet delivery even when existing baselines fall short, and it achieves lower cost compared to other throughput-maximizing methods.
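The abstract does not spell out how the constraint is enforced during training. One common way to handle a constrained Markov decision process is Lagrangian relaxation, and the sketch below illustrates that generic technique only; it is not taken from the paper. All names (`LagrangianState`, `shaped_reward`, `dual_update`) and the step size are hypothetical, and the per-step cost and on-time fraction are assumed to be supplied by some network simulator.

```python
from dataclasses import dataclass


@dataclass
class LagrangianState:
    """Dual variable for the timely-throughput constraint (hypothetical helper)."""
    multiplier: float = 0.0   # Lagrange multiplier, kept non-negative
    step_size: float = 0.05   # dual learning rate (assumed value)


def shaped_reward(cost: float, on_time_fraction: float,
                  target: float, multiplier: float) -> float:
    """Per-step Lagrangian reward for the learning agent.

    The agent pays the resource allocation cost and is credited (or
    penalized) for timely throughput above (or below) the target,
    weighted by the current multiplier.
    """
    return -cost + multiplier * (on_time_fraction - target)


def dual_update(state: LagrangianState, measured_throughput: float,
                target: float) -> float:
    """Projected dual ascent on the multiplier.

    The multiplier grows while measured timely throughput is below the
    target and shrinks (never below zero) when there is slack.
    """
    violation = target - measured_throughput
    state.multiplier = max(0.0, state.multiplier + state.step_size * violation)
    return state.multiplier


# Usage: throughput of 0.80 against a 0.95 target raises the multiplier,
# so later rewards weigh on-time delivery more heavily relative to cost.
state = LagrangianState()
dual_update(state, measured_throughput=0.80, target=0.95)
```

In such a scheme the policy would be trained on `shaped_reward` with any standard deep RL algorithm, while `dual_update` runs on a slower timescale using throughput measured over recent episodes.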