Expanding into Reality: Random Graphs for Datacenter Networks

2026-04-16 · Networking and Internet Architecture

AI summary

The authors designed and deployed a new kind of datacenter network at Amazon based on random graphs, a class of topologies long known to be cost-effective and fault-tolerant but previously hard to build in practice because scalable routing and cabling approaches were lacking. They developed a distributed routing protocol that exploits the properties of random graphs to find many edge-disjoint paths between endpoint pairs. They also introduced a passive optical device that internally shuffles cable endpoints, making cabling complexity similar to that of fat trees. Their design, called RNG, matches or exceeds the performance of fat trees across a range of traffic patterns while being up to 45% cheaper, and Amazon has made RNG the default datacenter fabric for most workloads.

random graphs, datacenter networks, routing protocol, fault tolerance, edge disjoint paths, passive optical device, cabling complexity, fat trees, scalable routing
Authors
Giacomo Bernardi, Ratul Mahajan, C. Seshadhri, Enrico Carlesso, Chinchu Merine Joseph, Saurabh Kumar, Pavan Manikonda, Luiza Popa, Randy Ram, Steven Robinson, Elizabeth Tennent
Abstract
We design and deploy at Amazon the first production datacenter fabrics based on random graphs. While the cost and fault-tolerance benefits of such topologies have been long known, their practical realization has been hampered by a lack of scalable routing and cabling approaches. Our design, called RNG, has a new distributed routing protocol that exploits the properties of random graphs to find a large number of edge disjoint paths between endpoint pairs. A novel passive optical device that internally shuffles cable endpoints makes Amazon's cabling complexity similar to that of fat trees. We show that RNG fabrics match or exceed the performance of fat trees for a range of traffic patterns, despite being up to 45% cheaper. At Amazon, we made RNG the default datacenter fabric for most workloads.
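
The path-diversity property mentioned in the abstract can be illustrated with a minimal sketch. This is not the RNG routing protocol, which is distributed and not specified here. It is a centralized check that samples a small random regular graph and counts edge-disjoint paths between two switches using unit-capacity max-flow. The function names `random_regular_graph` and `edge_disjoint_paths`, and the topology size (20 switches with 4 inter-switch links each), are illustrative assumptions, not values from the paper.

```python
import random
from collections import deque


def random_regular_graph(n, d, seed=0):
    """Sample a simple d-regular graph on n nodes (pairing model with retries).

    Hypothetical helper for illustration; returns a set of (u, v) edges with u < v.
    """
    if (n * d) % 2 or d >= n:
        raise ValueError("need n*d even and d < n")
    rng = random.Random(seed)
    while True:
        # Each node contributes d "stubs"; pair stubs up at random.
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        edges = set()
        simple = True
        for i in range(0, len(stubs), 2):
            u, v = stubs[i], stubs[i + 1]
            edge = (min(u, v), max(u, v))
            if u == v or edge in edges:  # self-loop or parallel edge: resample
                simple = False
                break
            edges.add(edge)
        if simple:
            return edges


def edge_disjoint_paths(edges, s, t):
    """Count edge-disjoint s-t paths via BFS augmenting paths (unit capacities)."""
    cap = {}  # residual capacity per directed arc
    adj = {}
    for u, v in edges:
        cap[(u, v)] = 1
        cap[(v, u)] = 1
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Push one unit of flow back along the path found.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


if __name__ == "__main__":
    fabric = random_regular_graph(n=20, d=4, seed=1)  # assumed toy topology
    print("edge-disjoint paths between switch 0 and switch 7:",
          edge_disjoint_paths(fabric, 0, 7))
```

The count can never exceed the node degree `d`, since every path must leave the source over a distinct link. A routing protocol that exploits path diversity aims to use as many of these paths as possible.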