Link Wars: The Semantic Crisis. Is the debate over or is it just beginning?
2026-03-08 • Distributed, Parallel, and Cluster Computing
AI summary
The author argues that networking has repeatedly fragmented because vendors add incompatible, vendor-specific features and present them as optimizations. The paper traces this to what it calls the Forward-In-Time-Only (FITO) category mistake and to a shared failure to define explicit, testable rules for how operations are ordered, when they complete, and how failures become visible. The same gap shows up across technologies such as RDMA and GPU interconnects. The paper describes Open Atomic Ethernet (OAE), which defines transaction primitives with explicit ordering, completion, and failure visibility, and asks whether a single open standard can still unify this fragmented landscape.
RDMA, network fabric, concurrency, serialization, ordering semantics, Open Atomic Ethernet, universal fencing, transaction primitives, multi-cloud ordering, scalable OLTP isolation
Authors
Paul Borrill
Abstract
For fifty years, networking has fragmented whenever new workloads exposed hidden assumptions about time, ordering, failure, and trust. This paper argues that the current interconnect landscape -- NVLink, UALink, Ultra Ethernet, AELink/Aethernet, TTPoE, and classical RDMA -- suffers from a semantic crisis: vendor-specific divergence disguised as optimization. We trace this crisis to the Forward-In-Time-Only (FITO) category mistake embedded in every major fabric stack, and show how each pathology -- aspirational RDMA completion, fire-and-forget GPU semantics, opaque proprietary stacks, incompatible multi-cloud ordering, universal fencing -- arises from the same failure to define explicit, testable link semantics from APIs to bits on the wire. We conjecture that RDMA achieves reliability through universal fencing that collapses concurrency into serialized checkpoints, and that precise minimal semantics can maintain correctness without global barriers, much as superscalar architectures separated execution from retirement. We describe how Open Atomic Ethernet (OAE) under the Open Compute Project addresses the crisis through bilateral transaction primitives with explicit ordering, completion, and failure visibility. Drawing on Helland's analysis of scalable OLTP isolation (the "BIG DEAL"), we show the crisis pervades the entire stack. We assess whether convergence on a single open standard is still possible or whether fragmentation is now structural.
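The superscalar analogy in the abstract can be made concrete with a toy model. The sketch below is not taken from the paper and does not represent RDMA or OAE wire behavior; it only contrasts two visibility rules under stated assumptions. The names `Op`, `fenced_ticks`, and `retire_in_order` are hypothetical, latencies are arbitrary ticks, and the second model assumes every operation is issued at tick 0 with unlimited parallelism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    seq: int      # position in issue order
    latency: int  # ticks the operation needs to complete (arbitrary units)


def fenced_ticks(ops):
    """Fence after every operation: each op must finish before the next is issued.

    Concurrency collapses into a serial chain, so total time is the sum of latencies.
    """
    return sum(op.latency for op in ops)


def retire_in_order(ops):
    """Execute concurrently, retire in issue order (toy reorder-buffer model).

    Assumption: all ops are issued at tick 0 and run in parallel. An op may
    complete early, but its result becomes visible only once every earlier op
    has retired. Returns a list of (seq, tick_at_which_result_is_visible).
    """
    retired = []
    visible_at = 0
    for op in sorted(ops, key=lambda o: o.seq):
        # Retirement tick is the later of this op's own completion time
        # and the retirement tick of its predecessor.
        visible_at = max(visible_at, op.latency)
        retired.append((op.seq, visible_at))
    return retired


if __name__ == "__main__":
    ops = [Op(0, 5), Op(1, 2), Op(2, 7), Op(3, 1)]
    print("fenced total ticks:", fenced_ticks(ops))
    print("in-order retirement:", retire_in_order(ops))
```

In both models an observer sees results in issue order. In this toy setting the fenced version takes the sum of the latencies, while in-order retirement is bounded by the slowest operation. That is the separation of execution from retirement the abstract points to; whether real fabrics can achieve it is the paper's conjecture, not something this sketch demonstrates.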