Rectangular Matrix Multiplication in the Low-Bandwidth Model
2026-06-03 • Distributed, Parallel, and Cluster Computing
Distributed, Parallel, and Cluster Computing
AI summaryⓘ
The authors study how to multiply rectangular matrices using many computers that can only send tiny messages each round. They explore how the time (round complexity) depends on the sizes of the matrices, focusing on two types of rectangular shapes. For one shape, they find a formula that generalizes known results for square matrices. For the other shape, they discover a different behavior with a clear cutoff point where the complexity changes. They provide both upper and lower bounds to show their results are tight in some cases.
rectangular matrix multiplicationdistributed computinglow-bandwidth modelround complexitysemiringsupper and lower boundscommunication roundsmatrix aspect ratiophase transition
Authors
Chetan Gupta, Jukka Suomela, Hossein Vahidi
Abstract
We study rectangular matrix multiplication in the low-bandwidth model of distributed computing. There are $n$ computers; initially the input matrices are distributed evenly between computers, and in each communication round every computer can send and receive an $O(\log n)$-bit message. Eventually each computer must output its designated part of the product matrix. While prior work has focused primarily on square $n \times n$ multiplication under various sparsity assumptions, we study rectangular instances with no sparsity assumption. We denote by $\langle a,b,c\rangle$ the task of multiplying an $a\times b$ matrix by a $b\times c$ matrix in this model. We concentrate on two natural aspect ratios, $\langle n,d,n\rangle$ and $\langle d,n,d\rangle$, for $d \le n$, and we study how the round complexity depends on $n$ and $d$. When $d \to n$, both $\langle n,d,n\rangle$ and $\langle d,n,d\rangle$ approach $\langle n,n,n\rangle$, which is the usual task of multiplying square matrices. If we consider multiplication over semirings, the current best upper bound in that case is $O(n^{4/3})$ rounds, and there is a trivial unconditional lower bound of $Ω(n)$. We show that for $\langle d,n,d\rangle$, we can achieve the complexity of $\tilde O(d^{4/3})$, which seems like a natural generalization of the upper bound $\tilde O(n^{4/3})$ when $d=n$. However, the case of $\langle n,d,n\rangle$ is fundamentally different, and also exhibits a phase transition. We show that for $d \le \sqrt{n}$, the complexity of $\langle n,d,n\rangle$ is $Θ(d \sqrt{n})$; we have matching upper and lower bounds. However, the behavior is genuinely different in the region $d \ge \sqrt{n}$, where the upper bound is $O(d^{2/3} n^{2/3})$.