🤖 AI Summary
This work addresses the joint minimization of communication and computation costs in distributed computing, where a master node coordinates \(N\) workers to evaluate a function over a library of \(n\) files, decomposed into subfunctions that each depend on \(d\) input files. The problem is modeled as a \(d\)-uniform hypergraph edge partitioning task, and a deterministic Interweaved-Cliques (IC) assignment scheme is proposed. The scheme achieves order-optimal communication cost (maximum number of distinct files per worker) and computation cost (maximum number of subfunctions per worker) without prior knowledge of the subfunction set. Built on an information-theoretic-inspired interweaved clique structure, it applies to a large class of decompositions satisfying reasonable density assumptions, requires no file reshuffling when multiple functions are computed, and attains the order-optimal communication cost \(\Theta(n/N^{1/d})\), corresponding to a partitioning gain of \(N^{1/d}\), together with an order-optimal computation cost across a broad range of parameters.
📝 Abstract
We study the joint minimization of communication and computation costs in distributed computing, where a master node coordinates $N$ workers to evaluate a function over a library of $n$ files. Assuming that the function is decomposed into an arbitrary subfunction set $\mathbf{X}$, with each subfunction depending on $d$ input files, we cast our distributed computing problem as a $d$-uniform hypergraph edge partitioning problem, wherein the edge set (subfunction set), defined by $d$-wise dependencies between vertices (files), must be partitioned across $N$ disjoint groups (workers). The aim is to design a file and subfunction allocation, corresponding to a partition of $\mathbf{X}$, that minimizes the communication cost $\pi_{\mathbf{X}}$, representing the maximum number of distinct files per worker, while also minimizing the computation cost $\delta_{\mathbf{X}}$, corresponding to the maximum subfunction load of any worker. For a broad range of parameters, we propose a deterministic allocation solution, the \emph{Interweaved-Cliques (IC) design}, whose information-theoretic-inspired interweaved clique structure simultaneously achieves order-optimal communication and computation costs for a large class of decompositions $\mathbf{X}$. This optimality is derived from our achievability and converse bounds, which reveal, under reasonable assumptions on the density of $\mathbf{X}$, that the optimal scaling of the communication cost takes the form $n/N^{1/d}$. Our design thus achieves the order-optimal \textit{partitioning gain}, which scales as $N^{1/d}$, while also achieving an order-optimal computation cost. This order optimality is achieved deterministically and, importantly, without knowledge of $\mathbf{X}$, thereby enabling multiple desired functions to be computed without reshuffling files.
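To make the two cost measures concrete, the following Python sketch evaluates $\pi_{\mathbf{X}}$ (maximum number of distinct files per worker) and $\delta_{\mathbf{X}}$ (maximum number of subfunctions per worker) for a given partition of the edge set. It is only an illustration for $d = 2$: the helper names (`costs`, `grouped_partition`, `round_robin_partition`), the toy sizes, and the simple group-pair assignment are assumptions made here, and the group-pair rule is a stand-in that is not the IC design from the paper.

```python
from itertools import combinations

def costs(partition):
    """Return (pi, delta) for a partition given as one list of edges per worker.

    pi    = maximum number of distinct files any worker needs
    delta = maximum number of subfunctions (edges) any worker computes
    """
    pi = max(len({f for edge in part for f in edge}) for part in partition)
    delta = max(len(part) for part in partition)
    return pi, delta

def grouped_partition(edges, n, num_groups):
    """Hypothetical d = 2 assignment: split files into equal groups and give
    each unordered pair of groups (a, b), a <= b, its own worker."""
    size = n // num_groups  # assumes num_groups divides n
    workers = {(a, b): [] for a in range(num_groups) for b in range(a, num_groups)}
    for u, v in edges:
        a, b = sorted((u // size, v // size))
        workers[(a, b)].append((u, v))
    return list(workers.values())

def round_robin_partition(edges, num_workers):
    """Baseline that ignores file structure: deal edges out in turn."""
    parts = [[] for _ in range(num_workers)]
    for i, edge in enumerate(edges):
        parts[i % num_workers].append(edge)
    return parts

n = 12                                   # toy library size (assumption)
edges = list(combinations(range(n), 2))  # dense X: every pair of files
grouped = grouped_partition(edges, n, num_groups=3)  # 3 groups -> 6 workers
baseline = round_robin_partition(edges, num_workers=len(grouped))

pi_grouped, delta_grouped = costs(grouped)
pi_baseline, delta_baseline = costs(baseline)
print("grouped  :", pi_grouped, delta_grouped)
print("baseline :", pi_baseline, delta_baseline)
```

In this toy instance each worker in the group-pair assignment touches at most two groups of files, so its file count is bounded by $2n/L$ for $L$ groups, whereas the structure-agnostic baseline spreads each worker's edges over nearly the whole library. The sketch does not balance the computation load as carefully as a purpose-built design would; it is meant only to show how the two costs are measured and why a clique-like grouping of files reduces communication.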