🤖 AI Summary
This work addresses tolerance to adversarial servers in distributed coded computing for unstructured tasks. It introduces the first fault-tolerant coding framework for general-purpose computation, combining coding theory, robust statistics, and distributed optimization into an encoding–decoding scheme that holds under weak assumptions on the computing task, and thereby lays a theoretical foundation for general coded computing. Theoretical contributions include: (i) optimal tolerance to $\mathcal{O}(N^a)$ adversarial servers out of $N$, with $a \in [0,1)$, and (ii) a near-optimal approximation error decay rate of $N^{\frac{6}{5}(a-1)}$, the best known to date. Experiments demonstrate that the framework significantly improves both accuracy and fault tolerance in unstructured tasks such as deep neural network inference, overcoming the limitation of conventional coded computing, which is tailored mainly to structured operations like matrix multiplication.
📝 Abstract
Conventional coded computing frameworks are predominantly tailored for structured computations, such as matrix multiplication and polynomial evaluation. Such tasks allow the reuse of tools and techniques from algebraic coding theory to improve the reliability of distributed systems in the presence of stragglers and adversarial servers. This paper lays the foundation for general coded computing, which extends the applicability of coded computing to a wide class of computations. In particular, it addresses the challenging problem of managing adversarial servers. We demonstrate that, in the proposed scheme, for a system with $N$ servers, of which $\mathcal{O}(N^a)$, $a \in [0,1)$, are adversarial, the supremum of the average approximation error over all adversarial strategies decays at a rate of $N^{\frac{6}{5}(a-1)}$, under minimal assumptions on the computing tasks. Furthermore, we show that, within a general framework, the proposed scheme achieves optimal adversarial robustness in terms of the maximum number of adversarial servers it can tolerate. This marks a significant step toward practical and reliable general coded computing. Implementation results further validate the effectiveness of the proposed method in handling various computations, including inference in deep neural networks.
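To get a feel for the stated rate $N^{\frac{6}{5}(a-1)}$, the short Python sketch below evaluates only its scaling in $N$. It is an illustration, not part of the paper's scheme: the function names `decay_exponent` and `rate_ratio` are hypothetical, and the constant factors hidden by the rate are ignored, so the values are relative factors rather than actual error bounds.

```python
def decay_exponent(a: float) -> float:
    """Exponent of N in the stated rate N^{(6/5)(a-1)}, for a in [0, 1)."""
    if not 0.0 <= a < 1.0:
        raise ValueError("a must lie in [0, 1)")
    return 1.2 * (a - 1.0)


def rate_ratio(n_small: int, n_large: int, a: float) -> float:
    """Factor by which the rate changes when the number of servers grows
    from n_small to n_large (hidden constants are assumed to cancel)."""
    if n_small <= 0 or n_large <= 0:
        raise ValueError("server counts must be positive")
    return (n_large / n_small) ** decay_exponent(a)


# Illustrative values of a only; none of these come from the paper.
for a in (0.0, 0.5, 0.9):
    print(f"a={a}: exponent={decay_exponent(a):.2f}, "
          f"100 -> 1000 servers scales the rate by {rate_ratio(100, 1000, a):.4f}")
```

For example, with $a = 0.5$ the exponent is $-0.6$, so a tenfold increase in $N$ scales the rate by $10^{-0.6}$, roughly a factor of 0.25. As $a$ approaches 1 the exponent approaches 0 and the decay becomes arbitrarily slow, which matches the requirement $a \in [0,1)$.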