AI Summary
This work addresses the communication optimization of contracting a 3-dimensional symmetric tensor with the same vector along two modes, a key computational bottleneck in higher-order power methods for eigenpair computation and in gradient-based methods for symmetric CP decomposition. We first derive a tight parallel communication lower bound for this operation. We then propose an optimal data distribution strategy based on a generalized triangular block partitioning scheme and prove its asymptotic optimality with respect to the derived lower bound. Our approach combines geometric inequality analysis, a blocking model for symmetric tensors, and a precise characterization of communication complexity, significantly reducing data movement in large-scale settings. Experimental results demonstrate that the proposed algorithm achieves communication volumes close to the theoretical optimum across diverse parallel configurations, providing a provably efficient foundation for distributed high-performance computing with symmetric tensors.
Abstract
In this article, we focus on the parallel communication cost of multiplying the same vector along two modes of a $3$-dimensional symmetric tensor. This is a key computation in the higher-order power method for determining eigenpairs of a $3$-dimensional symmetric tensor and in gradient-based methods for computing a symmetric CP decomposition. We establish communication lower bounds that determine how much data movement is required to perform this computation in parallel. The core idea of the proof is an extension of a key geometric inequality to $3$-dimensional symmetric computations. We demonstrate that the communication lower bounds are tight by presenting an optimal algorithm whose data distribution is a natural extension of the triangle block partition scheme for symmetric matrices to $3$-dimensional symmetric tensors.
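For concreteness, the kernel studied here is the contraction $y_i = \sum_{j,k} T_{ijk}\, x_j x_k$, where $T$ is symmetric under any permutation of its indices. The sketch below, a minimal single-node illustration and not the paper's distributed algorithm, shows this operation in NumPy and checks that symmetry makes the choice of contracted modes irrelevant (all names here are illustrative):

```python
import numpy as np

def symtensor_vec2(T, x):
    """Contract a 3-D tensor T with the vector x along its last two
    modes: y[i] = sum_{j,k} T[i, j, k] * x[j] * x[k].
    This is the core kernel of the higher-order power method (HOPM)."""
    return np.einsum('ijk,j,k->i', T, x, x)

# Build a small symmetric tensor by symmetrizing a random one:
# average over all 6 permutations of the three modes.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4, 4))
perms = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
T = sum(np.transpose(A, p) for p in perms) / 6

x = rng.standard_normal(4)
y = symtensor_vec2(T, x)

# Symmetry of T means contracting x along any two modes gives the
# same result; this invariance is what the blocked data distributions
# in the paper exploit to store only unique entries.
assert np.allclose(y, np.einsum('ijk,i,k->j', T, x, x))
assert np.allclose(y, np.einsum('ijk,i,j->k', T, x, x))
```

In the HOPM iteration this contraction is applied repeatedly, with the output normalized to produce the next iterate, so its communication cost dominates the overall method at scale.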