🤖 AI Summary
Existing multigrid libraries are neither fast enough nor scalable enough for the large-scale linear systems that arise from structured grids. Method: This paper proposes StructMG, an efficient parallel algebraic multigrid (AMG) preconditioner designed according to three principles: low computational overhead, strong convergence, and high parallelizability. It introduces a stencil-based, symbolically derived triple-matrix product for Galerkin coarsening, and develops a dependence-preserving unified parallel framework for sparse triangular solves, evaluated on ARM and x86 architectures. Contribution/Results: Experiments on idealized and real-world applications, including radiation hydrodynamics, show that the proposed preconditioner achieves the fastest time-to-solution in all cases, with average speedups ranging from 5.5× to 15.5× over hypre's SMG, PFMG, SysPFMG, and BoomerAMG preconditioners. It also significantly improves both strong and weak scaling efficiency on large structured-grid problems.
📝 Abstract
Parallel multigrid is widely used as a preconditioner for solving large-scale sparse linear systems. However, current multigrid libraries still fall short of satisfactory speed and scalability on structured-grid problems. Based on the classical 'multigrid seesaw', we derive three necessary principles for an efficient structured multigrid. These principles guide the design and implementation of StructMG, a fast and scalable algebraic multigrid that constructs hierarchical grids automatically. As a preconditioner, StructMG achieves both low cost per iteration and good convergence when solving large-scale linear systems with iterative methods in parallel. A stencil-based triple-matrix product via symbolic derivation and code generation is proposed for multi-dimensional Galerkin coarsening to reduce grid complexity, operator complexity, and implementation effort. A unified parallel framework for sparse triangular solvers is presented to achieve fast convergence and high parallel efficiency for smoothers, including dependence-preserving Gauss-Seidel and incomplete LU methods. Idealized and real-world problems from radiation hydrodynamics, petroleum reservoir simulation, numerical weather prediction, and solid mechanics are evaluated on ARM and x86 platforms to show StructMG's effectiveness. In comparison to hypre's structured and general multigrid preconditioners, StructMG achieves the fastest time-to-solution in all cases, with average speedups of 15.5x, 5.5x, 6.7x, and 7.3x over SMG, PFMG, SysPFMG, and BoomerAMG, respectively. StructMG also significantly improves strong and weak scaling efficiencies.
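To make the idea of a stencil-based Galerkin triple product concrete, the following minimal Python sketch computes the coarse-grid stencil of R·A·P directly from stencil coefficients, without assembling any matrix, and then checks it against a dense triple product. It is an illustration only, not StructMG's implementation: it assumes a 1D constant-coefficient problem with the 3-point stencil [-1, 2, -1], linear interpolation, restriction R = Pᵀ, and a coarsening factor of 2, and the function names `galerkin_coarse_stencil` and `dense_galerkin` are hypothetical. The approach in the abstract is multi-dimensional and relies on symbolic derivation and code generation, which this sketch does not attempt.

```python
def galerkin_coarse_stencil(r, s, p):
    """Coarse stencil of R*A*P from 1D constant stencils (coarsening factor 2).

    r, s, p map fine-grid offsets to coefficients of the restriction,
    the fine operator, and the interpolation. A term r[a]*s[c]*p[b]
    contributes to coarse offset d exactly when a + c - b == 2*d.
    """
    coarse = {}
    for a, ra in r.items():
        for c, sc in s.items():
            for b, pb in p.items():
                shift = a + c - b
                if shift % 2 == 0:  # odd shifts connect no pair of coarse points
                    d = shift // 2
                    coarse[d] = coarse.get(d, 0.0) + ra * sc * pb
    return {d: v for d, v in sorted(coarse.items()) if v != 0.0}


def dense_galerkin(m):
    """Reference result: assemble A and P as dense lists and form P^T * A * P."""
    n = 2 * m + 1  # fine points; coarse point j sits at fine index 2*j + 1
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    P = [[0.0] * m for _ in range(n)]
    for j in range(m):
        f = 2 * j + 1
        P[f - 1][j], P[f][j], P[f + 1][j] = 0.5, 1.0, 0.5
    AP = [[sum(A[i][l] * P[l][j] for l in range(n)) for j in range(m)]
          for i in range(n)]
    return [[sum(P[i][k] * AP[i][j] for i in range(n)) for j in range(m)]
            for k in range(m)]


interp = {-1: 0.5, 0: 1.0, 1: 0.5}    # linear interpolation weights (assumed)
fine = {-1: -1.0, 0: 2.0, 1: -1.0}    # 1D Poisson stencil, unscaled (assumed)
coarse = galerkin_coarse_stencil(interp, fine, interp)
middle_row = dense_galerkin(5)[2]
```

In this example the coarse operator keeps a 3-point stencil, so the stencil width does not grow from one level to the next. Working at the stencil level also replaces a general sparse matrix-matrix product with a fixed, small set of multiply-adds per coarse point.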