🤖 AI Summary
Simulating high-speed compressible flow around multi-engine rockets on ultra-large meshes (more than 10¹⁴ grid points) poses severe challenges for stable shock capturing and for computational scalability.
Method: This work applies the recently proposed information geometric regularization, which handles shocks at the level of the governing equations and removes the need for numerical shock capturing. Unified addressing on tightly coupled CPU–GPU nodes, together with an optimized implementation of the memory-bound linear stencil algorithms, lets the MFC solver run efficiently on heterogeneous exascale platforms (AMD MI250X, NVIDIA GH200).
Contribution/Results: We run exascale full-vehicle simulations with more than 10¹⁴ (100 trillion) grid points on the Frontier and Alps supercomputers, exceeding the largest publicly available simulations by an order of magnitude. Weak scaling is ideal on the full systems (37.8K AMD MI250X GPUs on Frontier, 9.2K NVIDIA GH200 superchips on Alps), and wall-clock times are four times faster than optimized baseline numerics. The results demonstrate that high-fidelity CFD based on information geometric regularization is feasible and scalable at this problem size.
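To make the weak-scaling claim concrete, here is a minimal sketch of how weak-scaling efficiency is usually computed: the work per device is held fixed, and efficiency is the wall-clock time at the smallest device count divided by the time at each larger count. The device counts and timings below are hypothetical placeholders, not measurements from this work; a value of 1.0 corresponds to ideal weak scaling.

```python
def weak_scaling_efficiency(base_time, times):
    """Return base_time / t for each measured time t (1.0 means ideal)."""
    return [base_time / t for t in times]


# Hypothetical seconds per time step with a fixed problem size per device.
# These numbers are illustrative assumptions, not data from the paper.
device_counts = [8, 64, 512, 4096]
step_times = [1.00, 1.01, 1.02, 1.04]

efficiencies = weak_scaling_efficiency(step_times[0], step_times)

for n, e in zip(device_counts, efficiencies):
    print(f"{n:>5} devices: efficiency {e:.3f}")
```

In a weak-scaling study the total grid grows in proportion to the device count, so a flat time per step across rows is the ideal outcome.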
📝 Abstract
This work proposes a method and optimized implementation for exascale simulations of high-speed compressible fluid flows, enabling the simulation of multi-engine rocket craft at an unprecedented scale. We significantly improve upon the state-of-the-art in terms of computational cost and memory footprint through a carefully crafted implementation of the recently proposed information geometric regularization, which eliminates the need for numerical shock capturing. Unified addressing on tightly coupled CPU–GPU platforms increases the total problem size with negligible performance hit. Despite linear stencil algorithms being memory-bound, we achieve wall clock times that are four times faster than optimized baseline numerics. This enables the execution of CFD simulations at more than 100 trillion grid points, surpassing the largest state-of-the-art publicly available simulations by an order of magnitude. Ideal weak scaling is demonstrated on OLCF Frontier and CSCS Alps using the full system, entailing 37.8K AMD MI250X GPUs (Frontier) or 9.2K NVIDIA GH200 superchips (Alps).
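As a rough sense of scale, the figures quoted above can be turned into a per-device load. The sketch below assumes, purely for illustration, that a problem of exactly 100 trillion grid points (the stated lower bound) is spread evenly over every device of each full system; the abstract does not say which machine ran the largest case or how the grid was partitioned, so the resulting numbers are order-of-magnitude estimates only.

```python
# Lower bound quoted in the abstract: "more than 100 trillion grid points".
TOTAL_POINTS = 100e12

# Full-system device counts quoted in the abstract.
DEVICES = {
    "Frontier (AMD MI250X GPUs)": 37_800,
    "Alps (NVIDIA GH200 superchips)": 9_200,
}


def points_per_device(total_points, n_devices):
    """Grid points per device under an assumed perfectly even split."""
    return total_points / n_devices


for name, n in DEVICES.items():
    p = points_per_device(TOTAL_POINTS, n)
    side = p ** (1.0 / 3.0)  # edge length of an equivalent cubic block
    print(f"{name}: {p:.2e} points per device, about {side:.0f} per side")
```

Under these assumptions the load is on the order of 10⁹ to 10¹⁰ grid points per device, which illustrates why memory footprint and the extra capacity reachable through unified CPU–GPU addressing matter at this problem size.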