An Asynchronous Many-Task Algorithm for Unstructured $S_{N}$ Transport on Shared Memory Systems

📅 2025-10-13

📈 Citations: 0

✨ Influential: 0

career value

242K/year

🤖 AI Summary

Discrete-ordinates $S_N$ transport solvers on unstructured meshes suffer from poor scalability on shared-memory systems due to complex data dependencies, irregular memory access patterns, and high-dimensional computational domains. Method: This paper proposes an asynchronous multi-task parallel algorithm that constructs a non-blocking execution model based on a task dependency graph, eliminating traditional synchronization barriers to enable fine-grained task scheduling and dynamic load balancing. The approach integrates shared-memory programming with hardware-aware optimizations. Contribution/Results: Evaluated across multiple many-core platforms—including Intel Xeon Phi and AMD EPYC—the algorithm achieves 1.8–3.2× higher speedup over baseline methods on configurations with 64+ cores, with strong scaling efficiency exceeding 75%. It significantly improves resource utilization and throughput for high-dimensional transport problems.

Technology Category

Application Category

📝 Abstract

Discrete ordinates $S_N$ transport solvers on unstructured meshes pose a challenge to scale due to complex data dependencies, memory access patterns and a high-dimensional domain. In this paper, we review the performance bottlenecks within the shared memory parallelization scheme of an existing transport solver on modern many-core architectures with high core counts. With this analysis, we then survey the performance of this solver across a variety of compute hardware. We then present a new Asynchronous Many-Task (AMT) algorithm for shared memory parallelism, present results showing an increase in computational performance over the existing method, and evaluate why performance is improved.

Problem

Research questions and friction points this paper is trying to address.

Addressing scalability challenges in unstructured SN transport solvers

Analyzing performance bottlenecks on modern many-core architectures

Developing asynchronous task algorithm for improved computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

AMT algorithm enables shared memory parallelism

Asynchronous many-task approach overcomes performance bottlenecks

New method increases computational performance on many-core systems

🔎 Similar Papers

No similar papers found.