🤖 AI Summary
This work addresses the difficulty of implementing numerical time integrators, and of switching between them, on distributed heterogeneous platforms. We integrate the Odeint and OpenFPM frameworks, using template metaprogramming and a declarative integration interface to expose multi-stage, multi-step, and adaptive ODE solvers as interchangeable components, so that the time integration scheme can be switched by changing a single line of code while preserving MPI+GPU parallel scalability. To our knowledge, this is the first unified architecture combining Boost.Odeint, OpenFPM, CUDA, and MPI. We benchmark the software on exponential and sigmoidal dynamics and, as an application example, simulate the 3D Gray-Scott reaction-diffusion model on both CPUs and GPUs in only 60 lines of code.
📝 Abstract
We present a distributed algebra system for efficient and compact implementation of numerical time integration schemes on parallel computers and graphics processing units (GPUs). The software implementation combines the time integration library Odeint from Boost with the OpenFPM framework for scalable scientific computing. Implementing multi-stage, multi-step, or adaptive time integration methods in distributed-memory parallel codes or on GPUs is challenging. The present algebra system addresses this challenge by making the time integration methods from Odeint available in a concise template-expression language for numerical simulations distributed and parallelized using OpenFPM. Users can thus adopt state-of-the-art time integration schemes, or switch between schemes, by changing one line of code while maintaining parallel scalability. The result is scalable time integration with compact code, which facilitates rapid rewriting and deployment of simulation algorithms. We benchmark the present software for exponential and sigmoidal dynamics and present an application example to the 3D Gray-Scott reaction-diffusion problem on both CPUs and GPUs in only 60 lines of code.
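The idea of switching schemes by changing one line can be illustrated with a minimal, standard-library-only C++ sketch. It is not the Odeint or OpenFPM API: every name in it (`State`, `axpy`, `EulerStepper`, `RK4Stepper`, `integrate`, `Decay`, `Scheme`, `decay_at_one`) is hypothetical, and a small fixed-size array stands in for a distributed field. The assumption being illustrated is that a stepper only needs a few algebra operations on the state, so the same driver works with any scheme.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical state type; a distributed field would take its place.
using State = std::array<double, 1>;

// The only "algebra" the steppers need: returns a + s * b, element-wise.
template <std::size_t N>
std::array<double, N> axpy(const std::array<double, N>& a, double s,
                           const std::array<double, N>& b) {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = a[i] + s * b[i];
    return out;
}

// Explicit (forward) Euler: one right-hand-side evaluation per step.
struct EulerStepper {
    template <class S, class Rhs>
    static S step(const Rhs& f, const S& x, double t, double dt) {
        return axpy(x, dt, f(x, t));
    }
};

// Classical fourth-order Runge-Kutta: four stages per step.
struct RK4Stepper {
    template <class S, class Rhs>
    static S step(const Rhs& f, const S& x, double t, double dt) {
        const S k1 = f(x, t);
        const S k2 = f(axpy(x, 0.5 * dt, k1), t + 0.5 * dt);
        const S k3 = f(axpy(x, 0.5 * dt, k2), t + 0.5 * dt);
        const S k4 = f(axpy(x, dt, k3), t + dt);
        // x + dt/6 * (k1 + 2*k2 + 2*k3 + k4), built from axpy calls only.
        S out = axpy(x, dt / 6.0, k1);
        out = axpy(out, dt / 3.0, k2);
        out = axpy(out, dt / 3.0, k3);
        out = axpy(out, dt / 6.0, k4);
        return out;
    }
};

// Fixed-step driver, generic over the stepper type.
template <class Stepper, class S, class Rhs>
S integrate(const Rhs& f, S x, double t0, double dt, int n_steps) {
    assert(dt > 0.0 && n_steps >= 0);
    for (int i = 0; i < n_steps; ++i) x = Stepper::step(f, x, t0 + i * dt, dt);
    return x;
}

// Exponential dynamics dx/dt = -x, with exact solution x(t) = x(0) * exp(-t).
struct Decay {
    State operator()(const State& x, double /*t*/) const { return State{-x[0]}; }
};

// The one line to change when switching schemes.
using Scheme = RK4Stepper;

// Integrates from t = 0 to t = 1 with 100 steps of size 0.01.
double decay_at_one() {
    return integrate<Scheme>(Decay{}, State{1.0}, 0.0, 0.01, 100)[0];
}
```

Replacing `RK4Stepper` with `EulerStepper` in the `Scheme` alias changes the integrator without touching the driver or the right-hand side. In this sketch the state is a plain array; the abstract's approach applies the same separation to states that are distributed across processes or resident on a GPU.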