AI Summary
Modeling quantum transport in nanoribbon field-effect transistors (NRFETs) only a few nanometers in size remains challenging: electrons confined in such small volumes interact strongly, and conventional DFT+NEGF approaches do not capture these interactions accurately. To address this, we extend the non-equilibrium Green's function (NEGF) framework with the GW approximation and present the first NEGF+GW implementation, called QuaTrEx, that can handle NRFET geometries with dimensions comparable to experiments. A novel spatial domain decomposition scheme removes the scalability bottleneck of large device simulations and allows devices of up to 84,480 atoms to be treated. On the Alps and Frontier supercomputers the method reaches a weak-scaling efficiency above 80%, and on a 42,240-atom system it sustains 1.15 Eflop/s in double precision (FP64). This work makes high-accuracy NEGF+GW quantum transport simulation practical for nanoelectronic devices of experimentally relevant size.
Abstract
Designing nanoscale electronic devices such as the currently manufactured nanoribbon field-effect transistors (NRFETs) requires advanced modeling tools that capture all relevant quantum mechanical effects. State-of-the-art approaches combine the non-equilibrium Green's function (NEGF) formalism and density functional theory (DFT). However, as device dimensions no longer exceed a few nanometers, electrons are confined in ultra-small volumes, giving rise to strong electron-electron interactions. To account for these critical effects, DFT+NEGF solvers should be extended with the GW approximation, which massively increases their computational intensity. Here, we present the first implementation of the NEGF+GW scheme capable of handling NRFET geometries with dimensions comparable to experiments. This package, called QuaTrEx, makes use of a novel spatial domain decomposition scheme, can treat devices made of up to 84,480 atoms, scales very well on the Alps and Frontier supercomputers (>80% weak scaling efficiency), and sustains exascale FP64 performance on 42,240 atoms (1.15 Eflop/s).
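The weak-scaling figure quoted above can be read with the usual definition: the problem size grows in proportion to the compute resources, so the ideal runtime stays constant and the efficiency is the baseline runtime divided by the runtime at scale. The sketch below is only an illustration of that definition. The function name and all timings are hypothetical placeholders, not measurements from the paper, and the paper may define its baseline differently.

```python
def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Return the weak-scaling efficiency t_base / t_scaled.

    In a weak-scaling experiment the work per node is held fixed, so an
    ideally scaling code takes the same time at every node count and the
    efficiency is 1.0.
    """
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("runtimes must be positive")
    return t_base / t_scaled


# Hypothetical runtimes in seconds, keyed by a relative node count.
# These values are illustrative only and are NOT taken from the paper.
timings = {1: 100.0, 2: 104.0, 4: 110.0, 8: 120.0}

base = timings[1]
efficiencies = {
    nodes: weak_scaling_efficiency(base, runtime)
    for nodes, runtime in timings.items()
}

for nodes, eff in efficiencies.items():
    print(f"{nodes:>2}x nodes: efficiency = {eff:.1%}")
```

Under this definition, an efficiency above 80% means the runtime at the largest scale is less than 1.25 times the baseline runtime.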