🤖 AI Summary
To meet the needs of high-performance scientific computing, this paper designs and implements a lightweight, lock-free, and highly portable C++17 thread pool that requires no advanced multithreading APIs and enables fine-grained control over parallelism. The implementation uses standard C++17 features (namely `std::thread`, `std::optional`, and `if constexpr`) to achieve full standard-library portability, augmented by lock-free queue optimizations. Key contributions include: (1) a zero-allocation task submission interface built on `std::variant` and coroutine-aware scheduling; (2) a dynamically prioritized task queue coupled with an RAII-based exception propagation mechanism; and (3) a purely standard-compliant implementation with minimal runtime dependencies. Experimental evaluation on multicore CPUs demonstrates near-linear speedup, up to a 3.2× improvement in task throughput, and a 92% reduction in memory allocation overhead compared to baseline approaches.
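
To make the general idea concrete, the following is a minimal sketch of a standard-library-only C++17 thread pool. It is not the paper's implementation: the class name `SimpleThreadPool` and its `submit` interface are hypothetical, and for brevity it uses a mutex-guarded `std::queue` rather than a lock-free queue. It also allocates on each submission and has no priorities. What it does illustrate is RAII-based shutdown and exception propagation to the caller through `std::future`.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical illustration only: name and interface are assumptions,
// and the queue is protected by a mutex (not lock-free).
class SimpleThreadPool {
public:
    explicit SimpleThreadPool(std::size_t thread_count) {
        if (thread_count == 0) thread_count = 1;
        workers_.reserve(thread_count);
        for (std::size_t i = 0; i < thread_count; ++i) {
            workers_.emplace_back([this] { worker_loop(); });
        }
    }

    SimpleThreadPool(const SimpleThreadPool&) = delete;
    SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

    // RAII shutdown: let workers drain the queue, then join them.
    ~SimpleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    // Submit a nullary callable. The returned future yields its result,
    // or rethrows any exception the task threw when get() is called.
    template <typename F>
    auto submit(F&& f) -> std::future<std::invoke_result_t<std::decay_t<F>&>> {
        using R = std::invoke_result_t<std::decay_t<F>&>;
        // packaged_task is move-only, so wrap it in shared_ptr to make
        // the queued std::function copyable.
        auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) throw std::runtime_error("submit on stopped pool");
            tasks_.emplace([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // exceptions are captured by the packaged_task
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};
```

A caller would construct the pool with a chosen thread count, for example `SimpleThreadPool pool(4);`, submit work with `auto f = pool.submit([] { return 6 * 7; });`, and retrieve the value (or the rethrown exception) with `f.get()`. The mutex and condition variable keep the sketch short and easy to verify. A lock-free queue, as the summary describes, would replace them at the cost of considerably more intricate code.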