🤖 AI Summary
High-performance computing (HPC) systems deliver exceptional computational power but lack cloud-native usability, accessibility, and service-oriented capabilities, which hinders their deployment for public-facing AI inference and agent-based applications. Conversely, the cloud-native technologies widely adopted by AI developers (e.g., Kubernetes, object storage) are poorly compatible with traditional HPC architectures. To bridge this gap, we propose a dual-stack AI factory architecture that integrates HPC and cloud-native paradigms, enabling the first deep convergence of serverless HPC and high-performance cloud computing. Our approach unifies orchestration across HPC clusters, Kubernetes, hardware accelerators, and cloud-native service frameworks, delivering sovereign AI infrastructure that combines peak performance with out-of-the-box usability. Deployed as the foundational blueprint for the EuroHPC AI Factory, this architecture significantly improves resource utilization and AI service accessibility, while scaling effectively to support large-scale inference and intelligent agent deployment.
📝 Abstract
The strategic importance of artificial intelligence is driving a global push toward Sovereign AI initiatives. National governments are increasingly developing dedicated infrastructures, called AI Factories (AIF), to achieve technological autonomy and secure the resources necessary to sustain robust local digital ecosystems.
In Europe, the EuroHPC Joint Undertaking is investing hundreds of millions of euros into several AI Factories, built atop existing high-performance computing (HPC) supercomputers. However, while HPC systems excel in raw performance, they are not inherently designed for usability, accessibility, or serving as public-facing platforms for AI services such as inference or agentic applications. In contrast, AI practitioners are accustomed to cloud-native technologies like Kubernetes and object storage, tools that are often difficult to integrate within traditional HPC environments.
This article advocates for a dual-stack approach within supercomputers: integrating both HPC and cloud-native technologies. Our goal is to bridge the divide between HPC and cloud computing by combining high performance and hardware acceleration with ease of use and service-oriented front-ends. This convergence allows each paradigm to amplify the other. To this end, we will study the cloud challenges of HPC (Serverless HPC) and the HPC challenges of cloud technologies (High-performance Cloud).