AI Summary
This work addresses the thermal constraints of in-orbit data centers, where limited radiative heat dissipation restricts the performance of conventional GPUs due to their high heat density and resulting hotspots that necessitate frequency throttling. To overcome this challenge, the authors propose a "radiator-in-the-loop" co-design framework that, for the first time, jointly optimizes radiative cooling capacity with computational architecture energy efficiency. Through thermal simulations and multi-workload evaluations, they demonstrate that compute-in-memory (CIM) architectures exhibit significantly more uniform thermal distribution and higher TOPS/W efficiency compared to GPUs. Experimental results under realistic orbital thermal constraints show that CIM consistently outperforms GPUs across varying thermal budgets, thereby validating its feasibility and superiority as an AI accelerator for space applications.
Abstract
The rapid growth in compute demand from artificial intelligence (AI) has driven a massive surge in data center construction, precipitating an energy and sustainability crisis. Motivated by the abundant solar energy in outer space and the recent sharp reduction in space launch costs, orbital data centers are emerging as a potential pathway for the future scaling of AI compute infrastructure. While the cold background in vacuum seems appealing for cooling, computing systems operating in space without convection ultimately rely on radiative cooling, which requires large-area radiators. Such limitations in thermal management pose a significant challenge for deploying standard liquid- or air-cooled computers in space. In this work, we investigate the impact of the thermal constraints in space on both graphics processing units (GPUs) with high-bandwidth memory (HBM) and the emerging compute-in-memory (CIM) accelerators. We develop a radiator-in-the-loop co-design methodology that directly links the permitted system TOPS (tera-operations per second) with the practical radiator cooling capacity in space. Our thermal simulations reveal that the separately located GPU die and HBM stacks create severe thermal hotspots under limited radiator capacity, necessitating GPU thermal throttling. In contrast, CIM accelerators exhibit a much more uniform heat distribution and consistently outperform GPUs in TOPS/W across a wide range of radiator budgets. We systematically evaluate the performance of CIM and GPU across various AI workloads and demonstrate that the advantage of CIM is magnified for deployment in space under realistic thermal constraints.
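The link between radiator capacity and permitted TOPS can be illustrated with a first-order estimate. The sketch below is not the authors' model: it assumes an ideal gray-body radiator governed by the Stefan-Boltzmann law, assumes all electrical power ends up as heat, and ignores hotspots and throttling, which the paper's thermal simulations address. The function names, the default emissivity of 0.9, and the 3 K deep-space sink temperature are illustrative assumptions; a radiator in low Earth orbit would see a warmer effective sink.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)


def radiator_capacity_w(area_m2, temp_k, emissivity=0.9, sink_temp_k=3.0, sides=1):
    """Net power (W) an ideal gray-body radiator rejects to a cold background.

    Hypothetical helper: emissivity, sink temperature, and the number of
    radiating sides are assumed values, not figures from the paper.
    """
    return sides * emissivity * SIGMA * area_m2 * (temp_k**4 - sink_temp_k**4)


def permitted_tops(area_m2, temp_k, tops_per_watt, **radiator_kwargs):
    """Upper bound on sustained TOPS when compute power equals rejected heat.

    Assumes a uniform chip temperature, so it cannot capture the hotspot
    driven throttling that separates GPU+HBM from CIM in the paper.
    """
    return tops_per_watt * radiator_capacity_w(area_m2, temp_k, **radiator_kwargs)


if __name__ == "__main__":
    # Illustrative inputs only: 10 m^2 radiator at 330 K, 2 TOPS/W accelerator.
    budget_w = radiator_capacity_w(10.0, 330.0)
    print(f"radiator capacity: {budget_w:.0f} W")
    print(f"permitted compute: {permitted_tops(10.0, 330.0, 2.0):.0f} TOPS")
```

Under these assumptions an ideal black body at 300 K rejects roughly 459 W per square meter, so the permitted TOPS scales linearly with radiator area and with TOPS/W, and with the fourth power of radiator temperature.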