THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures

📅 2025-08-14

🤖 AI Summary

To address the conflicting objectives of performance, energy efficiency, and thermal safety when scheduling AI workloads on heterogeneous multi-chiplet Processing-in-Memory (PIM) architectures, this paper proposes THERMOS, a thermally-aware Multi-Objective Reinforcement Learning (MORL) scheduling framework. The method jointly models execution time, dynamic power consumption, and on-chip thermal evolution, and trains a single policy that can target Pareto-optimal execution time, energy, or a balance of the two at runtime, depending on the stated preference. It supports diverse memory technologies, including ReRAM, SRAM, and FeFET. Experimental evaluation shows that, compared to baseline AI workload scheduling algorithms, the framework achieves up to 89% faster average execution time and 57% lower average energy consumption, while incurring only 0.14% runtime overhead and 0.022% energy overhead. The work co-optimizes performance, energy efficiency, and thermal safety at chiplet granularity in PIM systems.

📝 Abstract

Chiplet-based integration enables large-scale systems that combine diverse technologies, offering higher yield, lower cost, and scalability, which makes them well suited to AI workloads. Processing-in-Memory (PIM) has emerged as a promising solution for AI inference, leveraging technologies such as ReRAM, SRAM, and FeFET, each offering unique advantages and trade-offs. A heterogeneous chiplet-based PIM architecture can harness the complementary strengths of these technologies to deliver higher performance and energy efficiency. However, scheduling AI workloads across such a heterogeneous system is challenging due to competing performance objectives, dynamic workload characteristics, and power and thermal constraints. To address this need, we propose THERMOS, a thermally-aware, multi-objective scheduling framework for AI workloads on heterogeneous multi-chiplet PIM architectures. THERMOS trains a single multi-objective reinforcement learning (MORL) policy that can achieve Pareto-optimal execution time, energy, or a balanced objective at runtime, depending on the target preferences. Comprehensive evaluations show that THERMOS achieves up to 89% faster average execution time and 57% lower average energy consumption than baseline AI workload scheduling algorithms with only 0.14% runtime and 0.022% energy overhead.
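
As a rough illustration of how one preference input can steer a scheduler toward execution time, energy, or a balance of the two, the sketch below scores candidate placements with a weighted sum of normalized costs. The linear scalarization, the function names, the chiplet labels, and all cost values are assumptions made for illustration; they are not taken from the paper, which trains a MORL policy rather than evaluating a fixed cost table.

```python
from typing import Dict, Tuple

# Hypothetical normalized (execution_time, energy) costs for placing one
# layer on three chiplets. The labels and numbers are invented placeholders.
CANDIDATES: Dict[str, Tuple[float, float]] = {
    "chiplet_A": (4.0, 1.0),  # slow but frugal
    "chiplet_B": (1.0, 4.0),  # fast but power-hungry
    "chiplet_C": (2.0, 2.0),  # middle ground
}


def scalarized_cost(time_cost: float, energy_cost: float, w_time: float) -> float:
    """Weighted sum of the two costs; w_time=1 means time only, 0 means energy only."""
    return w_time * time_cost + (1.0 - w_time) * energy_cost


def pick_chiplet(w_time: float,
                 candidates: Dict[str, Tuple[float, float]] = CANDIDATES) -> str:
    """Return the candidate with the lowest scalarized cost for this preference."""
    if not 0.0 <= w_time <= 1.0:
        raise ValueError("w_time must lie in [0, 1]")
    return min(candidates,
               key=lambda name: scalarized_cost(*candidates[name], w_time))


if __name__ == "__main__":
    for w in (1.0, 0.5, 0.0):
        print(f"w_time={w}: {pick_chiplet(w)}")
```

Changing `w_time` at call time changes the selected placement without rebuilding anything, which is the property a preference-conditioned policy aims for. A linear weighted sum can only reach points on the convex part of a Pareto front, so it should be read as the simplest possible stand-in.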

Problem

Research questions and friction points this paper is trying to address.

Scheduling AI workloads on heterogeneous multi-chiplet PIM architectures

Balancing performance, energy, and thermal constraints dynamically

Optimizing execution time and energy efficiency with minimal overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-objective reinforcement learning for scheduling

Thermally-aware AI workload optimization

Heterogeneous multi-chiplet PIM architecture
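
To make the combination of multi-objective optimization and thermal awareness listed above more concrete, the following sketch filters a set of candidate schedules down to those that respect a temperature limit and are Pareto-optimal in execution time and energy. Treating temperature as a hard cap, the `Schedule` fields, and the function names are assumptions for illustration only; the paper's actual thermal handling may differ.

```python
from typing import List, NamedTuple


class Schedule(NamedTuple):
    """Hypothetical summary of one candidate schedule."""
    name: str
    time: float         # execution time, arbitrary units
    energy: float       # energy, arbitrary units
    peak_temp_c: float  # predicted peak chiplet temperature in Celsius


def dominates(a: Schedule, b: Schedule) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a.time <= b.time and a.energy <= b.energy
    strictly_better = a.time < b.time or a.energy < b.energy
    return no_worse and strictly_better


def thermally_safe_pareto(schedules: List[Schedule],
                          temp_limit_c: float) -> List[Schedule]:
    """Drop schedules above the thermal limit, then keep the non-dominated ones."""
    safe = [s for s in schedules if s.peak_temp_c <= temp_limit_c]
    return [s for s in safe if not any(dominates(o, s) for o in safe)]


if __name__ == "__main__":
    candidates = [
        Schedule("fast", 1.0, 4.0, 80.0),
        Schedule("balanced", 2.0, 2.0, 70.0),
        Schedule("wasteful", 3.0, 3.0, 60.0),    # dominated by "balanced"
        Schedule("too_hot", 0.5, 0.5, 120.0),    # violates the thermal limit
        Schedule("frugal", 4.0, 1.0, 65.0),
    ]
    for s in thermally_safe_pareto(candidates, temp_limit_c=85.0):
        print(s.name)
```

The schedule that is best on both time and energy is rejected here because it exceeds the temperature limit, which shows why thermal safety has to be considered alongside the two performance objectives and not after them.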
