AI Summary
Machine learning models for physical design (PD) suffer from a scarcity of high-quality, large-scale real-world data; existing public datasets are small, slow to generate, outdated, and constrained by intellectual property and computational costs. To address this, we propose DALI-PD, the first scalable framework to integrate diffusion models into PD heatmap synthesis. It incorporates domain-specific physical design semantics to efficiently generate multimodal heatmaps, including power dissipation, IR drop, congestion, macro placement, and cell density. DALI-PD generates diverse samples with inference times on the order of seconds, overcoming the generation-speed bottleneck of traditional approaches. Leveraging DALI-PD, we construct a large-scale synthetic dataset comprising over 20,000 layout configurations with high-fidelity heatmaps. Empirical evaluation demonstrates substantial improvements in model generalization and accuracy on downstream tasks, particularly IR drop and congestion prediction.
Abstract
Machine learning (ML) has demonstrated significant promise in various physical design (PD) tasks. However, model generalizability remains limited by the availability of high-quality, large-scale training datasets. Creating such datasets is often computationally expensive and constrained by intellectual property (IP). The few public datasets that are available are typically static, slow to generate, and require frequent updates. To address these limitations, we present DALI-PD, a scalable framework for generating synthetic layout heatmaps to accelerate ML in PD research. DALI-PD uses a diffusion model to generate diverse layout heatmaps, with inference taking only seconds. The heatmaps include power, IR drop, congestion, macro placement, and cell density maps. Using DALI-PD, we created a dataset comprising over 20,000 layout configurations with varying macro counts and placements. These heatmaps closely resemble real layouts and improve ML accuracy on downstream tasks such as IR drop and congestion prediction.