DALI-PD: Diffusion-based Synthetic Layout Heatmap Generation for ML in Physical Design

2025-07-13
AI Summary
Machine learning models for physical design (PD) suffer from a scarcity of high-quality, large-scale real-world data; existing public datasets are small, slow to generate, outdated, and constrained by intellectual property and computational costs. To address this, we propose DALI-PD, the first scalable framework to integrate diffusion models into PD heatmap synthesis. It incorporates domain-specific physical design semantics to efficiently generate multimodal heatmaps, including power dissipation, IR drop, congestion, macro placement, and cell density. DALI-PD generates diverse samples with inference times on the order of seconds, overcoming the generation-speed bottleneck of existing datasets. Leveraging DALI-PD, we construct a large-scale synthetic dataset comprising over 20,000 layout configurations with high-fidelity heatmaps. Empirical evaluation demonstrates substantial improvements in model generalization and accuracy on downstream tasks, particularly IR-drop and congestion prediction.

πŸ“ Abstract
Machine learning (ML) has demonstrated significant promise in various physical design (PD) tasks. However, model generalizability remains limited by the availability of high-quality, large-scale training datasets. Creating such datasets is often computationally expensive and constrained by IP. The few public datasets that exist are typically static, slow to generate, and require frequent updates. To address these limitations, we present DALI-PD, a scalable framework for generating synthetic layout heatmaps to accelerate ML in PD research. DALI-PD uses a diffusion model to generate diverse layout heatmaps, with inference taking only seconds. The heatmaps include power, IR drop, congestion, macro placement, and cell density maps. Using DALI-PD, we created a dataset comprising over 20,000 layout configurations with varying macro counts and placements. These heatmaps closely resemble real layouts and improve accuracy on downstream ML tasks such as IR drop and congestion prediction.
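
To make the idea of generating heatmaps by diffusion sampling concrete, here is a minimal sketch of standard DDPM ancestral sampling over a five-channel heatmap stack. It is not the DALI-PD implementation: the channel order, the linear noise schedule, the step count, and `placeholder_noise_predictor` (a stand-in for a trained denoising network) are all assumptions made for illustration.

```python
import numpy as np

# Assumed channel order, for illustration only.
CHANNELS = ("power", "ir_drop", "congestion", "macro_placement", "cell_density")


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear DDPM noise schedule (an assumed choice)."""
    return np.linspace(beta_start, beta_end, num_steps)


def placeholder_noise_predictor(x_t, t):
    """Hypothetical stand-in for a trained network eps_theta(x_t, t).

    It simply returns its input as the noise estimate; a real model
    would be learned from layout data.
    """
    return x_t


def sample_heatmaps(noise_predictor, num_steps=50, size=32, seed=0):
    """Run DDPM ancestral sampling and return one heatmap per channel."""
    rng = np.random.default_rng(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Start from pure Gaussian noise, one channel per heatmap type.
    x = rng.standard_normal((len(CHANNELS), size, size))
    for t in range(num_steps - 1, -1, -1):
        eps = noise_predictor(x, t)
        # Posterior mean of x_{t-1} given the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Inject fresh noise on every step except the final one.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return dict(zip(CHANNELS, x))
```

Calling `sample_heatmaps(placeholder_noise_predictor)` returns a dictionary of five 32x32 arrays. With the placeholder predictor the output is not a meaningful layout; the sketch only shows the shape of the sampling loop that a trained model would plug into.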

Problem

Research questions and friction points this paper is trying to address.

Limited availability of high-quality training datasets for ML in physical design
Existing datasets are static, slow to generate, and require frequent updates
Need for scalable synthetic layout heatmaps to improve ML model generalizability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion model generates diverse layout heatmaps
Scalable framework creates synthetic training datasets
Fast inference produces realistic layout configurations
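
As a rough illustration of how a synthetic heatmap dataset could feed a downstream task such as IR drop prediction, the sketch below mixes real and synthetic heatmap stacks and splits them into inputs and targets. The array layout `(N, C, H, W)`, the position of the IR drop channel, the choice of the remaining channels as model inputs, and both function names are assumptions, not details from the paper.

```python
import numpy as np


def mix_real_and_synthetic(real, synthetic, seed=0):
    """Concatenate real and synthetic stacks and shuffle them together."""
    rng = np.random.default_rng(seed)
    combined = np.concatenate([real, synthetic], axis=0)
    order = rng.permutation(len(combined))
    return combined[order]


def to_ir_drop_pairs(stacks, ir_drop_index=1):
    """Split (N, C, H, W) stacks into model inputs and IR drop targets.

    The IR drop channel (assumed to sit at `ir_drop_index`) becomes the
    target; all other channels are used as inputs.
    """
    stacks = np.asarray(stacks, dtype=np.float32)
    targets = stacks[:, ir_drop_index]
    inputs = np.delete(stacks, ir_drop_index, axis=1)
    return inputs, targets


# Random arrays stand in for real and generated heatmaps in this demo.
rng = np.random.default_rng(0)
real_stacks = rng.random((4, 5, 16, 16))
synthetic_stacks = rng.random((12, 5, 16, 16))

train_stacks = mix_real_and_synthetic(real_stacks, synthetic_stacks)
inputs, targets = to_ir_drop_pairs(train_stacks)
```

Here `inputs` has shape `(16, 4, 16, 16)` and `targets` has shape `(16, 16, 16)`. The ratio of synthetic to real samples is a tunable choice that this sketch does not try to optimize.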