🤖 AI Summary
Diffusion models face significant bottlenecks in energy efficiency and computational throughput on conventional electronic platforms, hindering their ability to meet the sustainable acceleration demands of generative AI. This work proposes a novel accelerator architecture that, for the first time, leverages silicon photonics to hardware-accelerate diffusion models, specifically targeting the iterative denoising computations in UNet and attention layers. By harnessing photonic computing for low-power, high-bandwidth matrix operations, the design substantially outperforms existing electronic approaches, achieving at least a 3× improvement in energy efficiency and a 5.5× increase in throughput. The proposed architecture establishes a new hardware paradigm that reconciles high performance with sustainability for generative AI systems.
📝 Abstract
Diffusion models have revolutionized generative AI through their ability to generate highly realistic, state-of-the-art synthetic data. However, these models rely on an iterative denoising process over computationally intensive layers such as UNets and attention mechanisms. This results in high inference energy on conventional electronic platforms, creating an emerging need to accelerate these models in a sustainable manner. To address this challenge, we present a novel silicon photonics-based accelerator for diffusion models. Experimental evaluations demonstrate that our photonic accelerator achieves at least 3× better energy efficiency and 5.5× higher throughput compared to state-of-the-art diffusion model accelerators.
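To illustrate why inference is so costly, the following minimal sketch shows the standard DDPM reverse (denoising) loop: one full pass through the noise-prediction network per timestep. The `noise_predictor` here is a hypothetical placeholder for the UNet/attention stack (not the paper's model); in practice, that call is the matrix-heavy computation a photonic accelerator would target.

```python
import numpy as np

def noise_predictor(x, t):
    # Placeholder for the UNet noise estimator. In a real diffusion
    # model this is the dominant cost: large matrix multiplications
    # in convolution and attention layers, repeated every timestep.
    return 0.1 * x

def ddpm_reverse(x_T, betas, rng):
    """Standard DDPM reverse process: iteratively denoise x_T to x_0."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = x_T
    for t in reversed(range(len(betas))):   # one network call per step
        eps = noise_predictor(x, t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

rng = np.random.default_rng(0)
x_T = rng.standard_normal((4, 4))          # toy "latent" instead of an image
betas = np.linspace(1e-4, 0.02, 50)        # 50 denoising steps = 50 UNet calls
x_0 = ddpm_reverse(x_T, betas, rng)
print(x_0.shape)
```

Even this toy loop makes the bottleneck visible: with typical step counts in the tens to hundreds, the network is evaluated that many times per generated sample, which is why accelerating its matrix operations dominates any efficiency gain.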