🤖 AI Summary
Photonic computing for large Transformer models faces two critical bottlenecks: low energy efficiency, due to substantial electro-optical conversion and data-movement overheads, and limited throughput, caused by insufficient on-chip photonic resources that force frequent reuse of photonic tensor cores (PTCs). To address these, this work proposes a hardware-software co-design framework integrating PTC-aware low-rank plus structured-sparsity compression with dynamically reconfigurable photonic tensor cores, enabling fine-grained sparsity support and full power gating of inactive hardware. Key techniques include the Lighten compression pipeline, broadband optical redistribution, and joint structured pruning with low-rank decomposition. Evaluated on ImageNet, a Base-scale Vision Transformer suffers only ~1% accuracy degradation after 50% structured pruning and three epochs of fine-tuning. Compared to the state-of-the-art photonic Transformer accelerator, the ENLighten platform delivers a 2.5× improvement in energy-delay product (EDP), significantly enhancing both energy efficiency and photonic resource utilization.
📝 Abstract
Photonic computing has emerged as a promising substrate for accelerating the dense linear-algebra operations at the heart of AI, yet adoption for large Transformer models remains in its infancy. We identify two bottlenecks: (1) costly electro-optic conversions and data-movement overheads that erode energy efficiency as model sizes scale; (2) a mismatch between limited on-chip photonic resources and Transformer scale, which forces frequent reuse of photonic tensor cores and dilutes throughput gains. To address these challenges, we introduce a hardware-software co-design framework. First, we propose `Lighten`, a PTC-aware compression flow that post-hoc decomposes each Transformer weight matrix into a low-rank component plus a structured-sparse component aligned to photonic tensor-core granularity, without lengthy retraining. Second, we present `ENLighten`, a reconfigurable photonic accelerator with dynamically adaptive tensor cores, driven by broadband light redistribution, enabling fine-grained sparsity support and full power gating of inactive parts. On ImageNet, `Lighten` prunes a Base-scale Vision Transformer by 50% with ≈1% accuracy drop after only 3 epochs (about 1 hour) of fine-tuning. Deployed on `ENLighten`, it achieves a 2.5× improvement in energy-delay product over the state-of-the-art photonic Transformer accelerator.
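To make the compression idea concrete, the sketch below shows one common way a "low-rank plus structured-sparse" decomposition can be computed: a truncated SVD captures the low-rank part, and the residual is pruned at block granularity (blocks scored by Frobenius norm) so the surviving structure maps onto fixed-size tensor cores. This is an illustrative approximation under assumed parameters (`rank`, `block`, `keep_ratio`), not the paper's actual Lighten algorithm, whose scoring and PTC alignment details are not given in the abstract.

```python
import numpy as np

def lowrank_plus_block_sparse(W, rank, block, keep_ratio):
    """Split W ≈ L + S: L is rank-limited, S is block-structured sparse.

    Hypothetical sketch of a PTC-aware decomposition; block size would
    be chosen to match the photonic tensor-core granularity.
    """
    # Low-rank component via truncated SVD.
    U, sigma, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * sigma[:rank]) @ Vt[:rank, :]

    # Residual carries whatever the low-rank part missed.
    R = W - L
    m, n = W.shape
    bm, bn = m // block, n // block

    # Score each (block x block) tile by its Frobenius norm.
    tiles = R.reshape(bm, block, bn, block).transpose(0, 2, 1, 3)
    scores = np.linalg.norm(tiles, axis=(2, 3))

    # Keep the top `keep_ratio` fraction of tiles, zero the rest.
    k = max(1, int(keep_ratio * scores.size))
    thresh = np.sort(scores.ravel())[::-1][k - 1]
    tile_mask = scores >= thresh

    # Expand the tile mask back to entry granularity.
    mask = np.kron(tile_mask, np.ones((block, block)))
    S = R * mask
    return L, S
```

With 50% of tiles kept, only half the residual blocks need to be loaded onto the tensor cores, and fully zeroed tiles can be power-gated, which is the hardware-side payoff the abstract describes.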