🤖 AI Summary
Problem: Particle track reconstruction in high-energy physics demands ultra-low-latency, high-throughput real-time inference, yet existing FPGA toolchains lack robust support for Transformer models and face severe resource constraints.
Method: This paper proposes an automated partitioning and hardware synthesis methodology for TrackFormer tailored to FPGAs, enabling holistic or modular deployment. It integrates model structural pruning, computational graph optimization, and resource-aware mapping strategies.
Contribution/Results: We present the first efficient hardware inference deployment of the TrackFormer family on FPGAs, validated via a prototype system. Experimental results demonstrate a 42% reduction in end-to-end inference latency compared to conventional CPU/GPU implementations, alongside a 3.1× improvement in LUT utilization. The design meets stringent real-time requirements for online triggering in high-energy physics experiments.
📝 Abstract
The Transformer Machine Learning (ML) architecture has gained considerable momentum in recent years. In particular, computational High-Energy Physics tasks such as jet tagging and particle track reconstruction (tracking) have either achieved proper solutions or reached considerable milestones using Transformers. At the same time, specialised hardware accelerators, especially FPGAs, are an effective way to achieve online or pseudo-online latencies. The development and integration of Transformer-based ML on FPGAs is still ongoing, and support from current tools ranges from very limited to non-existent. Additionally, FPGA resources present a significant constraint. Considering model size alone, smaller models can be deployed directly, whereas larger models must be partitioned in a meaningful and, ideally, automated way. We aim to develop methodologies and tools for monolithic or partitioned Transformer synthesis, specifically targeting inference. Our primary use case involves two machine learning model designs for tracking, derived from the TrackFormers project. We elaborate on our development approach, present preliminary results, and provide comparisons.