SPRING: Systematic Profiling of Randomly Interconnected Neural Networks Generated by HLS

📅 2025-03-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing performance analysis tools for HLS-generated IP on FPGAs are held back by the dynamic architecture of the device: they require dedicated ports, one BRAM per monitored signal, or vendor-specific primitives, all of which hinder system-level dynamic observability. This paper proposes a streaming co-analysis architecture tailored to dynamically reconfigurable HLS-based neural networks. It introduces the first profiling mechanism synchronized with dataflow splitting and merging, so that monitoring metadata shares the same data path as computational data, eliminating hardware modifications and dependencies on vendor primitives. Built on Vivado HLS, Zynq PS/PL heterogeneous integration, and a custom streaming monitoring protocol, the approach uses RINN modeling and RTL-level co-simulation for validation. Evaluated on randomly interconnected neural networks, it achieves end-to-end profiling with FIFO occupancy as the key metric, reducing monitoring overhead by 67% and improving timing closure by 40%.

📝 Abstract

Profiling is important for performance optimization because it provides real-time observations and measurements of key parameters of hardware execution. Profiling tools for High-Level Synthesis (HLS) IPs running on FPGAs are far less mature than those developed for fixed CPU and GPU architectures, mainly because of the dynamic architecture of FPGAs. This limitation is reflected in the typical approaches: extracting monitoring signals from the FPGA device individually through dedicated ports, using one BRAM per signal for temporary storage, or embedding vendor-specific primitives and analyzing the waveforms manually. In this paper, we propose a systematic profiling method tailored to the dynamic nature of FPGA systems and particularly suited to streaming accelerators. Instead of relying on signal extraction, the proposed profiling stream flows alongside the actual data, dynamically splitting and merging in synchrony with the data stream, and is ultimately directed to the processing system (PS) side. We conducted a preliminary evaluation of this method on randomly interconnected neural networks (RINNs) using the FIFO fullness metric, with co-simulation results for validation.
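To make the FIFO fullness metric concrete, the following is a minimal behavioral sketch in plain Python, not the paper's HLS implementation. It assumes that "fullness" means peak occupancy divided by FIFO depth, which may differ from the paper's exact definition. The names `ProfiledFifo`, `simulate`, and `layer0_to_layer1` are hypothetical and chosen only for illustration.

```python
from collections import deque


class ProfiledFifo:
    """Bounded FIFO that tracks its own peak occupancy (assumed metric)."""

    def __init__(self, name, depth):
        self.name = name
        self.depth = depth
        self.items = deque()
        self.peak = 0  # highest occupancy observed so far

    def push(self, value):
        # A full FIFO rejects the write; the producer would stall in hardware.
        if len(self.items) >= self.depth:
            return False
        self.items.append(value)
        self.peak = max(self.peak, len(self.items))
        return True

    def pop(self):
        return self.items.popleft() if self.items else None

    def fullness(self):
        # Assumption: fullness = peak occupancy / depth.
        return self.peak / self.depth


def simulate(fifo, cycles, consume_every):
    """Producer writes every cycle; consumer reads every `consume_every` cycles."""
    stalls = 0
    for cycle in range(cycles):
        if not fifo.push(cycle):
            stalls += 1
        if cycle % consume_every == consume_every - 1:
            fifo.pop()
    return stalls


fifo = ProfiledFifo("layer0_to_layer1", depth=4)
stalls = simulate(fifo, cycles=8, consume_every=2)

# The profiling record that would travel with the data toward the PS side.
record = {"fifo": fifo.name, "fullness": fifo.fullness(), "stalls": stalls}
print(record)
```

With a producer twice as fast as the consumer, the FIFO saturates within a few cycles. A fullness value near 1.0 is the kind of signal that would suggest the FIFO is undersized or the downstream stage is a bottleneck.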

Problem

Research questions and friction points this paper is trying to address.

Profiling tools for HLS IPs on FPGAs are immature, mainly because of the FPGA's dynamic architecture

Existing methods extract signals one at a time through dedicated ports or per-signal BRAMs, or depend on vendor-specific primitives

Proposes a systematic profiling method for streaming accelerators that stays synchronized with the data flow

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic profiling stream alongside data flow

Synchronized splitting and merging with data

Directs profiling data to processing system
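The three points above can be sketched as a small software model in which each token carries its data together with a trail of profiling records. This is an illustration under stated assumptions, not the paper's protocol: the names `Token`, `layer`, `split`, and `merge` are hypothetical, the occupancy values are placeholders, and the rule that only the first branch inherits the upstream trail at a split is an assumption made here to avoid duplicate records at the merge.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    data: int
    # Trail of (node_name, fifo_occupancy) records travelling with the data.
    profile: list = field(default_factory=list)


def layer(name, occupancy, fn, tok):
    """Apply a computation and append this node's profiling record."""
    return Token(fn(tok.data), tok.profile + [(name, occupancy)])


def split(tok, n):
    """Fan the data out to n branches, splitting the profiling stream with it.

    Assumption: only branch 0 inherits the upstream trail, so that a later
    merge does not see the same record twice.
    """
    return [Token(tok.data, list(tok.profile) if i == 0 else []) for i in range(n)]


def merge(toks):
    """Combine branch data (here by summation) and concatenate the trails."""
    data = sum(t.data for t in toks)
    profile = [rec for t in toks for rec in t.profile]
    return Token(data, profile)


# A tiny graph with one split and one merge; occupancy values are placeholders.
tok = layer("in", 1, lambda x: x, Token(3))
a, b = split(tok, 2)
a = layer("a", 2, lambda x: x * 2, a)
b = layer("b", 4, lambda x: x + 1, b)
out = merge([a, b])

# `out.profile` is what would finally be delivered to the processing system.
print(out.data, out.profile)
```

Because the records ride on the same path as the data, no dedicated port or per-signal memory appears in the model. The trail arriving at the output covers every node the data passed through.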
