🤖 AI Summary
To address the high latency caused by dynamic hypothesis switching in wideband multi-hypothesis spectrum sensing, this paper proposes a kernel orchestration framework tailored for coarse-grained reconfigurable architectures. The framework clusters computation kernels with mutually exclusive activation patterns (such as FFT, matrix operations, and signal analysis) so that they share instruction memory. It formulates planning as a multi-objective optimization that jointly optimizes kernel clustering, hardware placement, and prefetch scheduling while preserving dataflow efficiency and minimizing context-switch overhead. Experimental results on a 48-subchannel scenario show that, compared with a baseline without preloading, off-chip instruction fetches are reduced by 207.81×, average switching latency drops by 98.24×, and per-subband execution time improves by 132.92×. The key innovation is the first systematic modeling of kernel temporal exclusivity as an instruction-memory reuse optimization, enabling low-overhead, high-throughput dynamic hypothesis switching at runtime.
📝 Abstract
Efficient wideband spectrum sensing requires rapid evaluation and re-evaluation of signal presence and type across multiple subchannels. These tasks involve multiple hypothesis testing, where each hypothesis is implemented as a decision tree workflow containing compute-intensive kernels, including FFT, matrix operations, and signal-specific analyses. Given the dynamic nature of the spectrum environment, the ability to switch quickly between hypotheses is essential for maintaining low-latency, high-throughput operation. This work assumes a coarse-grained reconfigurable architecture consisting of an array of processing elements (PEs), each equipped with a local instruction memory (IMEM) capable of storing and executing the kernels used in spectrum sensing applications. We propose a planner tool that efficiently maps hypothesis workflows onto this architecture to enable fast runtime context switching with minimal overhead. The planner performs two key tasks: clustering temporally non-overlapping kernels to share IMEM resources within a PE sub-array, and placing these clusters onto hardware to ensure efficient scheduling and data movement. By preloading kernels that are not simultaneously active into the same IMEM, our tool enables low-latency reconfiguration without runtime conflicts. It models the planning process as a multi-objective optimization, balancing trade-offs among context switch overhead, scheduling latency, and dataflow efficiency. We evaluate the proposed tool in a simulated spectrum sensing scenario with 48 concurrent subchannels. Results show that our approach reduces off-chip binary fetches by 207.81x, lowers average switching time by 98.24x, and improves per-subband execution time by 132.92x over a baseline without preloading. These improvements demonstrate that intelligent planning is critical for adapting to fast-changing spectrum environments in next-generation radio frequency systems.
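
The clustering idea in the abstract, grouping kernels that are never active at the same time so they can share one IMEM, can be illustrated with a small sketch. This is not the paper's planner, which solves a multi-objective optimization; it is a greedy first-fit heuristic that covers only the temporal-exclusivity and capacity constraints. The `Kernel` fields, the time-step intervals, the size unit, and the `imem_capacity` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    """Hypothetical kernel record; field names are illustrative only."""
    name: str
    start: int  # first time step at which the kernel is active
    end: int    # first time step at which it is no longer active (exclusive)
    size: int   # instruction footprint in IMEM words (assumed unit)


def overlaps(a: Kernel, b: Kernel) -> bool:
    """True if the two kernels are active during at least one common time step."""
    return a.start < b.end and b.start < a.end


def cluster_kernels(kernels, imem_capacity):
    """Greedy first-fit grouping of temporally exclusive kernels.

    A kernel joins the first cluster where it overlaps no existing member
    and the combined footprint still fits in one IMEM; otherwise it opens
    a new cluster.
    """
    clusters = []
    for k in sorted(kernels, key=lambda k: (k.start, k.name)):
        if k.size > imem_capacity:
            raise ValueError(f"kernel {k.name} does not fit in a single IMEM")
        for members in clusters:
            fits = sum(m.size for m in members) + k.size <= imem_capacity
            if fits and not any(overlaps(k, m) for m in members):
                members.append(k)
                break
        else:
            clusters.append([k])
    return clusters


if __name__ == "__main__":
    demo = [
        Kernel("fft", 0, 4, 3),
        Kernel("cyclo", 2, 6, 2),
        Kernel("matmul", 4, 8, 4),
        Kernel("energy", 8, 10, 1),
    ]
    for i, members in enumerate(cluster_kernels(demo, imem_capacity=8)):
        print(i, [m.name for m in members])
```

In this toy input, `cyclo` overlaps `fft` in time and therefore lands in a separate cluster, while `fft`, `matmul`, and `energy` never run concurrently and can be preloaded into the same IMEM. A real planner would additionally weigh placement and data movement, which this sketch ignores.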