🤖 AI Summary
LLM-generated Verilog code for FSM control logic suffers from high syntactic error rates, low debugging efficiency, and heavy reliance on manual testing. To address these bottlenecks, this paper proposes AutoFSM, an IR-guided multi-agent collaborative framework. The authors design a structured intermediate representation (IR) that decouples semantic intent from syntactic constraints, build an IR-to-Verilog compiler, and introduce an end-to-end workflow that integrates SystemC modeling with automated testbench generation. The contributions are threefold: (1) the first IR-driven paradigm for FSM code generation; (2) the first unified framework supporting SystemC simulation and automatic testbench synthesis; and (3) the open-source hierarchical FSM benchmark SKT-FSM, comprising 67 diverse cases. Evaluated on SKT-FSM with the same base LLM, the approach achieves up to an 11.94% absolute improvement in pass rate and lowers the syntax error rate by up to 17.62% compared with MAGE, improving RTL correctness and debuggability.
📝 Abstract
With the rapid advancement of large language models (LLMs) in code generation, their applications in hardware design are receiving growing attention. However, existing LLMs face several challenges when generating Verilog code for finite state machine (FSM) control logic, including frequent syntax errors, low debugging efficiency, and heavy reliance on test benchmarks. To address these challenges, this paper proposes AutoFSM, a multi-agent collaborative framework designed for FSM code generation tasks. AutoFSM introduces a structurally clear intermediate representation (IR) to reduce the syntax error rate during code generation, and provides a supporting toolchain that automatically translates the IR into Verilog. Furthermore, AutoFSM is the first framework to integrate SystemC-based modeling with automatic testbench generation, thereby improving debugging efficiency and feedback quality. To systematically evaluate the framework's performance, we construct SKT-FSM, the first hierarchical FSM benchmark in the field, comprising 67 FSM samples across different complexity levels. Experimental results show that, under the same base LLM, AutoFSM consistently outperforms the open-source framework MAGE on the SKT-FSM benchmark, achieving up to an 11.94% improvement in pass rate and up to a 17.62% reduction in syntax error rate. These results demonstrate the potential of combining LLMs with structured IR and automated testing to improve the reliability and scalability of register-transfer level (RTL) code generation.
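To make the IR-to-Verilog idea concrete, the following is a minimal sketch of how a structured FSM description could be validated and then translated mechanically into Verilog, so that syntax is produced by a deterministic emitter rather than by the LLM. The abstract does not specify AutoFSM's IR schema or toolchain; the dictionary layout `FSM_IR` and the helpers `validate` and `emit_verilog` are hypothetical names invented for illustration, and the sketch assumes single-bit outputs and transition conditions written as plain Verilog expressions.

```python
# Hypothetical FSM IR (NOT AutoFSM's actual schema): a Moore machine that
# asserts y once two consecutive 1s have been seen on x.
FSM_IR = {
    "name": "detect_11",
    "inputs": ["x"],
    "outputs": ["y"],
    "reset_state": "S0",
    "states": {
        "S0": {"outputs": {"y": 0}, "transitions": [("x", "S1")], "default": "S0"},
        "S1": {"outputs": {"y": 0}, "transitions": [("x", "S2")], "default": "S0"},
        "S2": {"outputs": {"y": 1}, "transitions": [("x", "S2")], "default": "S0"},
    },
}


def validate(ir):
    """Reject IRs that would otherwise yield broken Verilog."""
    states = ir["states"]
    if ir["reset_state"] not in states:
        raise ValueError(f"unknown reset state {ir['reset_state']!r}")
    for name, spec in states.items():
        targets = [tgt for _, tgt in spec["transitions"]] + [spec["default"]]
        for tgt in targets:
            if tgt not in states:
                raise ValueError(f"state {name!r} targets unknown state {tgt!r}")
        for out, val in spec["outputs"].items():
            if out not in ir["outputs"] or val not in (0, 1):
                raise ValueError(f"state {name!r} has invalid output {out!r}={val!r}")


def emit_verilog(ir):
    """Translate the (assumed) IR into a two-process Verilog FSM."""
    validate(ir)
    names = list(ir["states"])
    width = max(1, (len(names) - 1).bit_length())  # bits needed for the encoding
    ports = (["input wire clk", "input wire rst"]
             + [f"input wire {i}" for i in ir["inputs"]]
             + [f"output reg {o}" for o in ir["outputs"]])
    lines = [f"module {ir['name']} (",
             ",\n".join("    " + p for p in ports),
             ");"]
    for idx, name in enumerate(names):
        lines.append(f"    localparam {name} = {width}'d{idx};")
    lines.append(f"    reg [{width - 1}:0] state, next_state;")
    # Sequential process: state register with synchronous reset.
    lines += ["    always @(posedge clk) begin",
              f"        if (rst) state <= {ir['reset_state']};",
              "        else state <= next_state;",
              "    end"]
    # Combinational process: defaults first, so no latches are inferred.
    lines += ["    always @(*) begin", "        next_state = state;"]
    lines += [f"        {o} = 1'b0;" for o in ir["outputs"]]
    lines.append("        case (state)")
    for name, spec in ir["states"].items():
        lines.append(f"            {name}: begin")
        for out, val in spec["outputs"].items():
            lines.append(f"                {out} = 1'b{val};")
        for i, (cond, tgt) in enumerate(spec["transitions"]):
            keyword = "if" if i == 0 else "else if"
            lines.append(f"                {keyword} ({cond}) next_state = {tgt};")
        prefix = "else " if spec["transitions"] else ""
        lines.append(f"                {prefix}next_state = {spec['default']};")
        lines.append("            end")
    lines += [f"            default: next_state = {ir['reset_state']};",
              "        endcase", "    end", "endmodule"]
    return "\n".join(lines)
```

Calling `print(emit_verilog(FSM_IR))` prints the generated module. In a design like this, the LLM only has to fill in states, conditions, and outputs, while structural mistakes such as a transition to an undefined state are caught by `validate` before any Verilog is written.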