🤖 AI Summary
This work addresses the absence of expert reasoning traces that capture how engineers navigate hardware constraints and timing semantics in industrial software development, such as chip design and GPU optimization. The authors combine Error-driven Chain-of-Thought (ECoT) synthesis with an Industrial Code World Model (ICWM). ECoT synthesizes reasoning chains from multi-turn dialogues with environmental error feedback. ICWM is trained on domain-specific execution traces, including Verilog simulation logs and GPU performance profiles, so that it can predict execution outcomes and support self-verification. The resulting model, InCoder-32B-Thinking, is evaluated on 14 general benchmarks (81.3% on LiveCodeBench v5) and 9 industrial benchmarks (84.0% on CAD-Coder and 38.0% on KernelBench), where it achieves top-tier results among open-source models.
📝 Abstract
Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason about hardware constraints and timing semantics. In this work, we propose InCoder-32B-Thinking, a model trained on data from the Error-driven Chain-of-Thought (ECoT) synthesis framework, used together with an industrial code world model (ICWM) to generate reasoning traces. Specifically, ECoT generates reasoning chains by synthesizing the thinking content from multi-turn dialogue with environmental error feedback, explicitly modeling the error-correction process. ICWM is trained on domain-specific execution traces from sources such as Verilog simulation and GPU profiling. It learns the causal dynamics of how code affects hardware behavior and enables self-verification by predicting execution outcomes before actual compilation. All synthesized reasoning traces are validated through domain toolchains, creating training data that matches the natural reasoning depth distribution of industrial tasks. Evaluation on 14 general benchmarks (81.3% on LiveCodeBench v5) and 9 industrial benchmarks (84.0% on CAD-Coder and 38.0% on KernelBench) shows that InCoder-32B-Thinking achieves top-tier open-source results across all domains.
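
To make the error-feedback idea concrete, the following is a minimal Python sketch of a loop that collects failed attempts and toolchain feedback into a single reasoning trace. It is an illustration under stated assumptions, not the authors' pipeline: the names `Turn`, `Trace`, `synthesize_trace`, `toy_propose`, and `toy_check` are hypothetical, the proposer stands in for a model, and the checker is a toy rule rather than a real Verilog simulator or GPU profiler.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Turn:
    candidate: str
    error: Optional[str]  # None means the (toy) toolchain accepted the candidate


@dataclass
class Trace:
    task: str
    turns: List[Turn] = field(default_factory=list)

    @property
    def validated(self) -> bool:
        # A trace counts as validated only if its final attempt passed the check.
        return bool(self.turns) and self.turns[-1].error is None

    def to_thinking(self) -> str:
        # Flatten the multi-turn dialogue into one error-correction narrative.
        lines = [f"Task: {self.task}"]
        for i, turn in enumerate(self.turns, 1):
            lines.append(f"Attempt {i}: {turn.candidate}")
            if turn.error is None:
                lines.append("Toolchain check passed.")
            else:
                lines.append(f"Feedback: {turn.error}")
        return "\n".join(lines)


def synthesize_trace(
    task: str,
    propose: Callable[[str, Optional[str]], str],  # hypothetical model call
    check: Callable[[str], Optional[str]],         # hypothetical toolchain call
    max_turns: int = 4,
) -> Trace:
    trace = Trace(task)
    feedback: Optional[str] = None
    for _ in range(max_turns):
        candidate = propose(task, feedback)
        feedback = check(candidate)
        trace.turns.append(Turn(candidate, feedback))
        if feedback is None:
            break
    return trace


# Toy stand-ins (assumptions for illustration only).
def toy_propose(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "always @(posedge clk) q = d;"
    return "always @(posedge clk) q <= d;"


def toy_check(candidate: str) -> Optional[str]:
    if "<=" not in candidate:
        return "use a non-blocking assignment in a clocked block"
    return None


trace = synthesize_trace("D flip-flop", toy_propose, toy_check)
print(trace.to_thinking())
```

In this sketch, only traces whose last attempt passes the check would be kept as training data, which mirrors the abstract's statement that synthesized traces are validated through domain toolchains. A world model in the sense of ICWM could, as a further assumption, be slotted in as a cheaper `check` that predicts the outcome instead of running the toolchain.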