AI Summary
This work addresses the fragmentation of neural network processor design across the training, mapping, and manufacturing stages, which hinders joint optimization of performance, cost, and yield under uncertainty. The authors propose a unified framework grounded in monotone co-design theory that separates training, mapping, manufacturing, and compute resource allocation into blocks connected only through functionality-resource interfaces, so each block can be optimized independently while remaining globally coordinated. A key innovation is the explicit introduction of "confidence" as an optimizable resource within the design flow, which lets uncertainty be modeled formally instead of being collapsed into point estimates. The approach is demonstrated through three case studies: recovering Pareto-optimal solutions across heterogeneous application scenarios, validating confidence as a continuously tunable design parameter, and showing that improving a single block's implementation set propagates automatically to the global Pareto front without modifying the co-design diagram.
Abstract
Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference workload; hardware mapping decisions determine chip area, latency, and energy; and these characteristics govern fabrication yield and manufacturing cost. In practice, these decisions are made in separate stages, and existing co-design methodologies are tightly coupled to specific algorithms, making it difficult to improve one component without reworking the entire pipeline. This paper presents a unified framework, grounded in monotone co-design theory, that composes four interoperable design blocks spanning network training, chip mapping, wafer-level fabrication, and compute resource allocation. Each block exposes only a functionality-resource interface to the rest of the system, so any block can be refined without structural changes elsewhere. A central contribution is the treatment of uncertainty: rather than collapsing stochastic outcomes into point estimates, the framework introduces Confidence, the inverse of success probability, as an explicit and optimizable resource alongside cost, time, and power. Three case studies validate the approach. The first recovers Pareto-optimal implementations across heterogeneous application scenarios. The second confirms that Confidence functions as a continuously tunable design knob rather than a post-hoc diagnostic. The third demonstrates that improving a single block's implementation set automatically propagates to the global Pareto front, without modifying the co-design diagram.
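The propagation property claimed in the third case study can be illustrated with a toy sketch. This is not the paper's framework: it assumes a fixed functionality requirement, two hypothetical blocks (`mapping` and `fabrication`) composed in series, additive resources, and made-up `(cost, latency)` numbers where lower is better.

```python
from itertools import product

def dominates(a, b):
    """True if resource vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated resource vectors (lower is better), sorted."""
    unique = set(points)
    return sorted(p for p in unique if not any(dominates(q, p) for q in unique))

def compose(block_a, block_b):
    """Series composition under the additive-resource assumption of this sketch."""
    return pareto_front(
        tuple(x + y for x, y in zip(a, b)) for a, b in product(block_a, block_b)
    )

# Hypothetical implementation sets: each entry is (cost, latency).
mapping = [(4, 10), (6, 6), (9, 3)]
fabrication = [(5, 2), (8, 1)]

before = compose(mapping, fabrication)

# Refine only the fabrication block by adding one cheaper implementation.
# The composition code (the "diagram") is left untouched.
fabrication_v2 = fabrication + [(4, 2)]
after = compose(mapping, fabrication_v2)

print(before)  # [(9, 12), (11, 8), (14, 5), (17, 4)]
print(after)   # [(8, 12), (10, 8), (13, 5), (17, 4)]
```

Every point on the old front is matched or dominated by a point on the new one, although only one block's implementation set changed. In the framework described above, Confidence would be one more coordinate of the resource vector alongside cost, time, and power. How it combines across blocks is not specified here, so the sketch leaves it out.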