UNIFERENCE: A Discrete Event Simulation Framework for Developing Distributed AI Models

📅 2026-03-27
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the lack of standardized tools for modeling heterogeneous devices and networks in distributed inference research, a gap that hinders reproducibility and systematic evaluation. The authors propose UNIFERENCE, a discrete-event simulation (DES) framework for developing, benchmarking, and deploying distributed AI models across heterogeneous devices and network environments. Its lightweight logical processes synchronize only on communication primitives, which preserves causal ordering without rollbacks, and the framework integrates with PyTorch Distributed so that the same codebase can move from simulation to real deployment. In the authors' evaluation, the simulator profiles runtime with up to 98.6% accuracy relative to physical deployments across diverse backends and hardware setups. The authors position it as a reproducible platform for studying distributed inference.
๐Ÿ“ Abstract
Developing and evaluating distributed inference algorithms remains difficult due to the lack of standardized tools for modeling heterogeneous devices and networks. Existing studies often rely on ad-hoc testbeds or proprietary infrastructure, making results hard to reproduce and limiting exploration of hypothetical hardware or network configurations. We present UNIFERENCE, a discrete-event simulation (DES) framework designed for developing, benchmarking, and deploying distributed AI models within a unified environment. UNIFERENCE models device and network behavior through lightweight logical processes that synchronize only on communication primitives, eliminating rollbacks while preserving the causal order. It integrates seamlessly with PyTorch Distributed, enabling the same codebase to transition from simulation to real deployment. Our evaluation demonstrates that UNIFERENCE profiles runtime with up to 98.6% accuracy compared to real physical deployments across diverse backends and hardware setups. By bridging simulation and deployment, UNIFERENCE provides an accessible, reproducible platform for studying distributed inference algorithms and exploring future system designs, from high-performance clusters to edge-scale devices. The framework is open-sourced at https://github.com/Dogacel/Uniference.
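The idea of logical processes that synchronize only on communication primitives can be illustrated with a minimal sketch. This is not UNIFERENCE's API: the `LogicalProcess` class, its method names, and the timing values are hypothetical, and the sketch assumes each `send` is issued before its matching `recv`. Each process keeps its own virtual clock; local compute advances only that clock, and a receive moves the receiver's clock forward to the message's arrival time if needed, so causal order holds without any rollback.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Hypothetical simulated device with its own virtual clock (seconds)."""
    name: str
    clock: float = 0.0
    inbox: deque = field(default_factory=deque)  # items: (arrival_time, payload)

    def compute(self, duration: float) -> None:
        # Local work advances only this process's clock; no global sync.
        self.clock += duration

    def send(self, dst: "LogicalProcess", payload, latency: float) -> None:
        # Stamp the message with the time it becomes visible at the receiver.
        dst.inbox.append((self.clock + latency, payload))

    def recv(self):
        # Synchronization point: a message cannot be observed before it
        # arrives, so the receiver's clock only ever moves forward.
        arrival, payload = self.inbox.popleft()
        self.clock = max(self.clock, arrival)
        return payload


def simulate_pipeline() -> dict:
    """Two-stage pipeline: an edge device feeds activations to a server."""
    edge = LogicalProcess("edge")
    server = LogicalProcess("server")

    edge.compute(2.0)                              # first stage
    edge.send(server, "activations", latency=0.5)  # assumed link latency

    server.compute(1.0)                            # unrelated local work
    server.recv()                                  # waits until t = 2.5
    server.compute(3.0)                            # second stage

    return {"edge": edge.clock, "server": server.clock}


if __name__ == "__main__":
    print(simulate_pipeline())
```

In this sketch the server finishes its local work at t = 1.0 but cannot consume the activations until t = 2.5, so its final clock is 5.5 while the edge device stops at 2.0. Because clocks change only through local compute and the `max` taken at receive time, no event is ever processed out of causal order and nothing needs to be undone.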
Problem

Research questions and friction points this paper is trying to address.

distributed inference
heterogeneous devices
network modeling
reproducibility
simulation framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Event Simulation
Distributed AI
PyTorch Distributed
Causal Order
Hardware-Software Co-simulation