🤖 AI Summary
In enterprise network engineering, physical topology modifications and device configuration updates have long relied on error-prone, inefficient manual processes; existing automation research predominantly focuses on configuration synthesis while neglecting co-evolution with topology changes. This paper proposes the first intent-driven, closed-loop automation framework tailored for enterprise networks. It integrates multimodal large language models (MLLMs), optical character recognition (OCR), and a graph-structure-aware visual encoder to jointly understand topology diagrams and textual intent specifications. We introduce a novel topology–configuration co-prompting engineering paradigm and a Cisco-certified scenario fine-tuning mechanism. Evaluated on real-world enterprise deployments, our framework achieves significantly improved topology image parsing accuracy, reduces network design cycle time by over 40%, and attains an 89.2% execution accuracy for topology-modification intents—substantially decreasing manual intervention.
📝 Abstract
Communication network engineering in enterprise environments is traditionally a complex, time-consuming, and error-prone manual process. Most research on network engineering automation has concentrated on configuration synthesis, often overlooking changes in the physical network topology. This paper introduces GeNet, a multimodal co-pilot for enterprise network engineers. GeNet is a novel framework that leverages a large language model (LLM) to streamline network design workflows. It uses visual and textual modalities to interpret and update network topologies and device configurations based on user intents. GeNet was evaluated on enterprise network scenarios adapted from Cisco certification exercises. Our results demonstrate GeNet's ability to interpret network topology images accurately, potentially reducing network engineers' efforts and accelerating network design processes in enterprise environments. Furthermore, we show the importance of precise topology understanding when handling intents that require modifications to the network's topology.