🤖 AI Summary
This study addresses the high manual overhead faced by DevOps teams in managing multi-interface cloud infrastructures. We propose and systematically evaluate an LLM-driven AI agent framework for automation. The agent unifies heterogeneous interfaces, including SDKs, CLIs, Infrastructure-as-Code (IaC) tools, and web portals, to support core tasks such as configuration deployment, monitoring/alerting, and incident remediation. Key contributions include: (1) the first evaluation framework specifically designed for AI agents in cloud infrastructure management; (2) identification and systematic mitigation of three critical bottlenecks: interface semantic gaps, action execution reliability, and security constraint compliance; and (3) domain-specific optimization strategies validated in real-world deployments, demonstrating both task feasibility and cross-scenario generalizability. Our work establishes a reusable methodology and empirical benchmark for AI-native cloud operations.
📝 Abstract
Cloud infrastructure is the cornerstone of the modern IT industry. However, managing this infrastructure effectively requires considerable manual effort from the DevOps engineering team. We make a case for developing AI agents powered by large language models (LLMs) to automate cloud infrastructure management tasks. In a preliminary study, we investigate the potential for AI agents to use different cloud/user interfaces such as software development kits (SDKs), command-line interfaces (CLIs), Infrastructure-as-Code (IaC) platforms, and web portals. We report takeaways on their effectiveness across different management tasks and identify research challenges and potential solutions.