🤖 AI Summary
This work addresses a weakness of conventional cloud intrusion detection systems: they typically operate at a single layer of a multi-layered architecture and struggle to sustain detection performance while adapting to unseen attacks. The authors propose a confidence-aware, three-tier collaborative detection pipeline that integrates machine learning models across the network, host, and hypervisor layers. The framework combines adaptive Q-learning–based threshold calibration, ChromaDB vector memory matching, and large language model (LLM)–driven semantic reasoning with explanation generation. A multi-level confidence gating mechanism avoids forced classification of uncertain samples by holding them in a review bucket, and confirmed knowledge is stored to support future analysis and retraining. In evaluation, the system achieves an overall accuracy of 88.68% (F1: 85.00%), with 98.02% and 97.08% accuracy at the network and hypervisor layers, while reducing LLM escalations by 58.78%. The result is an interpretable cloud intrusion detection pipeline that maintains strong detection performance while lowering the cost of LLM invocation.
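The summary names Q-learning-based threshold calibration without giving details. The following is a minimal tabular Q-learning sketch of how a confidence threshold could be learned, not the authors' implementation. The candidate thresholds, the three actions, the reward shape, and all hyperparameters (`escalation_cost`, `error_penalty`, `alpha`, `gamma`, `epsilon`) are assumptions made for illustration.

```python
import random

# Hypothetical candidate thresholds; each index is a state of the agent.
THRESHOLDS = [0.5, 0.6, 0.7, 0.8, 0.9]
# Actions: lower the threshold, keep it, or raise it by one step.
ACTIONS = [-1, 0, 1]


def reward(conf, correct, threshold, escalation_cost=0.3, error_penalty=2.0):
    """Assumed reward: escalating costs a little, accepting a wrong
    prediction costs a lot, accepting a correct prediction pays 1."""
    if conf < threshold:
        return -escalation_cost
    return 1.0 if correct else -error_penalty


def calibrate(events, steps=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table over threshold states from (confidence, correct) pairs."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in THRESHOLDS]
    state = len(THRESHOLDS) // 2
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        nxt = min(max(state + ACTIONS[a], 0), len(THRESHOLDS) - 1)
        conf, correct = rng.choice(events)
        r = reward(conf, correct, THRESHOLDS[nxt])
        # Standard Q-learning update.
        q[state][a] += alpha * (r + gamma * max(q[nxt]) - q[state][a])
        state = nxt
    return q


def greedy_threshold(q):
    """Follow the greedy policy for a few steps and report where it lands."""
    state = len(THRESHOLDS) // 2
    for _ in range(len(THRESHOLDS)):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state = min(max(state + ACTIONS[a], 0), len(THRESHOLDS) - 1)
    return THRESHOLDS[state]
```

Under this assumed reward, a higher escalation cost pushes the learned threshold down (fewer LLM calls), while a higher error penalty pushes it up. That is the trade-off the summary describes between detection performance and LLM invocation cost.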
📝 Abstract
Security in cloud computing has become a major concern due to several factors such as layered cloud architectures, dynamic environments, and exposure to unseen or zero-day attacks. Moreover, intrusion detection systems (IDS) typically operate at specific layers and rely heavily on machine learning models, which often perform well in experimental settings but fail to sustain performance in real cloud deployments. In this work, we implement a confidence-aware multilevel intrusion detection system using reinforcement learning tailored for cloud environments. The system secures three distinct layers: network, host, and hypervisor. Machine learning models at each layer detect known attack patterns, while prediction confidence distinguishes reliable decisions from uncertain outcomes. Within the multi-gate flow, low-confidence events pass through a learned-threshold confidence gate (Gate-1), followed by a Chroma memory-matching gate (Gate-2), with unresolved events escalated to a large language model (LLM) for semantic analysis and explanation. Final attack promotion at Gate-3 uses calibrated LLM confidence or weighted-fusion fallback, while uncertain events are retained in a review bucket to avoid forced classification. Generated explanations and confirmed knowledge are stored in ChromaDB to support future analysis and retraining. The approach is first evaluated using static thresholds, establishing a baseline for comparison. Results show that the proposed system learns adaptive thresholds and reduces LLM escalation by 58.78%, lowering cost while maintaining strong performance (88.68% accuracy, 85.29% precision, 84.72% recall, 85.00% F1). The network and hypervisor layers achieve 98.02% and 97.08% accuracy, demonstrating a balanced and efficient detection system.
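The multi-gate flow described in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' code. `Event`, `route`, `memory_lookup`, `llm_analyze`, and the default values `memory_min_similarity` and `promote_threshold` are hypothetical. The memory and LLM steps are passed in as plain callables instead of calling ChromaDB or an LLM API, and the weighted-fusion fallback at Gate-3 is omitted because the abstract does not specify it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Event:
    layer: str         # "network", "host", or "hypervisor"
    label: str         # label predicted by that layer's ML model
    confidence: float  # model prediction confidence in [0, 1]


def route(
    event: Event,
    gate1_threshold: float,
    memory_lookup: Callable[[Event], Optional[Tuple[str, float]]],
    llm_analyze: Callable[[Event], Tuple[str, float]],
    memory_min_similarity: float = 0.85,  # assumed value
    promote_threshold: float = 0.7,       # assumed value
) -> Tuple[str, str]:
    """Return (decision, stage) for one event."""
    # Gate-1: a confident layer prediction is accepted as-is.
    if event.confidence >= gate1_threshold:
        return event.label, "gate1"

    # Gate-2: reuse a stored label if memory holds a close enough match.
    match = memory_lookup(event)
    if match is not None:
        stored_label, similarity = match
        if similarity >= memory_min_similarity:
            return stored_label, "gate2"

    # Gate-3: escalate to the LLM; promote only if it is confident enough.
    llm_label, llm_confidence = llm_analyze(event)
    if llm_confidence >= promote_threshold:
        return llm_label, "gate3"

    # Still uncertain: hold for review instead of forcing a class.
    return "review", "review_bucket"
```

In this sketch the LLM is called only when both cheaper gates fail, so raising the share of events resolved at Gate-1 or Gate-2 directly lowers the number of LLM escalations. The review bucket is the final branch, which is what keeps uncertain events from being forced into a class.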