🤖 AI Summary
To address the high cost, latency, and data leakage risks associated with cloud-based large language models (LLMs) in security operations center (SOC) and computer security incident response team (CSIRT) event classification, this work investigates the feasibility of deploying localized small language models (SLMs). We systematically evaluate 21 SLMs—ranging from 1B to 20B parameters—across two GPU architectures, analyzing the impact of temperature hyperparameter tuning, model scale, and hardware capability on classification accuracy and inference efficiency. Results show that temperature adjustment has a negligible effect on accuracy; instead, model parameter count and GPU computational capacity are the dominant performance determinants. Models with ≥13B parameters achieve >92% accuracy and sub-second latency on mid-to-high-end GPUs. This study provides the first quantitative characterization of the performance boundaries of local SLMs in security event classification, together with deployment guidelines, offering empirical evidence and engineering insights for privacy-sensitive, lightweight AI-driven security automation.
📝 Abstract
SOCs and CSIRTs face increasing pressure to automate incident categorization, yet reliance on cloud-based LLMs introduces cost, latency, and confidentiality risks. We investigate whether locally executed SLMs can meet this need. We evaluated 21 models ranging from 1B to 20B parameters, varying the temperature hyperparameter and measuring execution time and classification accuracy across two distinct GPU architectures. The results indicate that temperature has little influence on accuracy, whereas parameter count and GPU capacity are the decisive factors.
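The evaluation protocol described above — sweeping the temperature hyperparameter while recording per-event latency and classification accuracy — can be sketched as a small benchmark harness. This is an illustrative sketch, not the authors' actual code: the `classify` callable, the `keyword_classify` stub, and the event examples are all hypothetical stand-ins for a locally hosted SLM and a labeled incident dataset.

```python
import time
from statistics import mean

def benchmark(classify, events, temperatures):
    """Measure accuracy and mean latency at each temperature setting.

    `classify(text, temperature=...)` is assumed to return a category
    label; in the paper's setting it would wrap a local SLM inference
    call, but any callable with this signature works.
    """
    results = {}
    for temp in temperatures:
        correct, latencies = 0, []
        for text, expected in events:
            start = time.perf_counter()
            label = classify(text, temperature=temp)
            latencies.append(time.perf_counter() - start)
            if label == expected:
                correct += 1
        results[temp] = {
            "accuracy": correct / len(events),
            "mean_latency_s": mean(latencies),
        }
    return results

# Hypothetical stub standing in for a local SLM call (illustration only).
def keyword_classify(text, temperature=0.0):
    return "malware" if "trojan" in text.lower() else "benign"

# Toy labeled events; a real run would use the SOC/CSIRT event corpus.
events = [
    ("Trojan dropper detected on host A", "malware"),
    ("Routine login from VPN gateway", "benign"),
]
report = benchmark(keyword_classify, events, temperatures=[0.0, 0.7])
```

Comparing `report` entries across temperatures (and repeating the sweep per model and per GPU) reproduces the study's two measured axes: accuracy and execution time.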