AI Summary
This study addresses the event classification task in Security Operations Centers (SOCs), evaluating the practical applicability of open-source small language models (SLMs) versus commercial large language models (LLMs) under constraints of privacy preservation, cost efficiency, and data sovereignty. Method: Using real-world anonymized security incident data and the NIST SP 800-61r3 classification framework, we systematically design and compare five prompt-engineering strategies: PHP, SHP, HTP, PRP, and ZSL. Contribution/Results: Commercial LLMs achieve marginally higher accuracy, but locally deployed open-source SLMs attain 92% of their F1-score while substantially improving data control, reducing inference costs by over 70%, and eliminating the privacy risks associated with transmitting incident data to the cloud. To our knowledge, this is the first empirical validation, within a production SOC environment, of lightweight open-source models augmented with domain-specific prompt engineering. Our findings establish a practical paradigm for deploying AI in high-sensitivity domains that balances security, compliance, and operational practicality.
Abstract
In this study, we evaluate open-source models for security incident classification, comparing them with proprietary models. We utilize a dataset of anonymized real incidents, categorized according to the NIST SP 800-61r3 taxonomy and processed using five prompt-engineering techniques (PHP, SHP, HTP, PRP, and ZSL). The results indicate that, although proprietary models still exhibit higher accuracy, locally deployed open-source models provide advantages in privacy, cost-effectiveness, and data sovereignty.
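To make the zero-shot (ZSL) strategy concrete: it amounts to sending the model only the incident description and the permitted label set, with no worked examples. The sketch below shows how such a prompt might be assembled; the `build_zsl_prompt` helper and the category labels are illustrative assumptions, not the actual taxonomy or prompts used in the study.

```python
# Hypothetical sketch of a zero-shot (ZSL) classification prompt for a
# locally deployed SLM. The category labels are illustrative placeholders,
# not the NIST SP 800-61r3 taxonomy applied in the paper.
CATEGORIES = [
    "malware",
    "phishing",
    "unauthorized-access",
    "denial-of-service",
    "other",
]

def build_zsl_prompt(incident_text: str) -> str:
    """Build a zero-shot prompt: task description plus label set, no examples."""
    labels = ", ".join(CATEGORIES)
    return (
        "You are a SOC analyst. Classify the following security incident "
        f"into exactly one of these categories: {labels}.\n"
        f"Incident: {incident_text}\n"
        "Answer with the category name only."
    )

# Example: the resulting string is what would be sent to the local model.
print(build_zsl_prompt(
    "Multiple failed SSH logins followed by a successful root login."
))
```

The few-shot variants compared in the study would differ only in appending labeled example incidents before the final query.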