🤖 AI Summary
Traditional intrusion detection systems struggle to identify zero-day anomalies, particularly exhibiting poor generalization across domains. To address this, we propose a two-stage multi-task representation learning framework that jointly optimizes domain-invariant feature extraction and latent variable disentanglement in the latent space. Our key innovation is an explicit latent-space mutual information minimization mechanism, which decouples spurious feature correlations and enables unified anomaly detection for both in-distribution (IN) and out-of-distribution (OOD) network traffic. The method integrates domain-invariant representation learning, cross-domain feature alignment, and disentanglement regularization. Evaluated on multiple cybersecurity benchmarks, our approach achieves significantly higher IN/OOD detection accuracy than state-of-the-art domain generalization methods, demonstrating superior generalization capability and robustness against distribution shifts.
📝 Abstract
Domain generalization focuses on leveraging knowledge from multiple related domains with ample training data and labels to enhance inference on unseen in-distribution (IN) and out-of-distribution (OOD) domains. In our study, we introduce a two-phase representation learning technique using multi-task learning. This approach aims to cultivate a latent space from features spanning multiple domains, encompassing both native and cross-domains, to amplify generalization to IN and OOD territories. Additionally, we attempt to disentangle the latent space by minimizing the mutual information between the prior and latent space, effectively de-correlating spurious feature correlations. Collectively, the joint optimization will facilitate domain-invariant feature learning. We assess the model's efficacy across multiple cybersecurity datasets, using standard classification metrics on both unseen IN and OOD sets, and juxtapose the results with contemporary domain generalization methods.