🤖 AI Summary
This work addresses a limitation of existing encrypted traffic analysis methods: they often neglect the hierarchical structure that protocol specifications define across layers and fields. The authors propose PTGAMoE, which models protocol fields as a hierarchical graph and combines graph attention networks with a Mixture-of-Experts (MoE) architecture. The framework performs semantically aware traffic classification under a strict no-data-leakage setting. On multiple benchmark datasets it significantly outperforms prior state-of-the-art models. Its field-level graph structure and expert-specific weights also make it more interpretable, revealing both the most discriminative protocol features and the contribution of each expert to the final classification decision.
📝 Abstract
Graph-based deep learning methods have been widely employed in encrypted traffic analysis to exploit latent correlations across different granularities. However, while complex preprocessing pipelines and sophisticated model structures often achieve strong performance, they may obscure inherent protocol semantics during representation learning. Moreover, the hierarchical structure of protocol layers and their corresponding fields, defined by protocol specifications and routinely utilized in manual traffic analysis, remains underexplored in existing learning frameworks. In this paper, we propose Protocol Tree Graph Attention with Mixture of Experts (PTGAMoE), a semantic-preserving hierarchical graph-based expert framework for encrypted traffic analysis. The field-based graph construction and expert committee design enable PTGAMoE to quantify the model's preferences for specific fields and protocols. Extensive experimental results on representative benchmark datasets under strict no-data-leakage settings demonstrate that PTGAMoE significantly outperforms state-of-the-art (SOTA) models. Furthermore, the semantic-preserving design provides interpretable insights into protocol-level feature importance and expert-level contributions, reflecting the model's decision-making logic in encrypted traffic classification tasks.
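The two ideas named in the abstract, a field-based protocol tree and an expert committee with readable weights, can be illustrated with a minimal sketch. It is not the authors' implementation: the packet layout, the field names, the gate scores, and the helper names `build_protocol_tree` and `mix_experts` are all hypothetical, the graph attention step is omitted entirely, and using one expert per protocol layer is an assumption made only for illustration.

```python
import math

# Hypothetical parsed packet: protocol layers mapped to their fields.
# Field names and values are illustrative, not taken from the paper.
packet = {
    "ip": {"ttl": 64, "len": 1500},
    "tcp": {"window": 29200, "flags": 24},
    "tls": {"record_type": 23, "record_len": 1400},
}


def build_protocol_tree(packet):
    """Return (nodes, edges) for a packet -> layer -> field tree."""
    nodes = ["packet"]
    edges = []
    for layer, fields in packet.items():
        nodes.append(layer)
        edges.append(("packet", layer))
        for field in fields:
            name = f"{layer}.{field}"
            nodes.append(name)
            edges.append((layer, name))
    return nodes, edges


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(gate_scores, expert_outputs):
    """Combine per-expert class scores using softmax gate weights.

    Returns the gate weights (one per expert) and the mixed class scores.
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_outputs[0])
    mixed = [
        sum(w * out[c] for w, out in zip(weights, expert_outputs))
        for c in range(n_classes)
    ]
    return weights, mixed


nodes, edges = build_protocol_tree(packet)

# Assumed setup: one expert per layer (ip, tcp, tls), two traffic classes.
gate_scores = [2.0, 0.5, 0.5]
expert_outputs = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
weights, mixed = mix_experts(gate_scores, expert_outputs)
```

In this sketch the gate weights are the quantity one would inspect to see which expert dominated a decision, and the tree edges are where attention scores over individual fields would attach in a full model.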