HFL-FlowLLM: Large Language Models for Network Traffic Flow Classification in Heterogeneous Federated Learning

📅 2025-11-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In 5G and IoT environments, conventional network traffic classification methods generalize poorly and incur high communication overhead because data distributions are heterogeneous and the data is privacy sensitive. To address these challenges, this paper proposes HFL-FlowLLM, a lightweight large language model (LLM)-enhanced federated learning framework for traffic classification, which the authors describe as the first of its kind to their knowledge. The method combines an LLM-based local feature extraction module, designed to capture traffic semantics, with a heterogeneous model aggregation mechanism that enables knowledge sharing while preserving data privacy. Experimental results show an average F1-score improvement of approximately 13% over state-of-the-art heterogeneous federated learning baselines. Compared with existing LLM-based federated frameworks, as the number of clients per training round grows, the method attains up to a 5% higher average F1-score and reduces training cost by about 87%, while remaining robust.

📝 Abstract

In modern communication networks driven by 5G and the Internet of Things (IoT), effective network traffic flow classification is crucial for Quality of Service (QoS) management and security. Traditional centralized machine learning struggles with the distributed data and privacy concerns of these heterogeneous environments, while existing federated learning approaches suffer from high costs and poor generalization. To address these challenges, we propose HFL-FlowLLM, which to our knowledge is the first framework to apply large language models to network traffic flow classification in heterogeneous federated learning. Compared with state-of-the-art heterogeneous federated learning methods for network traffic flow classification, the proposed approach improves the average F1 score by approximately 13%, demonstrating compelling performance and strong robustness. Compared with existing large language model federated learning frameworks, as the number of clients participating in each training round increases, the proposed method achieves up to a 5% improvement in average F1 score while reducing training costs by about 87%. These findings demonstrate the potential and practical value of HFL-FlowLLM for the security of modern communication networks.
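The headline numbers above are reported as an average F1 score. The listing does not say whether the F1 is macro, micro, or weighted, nor how it is averaged across clients, so the following is only a minimal sketch under two assumptions: a macro-averaged F1 per client, and an unweighted mean over clients. The function names `macro_f1` and `average_f1` and the flow labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def average_f1(client_results):
    """Unweighted mean of per-client macro-F1 (assumed averaging scheme)."""
    per_client = [macro_f1(t, p) for t, p in client_results]
    return sum(per_client) / len(per_client)


# Hypothetical flow labels for two clients: (ground truth, predictions)
clients = [
    (["video", "voip"], ["video", "voip"]),
    (["video", "video", "voip", "voip"], ["video", "voip", "voip", "voip"]),
]
print(round(average_f1(clients), 4))
```

Under these assumptions, a change in the averaging scheme (for example, weighting clients by sample count) would change the reported figure, which is why the choice is worth stating whenever such numbers are compared.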

Problem

Research questions and friction points this paper is trying to address.

Classifying network traffic flows in 5G/IoT heterogeneous environments

Overcoming data privacy and generalization issues in federated learning

Reducing computational costs while improving classification accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large language models for network traffic classification

Heterogeneous federated learning framework implementation

Improved F1 scores with reduced training costs
