FlowletFormer: Network Behavioral Semantic Aware Pre-training Model for Traffic Classification
Liu, Liming, Li, Ruoyu, Li, Qing, Hou, Meijia, Jiang, Yong, Xu, Mingwei
arXiv.org Artificial Intelligence
Network traffic classification using pre-training models has shown promising results, but existing methods struggle to capture packet structural characteristics, flow-level behaviors, hierarchical protocol semantics, and inter-packet contextual relationships. To address these challenges, we propose FlowletFormer, a BERT-based pre-training model specifically designed for network traffic analysis. FlowletFormer introduces a Coherent Behavior-Aware Traffic Representation Model for segmenting traffic into semantically meaningful units, a Protocol Stack Alignment-Based Embedding Layer to capture multilayer protocol semantics, and Field-Specific and Context-Aware Pretraining Tasks to enhance both inter-packet and inter-flow learning. Experimental results demonstrate that FlowletFormer significantly outperforms existing methods in the effectiveness of traffic representation, classification accuracy, and few-shot learning capability. Moreover, by effectively integrating domain-specific network knowledge, FlowletFormer shows better comprehension of the principles of network transmission (e.g., stateful connections of TCP), providing a more robust and trustworthy framework for traffic analysis.
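The abstract does not spell out how traffic is segmented into semantically meaningful units. As a rough illustration only, the sketch below uses the common networking notion of a flowlet: a burst of packets from one flow, separated from the next burst by an idle gap longer than a threshold. The `Packet` type, the `split_into_flowlets` function, and the 50 ms default threshold are all hypothetical choices made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    """Hypothetical minimal packet record for illustration."""
    timestamp: float  # arrival time in seconds
    size: int         # payload size in bytes


def split_into_flowlets(packets: List[Packet],
                        gap_threshold: float = 0.05) -> List[List[Packet]]:
    """Group one flow's packets into bursts separated by idle gaps.

    A new flowlet starts whenever the inter-arrival time exceeds
    gap_threshold (an assumed value, not the paper's setting).
    """
    flowlets: List[List[Packet]] = []
    current: List[Packet] = []
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        # Close the current burst if the idle gap is too long.
        if current and pkt.timestamp - current[-1].timestamp > gap_threshold:
            flowlets.append(current)
            current = []
        current.append(pkt)
    if current:
        flowlets.append(current)
    return flowlets


if __name__ == "__main__":
    flow = [Packet(0.00, 60), Packet(0.01, 1500), Packet(0.02, 1500),
            Packet(0.50, 60), Packet(0.51, 400)]
    for i, flowlet in enumerate(split_into_flowlets(flow)):
        print(i, [p.timestamp for p in flowlet])
```

In a pre-training pipeline, each resulting burst could then be tokenized as one input unit, though how FlowletFormer actually defines and encodes its units is described in the paper itself, not here.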
Aug-28-2025
- Country:
- Europe (1.00)
- Asia (0.93)
- North America > United States
- California (0.68)
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: