 computer network



Unsupervised Dataset Cleaning Framework for Encrypted Traffic Classification

arXiv.org Artificial Intelligence

Traffic classification, a technique for assigning network flows to predefined categories, has been widely deployed in enterprise and carrier networks. With the massive adoption of mobile devices, encryption is increasingly used in mobile applications to address privacy concerns. Consequently, traditional methods such as Deep Packet Inspection (DPI) fail to distinguish encrypted traffic. To tackle this challenge, Artificial Intelligence (AI), in particular Machine Learning (ML), has emerged as a promising solution for encrypted traffic classification. A crucial prerequisite for any ML-based approach is traffic data cleaning, which removes flows that are not useful for training (e.g., irrelevant protocols, background activity, control-plane messages, and long-lived sessions). Existing cleaning solutions depend on manual inspection of every captured packet, making the process both costly and time-consuming. In this poster, we present an unsupervised framework that automatically cleans encrypted mobile traffic. Evaluation on real-world datasets shows that our framework incurs only a 2%~2.5% reduction in classification accuracy compared with manual cleaning. These results demonstrate that our method offers an efficient and effective preprocessing step for ML-based encrypted traffic classification.


AIhub monthly digest: August 2025 – causality and generative modelling, responsible multimodal AI, and IJCAI in Montréal and Guangzhou

AIHub

Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we dive into the world of agents, learn about responsible multimodal AI, apply generative AI to computer networks, and dig into the RoboCup@Work League. Sanmay Das, Tom Dietterich, Sabine Hauert, Sarit Kraus, and Michael Littman tackled the topic of agentic AI, discussing recent developments and lessons learned from decades of research in the autonomous agents and multiagent systems community. The 34th International Joint Conference on Artificial Intelligence (IJCAI2025) took place in Montréal from 16-22 August, with a satellite event currently being held (from 29-31 August) in Guangzhou, China. You can find out more about the programmes of both venues here, and get a flavour of what attendees got up to in our social media round-ups (part one and part two).


Appendix A CommonsenseQA Error Patterns Throughout our experiments, we came across a variety of interesting failure cases for commonse

Neural Information Processing Systems

One key failure case was answers in the form of "the answer must be something that is ." Q: What might someone get from learning about science? Answer Choices: (a) headache (b) see things differently (c) increased knowledge (d) accidents (e) appreciation of nature A: The answer must be something that someone would get from learning about science. Learning about science would increase knowledge. Therefore, the answer is increased knowledge (c). These rationales, while perhaps useful to the model, read to us as opaque and unexplanatory. Sometimes the model will imply the answer that it has chosen in its question. Q: The christian rock band was on a large stage, the singer's foot got caught on a wire and he had a small what? Answer Choices: (a) commit sin (b) fail (c) disobey (d) stumble (e) boo boo A: The answer must be something that would cause the singer to stumble. The singer's foot got caught on a wire, so he would stumble.


Interview with Shaghayegh (Shirley) Shajarian: Applying generative AI to computer networks

AIHub

In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. This time, we hear from Shaghayegh (Shirley) Shajarian and learn about her research applying generative AI to computer networks. I am a third-year PhD student in the Computer Science department at North Carolina A&T State University, working under Dr Sajad Khorsandroo and Dr Mahmoud Abdelsalam. I am part of the Autonomous Cybersecurity and Resilience Lab, where my research focuses on applying generative AI to computer networks. I am developing AI-driven agents that assist with some network operations, such as log analysis, troubleshooting, and documentation.


Self-Training Meets Consistency: Improving LLMs' Reasoning With Consistency-Driven Rationale Evaluation

arXiv.org Artificial Intelligence

Self-training approaches for large language models (LLMs) improve reasoning abilities by training the models on their self-generated rationales. Previous approaches have labeled rationales that produce correct answers for a given question as appropriate for training. However, a single measure risks misjudging rationale quality, leading the models to learn flawed reasoning patterns. To address this issue, we propose CREST (Consistency-driven Rationale Evaluation for Self-Training), a self-training framework that further evaluates each rationale through follow-up questions and leverages this evaluation to guide its training. Specifically, we introduce two methods: (1) filtering out rationales that frequently result in incorrect answers on follow-up questions and (2) preference learning based on mixed preferences from rationale evaluation results of both original and follow-up questions. Experiments on three question-answering datasets using open LLMs show that CREST not only improves the logical robustness and correctness of rationales but also improves reasoning abilities compared to previous self-training approaches.
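As a rough illustration of the first method (filtering by follow-up performance), the sketch below drops rationales whose follow-up answers are wrong too often. It is not the CREST implementation: the `Rationale` record, the `max_error_rate` threshold, and the decision to drop rationales with no follow-up results are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rationale:
    """Hypothetical record: a rationale plus follow-up question outcomes."""
    text: str
    follow_up_correct: List[bool]  # True where the follow-up answer was correct


def filter_rationales(rationales: List[Rationale],
                      max_error_rate: float = 0.5) -> List[Rationale]:
    """Keep rationales whose follow-up error rate is at most max_error_rate.

    The 0.5 default is an illustrative assumption, not a value from the paper.
    """
    kept = []
    for rationale in rationales:
        outcomes = rationale.follow_up_correct
        if not outcomes:
            # Assumption: with no follow-up evidence, drop the rationale.
            continue
        error_rate = outcomes.count(False) / len(outcomes)
        if error_rate <= max_error_rate:
            kept.append(rationale)
    return kept


candidates = [
    Rationale("A", [True, True, False]),   # error rate 1/3, kept
    Rationale("B", [False, False, True]),  # error rate 2/3, dropped
    Rationale("C", []),                    # no evidence, dropped
]
kept_texts = [r.text for r in filter_rationales(candidates)]
```

The surviving rationales would then form the training set for the next self-training round; the preference-learning method described in the abstract is not covered here.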


Towards Characterizing Cyber Networks with Large Language Models

arXiv.org Artificial Intelligence

Threat hunting analyzes large, noisy, high-dimensional data to find sparse adversarial behavior. We believe adversarial activities, however they are disguised, are extremely difficult to completely obscure in high dimensional space. In this paper, we employ these latent features of cyber data to find anomalies via a prototype tool called Cyber Log Embeddings Model (CLEM). CLEM was trained on Zeek network traffic logs from both a real-world production network and from an Internet of Things (IoT) cybersecurity testbed. The model is deliberately overtrained on a sliding window of data to characterize each window closely. We use the Adjusted Rand Index (ARI) to compare the k-means clustering of CLEM output to expert labeling of the embeddings. Our approach demonstrates that there is promise in using natural language modeling to understand cyber data.
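For readers unfamiliar with the Adjusted Rand Index, the following minimal sketch computes it from two label lists using the standard pair-counting formula. It only illustrates the metric; the label values are made up, and the function name `adjusted_rand_index` is our own rather than anything from CLEM.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of items that share a cluster in both, in A only, and in B only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / comb(n, 2)  # agreement expected by chance
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


# Hypothetical example: cluster ids from k-means vs. labels from an expert.
kmeans_labels = [0, 0, 1, 1]
expert_labels = ["scan", "scan", "benign", "benign"]
score = adjusted_rand_index(kmeans_labels, expert_labels)
```

A score of 1 means the two partitions agree exactly up to renaming of clusters, a score near 0 means agreement is about what chance would give, and negative scores indicate less agreement than chance.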


Rephrase and Contrast: Fine-Tuning Language Models for Enhanced Understanding of Communication and Computer Networks

arXiv.org Artificial Intelligence

Large language models (LLMs) are being widely researched across various disciplines, with significant recent efforts focusing on adapting LLMs for understanding of how communication networks operate. However, over-reliance on prompting techniques hinders the full exploitation of the generalization ability of these models, and the lack of efficient fine-tuning methods prevents the full realization of lightweight LLMs' potential. This paper addresses these challenges by introducing our Rephrase and Contrast (RaC) framework, an efficient fine-tuning framework. RaC enhances LLMs' comprehension and critical thinking abilities by incorporating question reformulation and contrastive analysis of correct and incorrect answers during the fine-tuning process. Experimental results demonstrate a 63.73% accuracy improvement over the foundational model when tested on a comprehensive networking problem set. Moreover, to efficiently construct the dataset for RaC fine-tuning, we develop a GPT-assisted data mining method for generating high-quality question-answer (QA) pairs; furthermore, we introduce ChoiceBoost, a data augmentation technique that expands dataset size while reducing answer-order bias. Apart from these technical innovations, we contribute to the networking community by open-sourcing valuable research resources, including: 1) the fine-tuned networking model referred to as RaC-Net, 2) the training dataset used for fine-tuning the model, 3) three testing problem sets of different difficulties to serve as benchmarks for future research, and 4) code associated with the above resources.
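The abstract does not say how ChoiceBoost works internally. One generic way to expand a multiple-choice dataset while balancing answer positions is to rotate the options so that the correct answer appears once in every slot. The sketch below shows that idea only; it should not be read as the authors' algorithm, and `rotate_choices` and the sample question are hypothetical.

```python
from typing import Dict, List


def rotate_choices(question: str, choices: List[str],
                   answer_index: int) -> List[Dict]:
    """Return one variant per rotation of the answer options.

    Across the variants, the correct answer occupies every position exactly
    once, so position alone carries no information about correctness.
    """
    n = len(choices)
    variants = []
    for shift in range(n):
        rotated = choices[shift:] + choices[:shift]
        variants.append({
            "question": question,
            "choices": rotated,
            # The correct option moves left by `shift` positions, wrapping around.
            "answer_index": (answer_index - shift) % n,
        })
    return variants


# Hypothetical QA pair, used only to exercise the function.
variants = rotate_choices(
    "Which layer of the OSI model does IP belong to?",
    ["Transport", "Network", "Data link"],
    answer_index=1,
)
```

Each original question yields as many training items as it has options, which grows the dataset by that factor.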


Preliminary study on artificial intelligence methods for cybersecurity threat detection in computer networks based on raw data packets

arXiv.org Artificial Intelligence

Most of the intrusion detection methods in computer networks are based on traffic flow characteristics. However, this approach may not fully exploit the potential of deep learning algorithms to directly extract features and patterns from raw packets. Moreover, it impedes real-time monitoring due to the necessity of waiting for the processing pipeline to complete and introduces dependencies on additional software components. In this paper, we investigate deep learning methodologies capable of detecting attacks in real-time directly from raw packet data within network traffic. We propose a novel approach where packets are stacked into windows and separately recognised, with a 2D image representation suitable for processing with computer vision models. Our investigation utilizes the CIC IDS-2017 dataset, which includes both benign traffic and prevalent real-world attacks, providing a comprehensive foundation for our research.
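The windowing idea can be sketched as follows: each packet becomes one fixed-width row of byte values, and consecutive rows are stacked into a 2D grid that an image model could consume. The row width, window size, zero padding, and non-overlapping windows are assumptions for illustration; the abstract does not give the authors' exact parameters or preprocessing.

```python
from typing import List


def packet_to_row(packet: bytes, width: int) -> List[int]:
    """Truncate or zero-pad a raw packet to exactly `width` byte values."""
    row = list(packet[:width])
    row.extend([0] * (width - len(row)))
    return row


def stack_windows(packets: List[bytes], window_size: int,
                  width: int) -> List[List[List[int]]]:
    """Group packets into non-overlapping windows of shape window_size x width.

    Trailing packets that do not fill a whole window are dropped (an
    assumption made here to keep every image the same shape).
    """
    rows = [packet_to_row(p, width) for p in packets]
    last_start = len(rows) - window_size
    return [rows[i:i + window_size]
            for i in range(0, last_start + 1, window_size)]


# Made-up packet bytes, used only to show the resulting shapes.
packets = [b"\x01\x02\x03", b"\xff", b"\x10\x20\x30\x40\x50", b"", b"\x07"]
images = stack_windows(packets, window_size=2, width=4)
```

Each element of `images` is a grid of values in the range 0 to 255, so it can be treated as a single-channel grayscale image.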


Can LLMs Understand Computer Networks? Towards a Virtual System Administrator

arXiv.org Artificial Intelligence

Recent advancements in Artificial Intelligence, and particularly Large Language Models (LLMs), offer promising prospects for aiding system administrators in managing the complexity of modern networks. However, despite this potential, a significant gap exists in the literature regarding the extent to which LLMs can understand computer networks. Without empirical evidence, system administrators might rely on these models without assurance of their efficacy in performing network-related tasks accurately. In this paper, we are the first to conduct an exhaustive study on LLMs' comprehension of computer networks. We formulate several research questions to determine whether LLMs can provide correct answers when supplied with a network topology and questions on it. To assess them, we developed a thorough framework for evaluating LLMs' capabilities in various network-related tasks. We evaluate our framework on multiple computer networks employing private (e.g., GPT4) and open-source (e.g., Llama2) models. Our findings demonstrate promising results, with the best model achieving an average accuracy of 79.3%. Private LLMs achieve noteworthy results in small and medium networks, while challenges persist in comprehending complex network topologies, particularly for open-source models. Moreover, we provide insight into how prompt engineering can enhance the accuracy of some tasks.