Moubayed, Abdallah
A Deep Learning Approach Towards Student Performance Prediction in Online Courses: Challenges Based on a Global Perspective
Moubayed, Abdallah, Injadat, MohammadNoor, Alhindawi, Nouh, Samara, Ghassan, Abuasal, Sara, Alazaidah, Raed
Analyzing and evaluating students' progress in any learning environment is stressful and time consuming if done using traditional analysis methods. This is further exasperated by the increasing number of students due to the shift of focus toward integrating the Internet technologies in education and the focus of academic institutions on moving toward e-Learning, blended, or online learning models. As a result, the topic of student performance prediction has become a vibrant research area in recent years. To address this, machine learning and data mining techniques have emerged as a viable solution. To that end, this work proposes the use of deep learning techniques (CNN and RNN-LSTM) to predict the students' performance at the midpoint stage of the online course delivery using three distinct datasets collected from three different regions of the world. Experimental results show that deep learning models have promising performance as they outperform other optimized traditional ML models in two of the three considered datasets while also having comparable performance for the third dataset.
Forensic Data Analytics for Anomaly Detection in Evolving Networks
Yang, Li, Moubayed, Abdallah, Shami, Abdallah, Boukhtouta, Amine, Heidari, Parisa, Preda, Stere, Brunner, Richard, Migault, Daniel, Larabi, Adel
In the prevailing convergence of traditional infrastructure-based deployment (i.e., Telco and industry operational networks) towards evolving deployments enabled by 5G and virtualization, there is a keen interest in elaborating effective security controls to protect these deployments in-depth. By considering key enabling technologies like 5G and virtualization, evolving networks are democratized, facilitating the establishment of point presences integrating different business models ranging from media, dynamic web content, gaming, and a plethora of IoT use cases. Despite the increasing services provided by evolving networks, many cybercrimes and attacks have been launched in evolving networks to perform malicious activities. Due to the limitations of traditional security artifacts (e.g., firewalls and intrusion detection systems), the research on digital forensic data analytics has attracted more attention. Digital forensic analytics enables people to derive detailed information and comprehensive conclusions from different perspectives of cybercrimes to assist in convicting criminals and preventing future crimes. This chapter presents a digital analytics framework for network anomaly detection, including multi-perspective feature engineering, unsupervised anomaly detection, and comprehensive result correction procedures. Experiments on real-world evolving network data show the effectiveness of the proposed forensic data analytics solution.
Multi-Perspective Content Delivery Networks Security Framework Using Optimized Unsupervised Anomaly Detection
Yang, Li, Moubayed, Abdallah, Shami, Abdallah, Heidari, Parisa, Boukhtouta, Amine, Larabi, Adel, Brunner, Richard, Preda, Stere, Migault, Daniel
Content delivery networks (CDNs) provide efficient content distribution over the Internet. CDNs improve the connectivity and efficiency of global communications, but their caching mechanisms may be breached by cyber-attackers. Among the security mechanisms, effective anomaly detection forms an important part of CDN security enhancement. In this work, we propose a multi-perspective unsupervised learning framework for anomaly detection in CDNs. In the proposed framework, a multi-perspective feature engineering approach, an optimized unsupervised anomaly detection model that utilizes an isolation forest and a Gaussian mixture model, and a multi-perspective validation method, are developed to detect abnormal behaviors in CDNs mainly from the client Internet Protocol (IP) and node perspectives, therefore to identify the denial of service (DoS) and cache pollution attack (CPA) patterns. Experimental results are presented based on the analytics of eight days of real-world CDN log data provided by a major CDN operator. Through experiments, the abnormal contents, compromised nodes, malicious IPs, as well as their corresponding attack types, are identified effectively by the proposed framework and validated by multiple cybersecurity experts. This shows the effectiveness of the proposed method when applied to real-world CDN data.
MTH-IDS: A Multi-Tiered Hybrid Intrusion Detection System for Internet of Vehicles
Yang, Li, Moubayed, Abdallah, Shami, Abdallah
Modern vehicles, including connected vehicles and autonomous vehicles, nowadays involve many electronic control units connected through intra-vehicle networks to implement various functionalities and perform actions. Modern vehicles are also connected to external networks through vehicle-to-everything technologies, enabling their communications with other vehicles, infrastructures, and smart devices. However, the improving functionality and connectivity of modern vehicles also increase their vulnerabilities to cyber-attacks targeting both intra-vehicle and external networks due to the large attack surfaces. To secure vehicular networks, many researchers have focused on developing intrusion detection systems (IDSs) that capitalize on machine learning methods to detect malicious cyber-attacks. In this paper, the vulnerabilities of intra-vehicle and external networks are discussed, and a multi-tiered hybrid IDS that incorporates a signature-based IDS and an anomaly-based IDS is proposed to detect both known and unknown attacks on vehicular networks. Experimental results illustrate that the proposed system can detect various types of known attacks with 99.99% accuracy on the CAN-intrusion-dataset representing the intra-vehicle network data and 99.88% accuracy on the CICIDS2017 dataset illustrating the external vehicular network data. For the zero-day attack detection, the proposed system achieves high F1-scores of 0.963 and 0.800 on the above two datasets, respectively. The average processing time of each data packet on a vehicle-level machine is less than 0.6 ms, which shows the feasibility of implementing the proposed system in real-time vehicle systems. This emphasizes the effectiveness and efficiency of the proposed IDS.
Machine Learning Towards Intelligent Systems: Applications, Challenges, and Opportunities
Injadat, MohammadNoor, Moubayed, Abdallah, Nassif, Ali Bou, Shami, Abdallah
The emergence and continued reliance on the Internet and related technologies has resulted in the generation of large amounts of data that can be made available for analyses. However, humans do not possess the cognitive capabilities to understand such large amounts of data. Machine learning (ML) provides a mechanism for humans to process large amounts of data, gain insights about the behavior of the data, and make more informed decision based on the resulting analysis. ML has applications in various fields. This review focuses on some of the fields and applications such as education, healthcare, network security, banking and finance, and social media. Within these fields, there are multiple unique challenges that exist. However, ML can provide solutions to these challenges, as well as create further research opportunities. Accordingly, this work surveys some of the challenges facing the aforementioned fields and presents some of the previous literature works that tackled them. Moreover, it suggests several research opportunities that benefit from the use of ML to address these challenges.
Ensemble-based Feature Selection and Classification Model for DNS Typo-squatting Detection
Moubayed, Abdallah, Aqeeli, Emad, Shami, Abdallah
Domain Name System (DNS) plays in important role in the current IP-based Internet architecture. This is because it performs the domain name to IP resolution. However, the DNS protocol has several security vulnerabilities due to the lack of data integrity and origin authentication within it. This paper focuses on one particular security vulnerability, namely typo-squatting. Typo-squatting refers to the registration of a domain name that is extremely similar to that of an existing popular brand with the goal of redirecting users to malicious/suspicious websites. The danger of typo-squatting is that it can lead to information threat, corporate secret leakage, and can facilitate fraud. This paper builds on our previous work in [1], which only proposed majority-voting based classifier, by proposing an ensemble-based feature selection and bagging classification model to detect DNS typo-squatting attack. Experimental results show that the proposed framework achieves high accuracy and precision in identifying the malicious/suspicious typo-squatting domains (a loss of at most 1.5% in accuracy and 5% in precision when compared to the model that used the complete feature set) while having a lower computational complexity due to the smaller feature set (a reduction of more than 50% in feature set size).