Leveraging VAE-Derived Latent Spaces for Enhanced Malware Detection with Machine Learning Classifiers
Ajayi, Bamidele, Barakat, Basel, McGarry, Ken
arXiv.org Artificial Intelligence
This paper assesses the performance of five machine learning classifiers (Decision Tree, Naive Bayes, LightGBM, Logistic Regression, and Random Forest) trained on latent representations learned by a Variational Autoencoder from malware datasets. Experiments across different training-test splits and random seeds show that all models detect malware well, with the ensemble methods (LightGBM and Random Forest) performing slightly better than the rest. In addition, using latent features reduces the models' computational cost and the need for extensive hyperparameter tuning, making them more efficient to deploy. Statistical tests show that these improvements are significant, establishing the practical relevance of integrating latent space representations with traditional classifiers for effective malware detection in cybersecurity.

In today's hyperconnected world, malware attacks have risen to concerning proportions, presenting substantial challenges for cybersecurity. Sophisticated malware variants, such as viruses, worms, and ransomware, are progressively adept at circumventing traditional detection methods. The increasing complexity of these threats, with consequences ranging from financial losses to critical infrastructure breaches, demands the creation of more resilient and adaptive strategies for malware detection and classification.
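The abstract does not describe the encoder architecture, so the following is only a minimal sketch of how a latent feature vector is typically obtained from a Variational Autoencoder in its standard formulation. It assumes the encoder outputs a mean and a log-variance per latent dimension. The function names, the four-dimensional example values, and the choice to pass the deterministic mean to the classifier are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), the regulariser in the VAE loss."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def latent_features(mu, log_var, rng=None):
    """Return the feature vector handed to a downstream classifier.

    With rng=None the deterministic mean is returned (an assumption made
    here so that features are reproducible); otherwise a stochastic
    sample is drawn with the reparameterization trick.
    """
    if rng is None:
        return list(mu)
    return reparameterize(mu, log_var, rng)


# Hypothetical encoder output for one malware sample (4 latent dimensions).
mu = [0.5, -1.0, 0.0, 2.0]
log_var = [0.0, -2.0, 0.0, -4.0]

features = latent_features(mu, log_var)                  # deterministic
sample = latent_features(mu, log_var, random.Random(0))  # stochastic
kl = kl_to_standard_normal(mu, log_var)
```

In a pipeline like the one described, vectors such as `features` would replace the raw malware features as input to classifiers such as Decision Tree or Random Forest. Because the latent vector is much shorter than the raw input, the downstream models are cheaper to train, which is in line with the efficiency claim above.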
Mar-24-2025
- Country:
- Europe > Switzerland
- Basel-City > Basel (0.04)
- Asia
- Middle East > UAE
- Dubai Emirate > Dubai (0.04)
- China > Beijing
- Beijing (0.04)
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology:
- Information Technology > Artificial Intelligence > Machine Learning
- Neural Networks > Deep Learning (1.00)
- Performance Analysis > Accuracy (0.92)
- Statistical Learning (0.92)
- Decision Tree Learning (0.84)