Synthetic Data: AI's New Weapon Against Android Malware
Nogueira, Angelo Gaspar Diniz, Paim, Kayua Oleques, Bragança, Hendrio, Mansilha, Rodrigo Brandão, Kreutz, Diego
arXiv.org Artificial Intelligence
The ever-increasing number of Android devices and the accelerated evolution of malware, which reached over 35 million samples by 2024, highlight the critical importance of effective detection methods. Attackers are now using Artificial Intelligence to create sophisticated malware variations that can easily evade traditional detection techniques. Although machine learning has shown promise in malware classification, its success relies heavily on the availability of up-to-date, high-quality datasets. The scarcity and high cost of obtaining and labeling real malware samples present significant challenges in developing robust detection models. In this paper, we propose MalSynGen, a Malware Synthetic Data Generation methodology that uses a conditional Generative Adversarial Network (cGAN) to generate synthetic tabular data. This data preserves the statistical properties of real-world data and improves the performance of Android malware classifiers. We evaluated the effectiveness of this approach using various datasets and metrics that assess the fidelity of the generated data, its utility in classification, and the computational efficiency of the process. Our experiments demonstrate that MalSynGen can generalize across different datasets, providing a viable solution to the problems of obsolescence and low-quality data in malware detection.

With approximately 3 billion Android devices in operation worldwide [1], the mobile cybersecurity landscape faces formidable challenges. In 2024 alone, Kaspersky reported over 33.3 million cyberattacks targeting smartphone users globally, encompassing diverse forms of malware and unwanted software [2]. Adding to this problem, attackers are using Artificial Intelligence (AI) to rapidly generate new malware variants by exploiting patterns learned from existing malware [3].
Nov-26-2025
- Country:
  - North America > United States
    - Georgia > Fulton County > Atlanta (0.04)
  - South America > Brazil
    - Rio Grande do Sul > Porto Alegre (0.05)
- Genre:
  - Research Report > New Finding (0.94)
- Industry:
  - Government > Military
    - Cyberwarfare (0.68)
  - Information Technology > Security & Privacy (1.00)
- Technology:
  - Information Technology
    - Artificial Intelligence > Machine Learning
      - Neural Networks > Deep Learning (0.68)
      - Performance Analysis > Accuracy (1.00)
      - Statistical Learning (1.00)
    - Communications > Mobile (1.00)
    - Security & Privacy (1.00)