Machine Learning (In) Security: A Stream of Problems
Ceschin, Fabrício, Botacin, Marcus, Bifet, Albert, Pfahringer, Bernhard, Oliveira, Luiz S., Gomes, Heitor Murilo, Grégio, André
arXiv.org Artificial Intelligence
Machine Learning (ML) has been widely applied to cybersecurity and is considered state-of-the-art for solving many of the open issues in that field. However, it is very difficult to evaluate how good the produced solutions are, since the challenges faced in security may not appear in other areas. One of these challenges is concept drift, which intensifies the existing arms race between attackers and defenders: malicious actors can always create novel threats to overcome defense solutions, and some approaches do not account for such threats. For this reason, it is essential to know how to properly build and evaluate an ML-based security solution. In this paper, we identify, detail, and discuss the main challenges in the correct application of ML techniques to cybersecurity data. We evaluate how concept drift, evolution, delayed labels, and adversarial ML impact existing solutions. Moreover, we address how issues related to data collection affect the quality of the results presented in the security literature, showing that new strategies are needed to improve current solutions. Finally, we show how existing solutions may fail under certain circumstances and propose mitigations for them, presenting a novel checklist to help the development of future ML solutions for cybersecurity.
Sep-4-2023
- Country:
- South America > Brazil
- Oceania > New Zealand
- North Island
- Wellington Region > Wellington (0.04)
- Waikato > Hamilton (0.04)
- North America
- United States
- Florida > Orange County
- Orlando (0.04)
- North Carolina > Wake County
- Raleigh (0.04)
- Colorado > Denver County
- Denver (0.04)
- Texas
- Brazos County > College Station (0.14)
- Travis County > Austin (0.04)
- Georgia > Fulton County
- Atlanta (0.04)
- Massachusetts
- Suffolk County > Boston (0.04)
- Middlesex County > Cambridge (0.04)
- California
- San Francisco County > San Francisco (0.14)
- Santa Clara County > Santa Clara (0.04)
- San Mateo County > Burlingame (0.04)
- New York > New York County
- New York City (0.14)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- Puerto Rico > San Juan
- San Juan (0.04)
- Canada
- Ontario > Toronto (0.14)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- Austria > Vienna (0.14)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Greece > Attica
- Athens (0.04)
- Germany
- Berlin (0.04)
- Bavaria > Upper Bavaria
- Ingolstadt (0.04)
- Asia
- Middle East > Jordan (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- Africa > Cameroon
- Far North Region > Maroua (0.04)
- Genre:
- Research Report > New Finding (0.45)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Government > Military
- Cyberwarfare (0.91)
- Technology: