Scoring the Unscorables: Cyber Risk Assessment Beyond Internet Scans
Sarabi, Armin, Karir, Manish, Liu, Mingyan
arXiv.org Artificial Intelligence
In this paper, we present a study on using novel data types to perform cyber risk quantification by estimating the likelihood of a data breach. We demonstrate that it is feasible to build a highly accurate cyber risk assessment model using public, readily available technology signatures obtained by crawling an organization's website. This approach overcomes the limitations of previous approaches that relied on large-scale IP address-based scanning data, which suffer from incomplete or missing IP address mappings and from the lack of such data for large numbers of small and medium-sized organizations (SMEs). Compared with scan data, technology signature data is more readily available for millions of SMEs. Our study shows that there is a strong relationship between these technology signatures and an organization's cybersecurity posture. In cross-validating our model on different cyber incident datasets, we also highlight the key differences between ransomware attack victims and the larger population of cyber incident and data breach victims.
Jun-10-2025
- Country:
  - Asia > Middle East
    - Iran > East Azerbaijan Province > Tabriz (0.04)
  - Europe > Netherlands
    - South Holland > Rijswijk (0.04)
  - North America > United States
    - Michigan > Washtenaw County > Ann Arbor (0.14)
- Genre:
  - Research Report (1.00)
- Industry:
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Machine Learning
        - Ensemble Learning (0.46)
        - Neural Networks (0.46)
        - Performance Analysis (0.46)
        - Statistical Learning (0.46)
      - Natural Language (1.00)
    - Communications
      - Networks (0.87)
      - Social Media (1.00)
      - Web (0.94)
    - Data Science > Data Mining (1.00)
    - Information Management > Search (1.00)
    - Security & Privacy (1.00)