EventHunter: Dynamic Clustering and Ranking of Security Events from Hacker Forum Discussions
Ech-Chammakhy, Yasir, Motii, Anas, Rabii, Anass, Chbili, Jaafar
arXiv.org Artificial Intelligence
Hacker forums provide critical early warning signals for emerging cybersecurity threats, but extracting actionable intelligence from their unstructured and noisy content remains a significant challenge. This paper presents an unsupervised framework that automatically detects, clusters, and prioritizes security events discussed across hacker forum posts. Our approach leverages Transformer-based embeddings fine-tuned with contrastive learning to group related discussions into distinct security event clusters, identifying incidents like zero-day disclosures or malware releases without relying on predefined keywords. The framework incorporates a daily ranking mechanism that prioritizes identified events using quantifiable metrics reflecting timeliness, source credibility, information completeness, and relevance. Experimental evaluation on real-world hacker forum data demonstrates that our method effectively reduces noise and surfaces high-priority threats, enabling security analysts to mount proactive responses. By transforming disparate hacker forum discussions into structured, actionable intelligence, our work addresses fundamental challenges in automated threat detection and analysis.
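As an illustration of the daily ranking step, the following minimal sketch scores each event cluster on the four criteria named in the abstract and sorts the clusters by the result. The abstract does not say how the criteria are measured or combined, so the equal weights, the `EventCluster` type, the `priority` and `rank_events` functions, and the example scores are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass

# Hypothetical equal weights: the abstract names the four criteria
# but does not state how they are combined.
WEIGHTS = {
    "timeliness": 0.25,
    "credibility": 0.25,
    "completeness": 0.25,
    "relevance": 0.25,
}


@dataclass
class EventCluster:
    """One clustered security event with illustrative scores in [0, 1]."""
    label: str
    timeliness: float
    credibility: float
    completeness: float
    relevance: float


def priority(event, weights=WEIGHTS):
    """Weighted sum of the four criterion scores (an assumed combination rule)."""
    return sum(w * getattr(event, name) for name, w in weights.items())


def rank_events(events, weights=WEIGHTS):
    """Return the events sorted from highest to lowest priority."""
    return sorted(events, key=lambda e: priority(e, weights), reverse=True)


# Made-up scores for three clusters observed on the same day.
events = [
    EventCluster("malware release", 0.5, 0.6, 0.7, 0.8),
    EventCluster("off-topic chatter", 0.9, 0.2, 0.1, 0.1),
    EventCluster("zero-day disclosure", 0.9, 0.8, 0.6, 0.9),
]
ranked = rank_events(events)
for event in ranked:
    print(f"{event.label}: {priority(event):.3f}")
```

A weighted sum is only one possible choice. It keeps each criterion's contribution visible to an analyst, and the weights can be changed without touching the ranking logic.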
Jul-15-2025
- Country:
- Africa > Middle East
- Morocco > Casablanca-Settat Region > Casablanca (0.04)
- Asia
- China > Zhejiang Province
- Hangzhou (0.04)
- Malaysia > Kuala Lumpur
- Kuala Lumpur (0.04)
- Europe
- France > Île-de-France
- Germany > North Rhine-Westphalia
- Cologne Region > Cologne (0.04)
- Italy > Tuscany
- Florence (0.04)
- Switzerland > Basel-City
- Basel (0.04)
- North America
- Mexico > Mexico City
- Mexico City (0.04)
- United States > Louisiana
- Orleans Parish > New Orleans (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Government > Military
- Cyberwarfare (0.70)
- Information Technology > Security & Privacy (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (1.00)
- Statistical Learning (0.93)
- Natural Language
- Large Language Model (1.00)
- Text Processing (1.00)
- Representation & Reasoning (0.93)
- Communications > Social Media (1.00)
- Data Science > Data Mining (1.00)
- Security & Privacy (1.00)