security event
EventHunter: Dynamic Clustering and Ranking of Security Events from Hacker Forum Discussions
Ech-Chammakhy, Yasir, Motii, Anas, Rabii, Anass, Chbili, Jaafar
Hacker forums provide critical early warning signals for emerging cybersecurity threats, but extracting actionable intelligence from their unstructured and noisy content remains a significant challenge. This paper presents an unsupervised framework that automatically detects, clusters, and prioritizes security events discussed across hacker forum posts. Our approach leverages Transformer-based embeddings fine-tuned with contrastive learning to group related discussions into distinct security event clusters, identifying incidents like zero-day disclosures or malware releases without relying on predefined keywords. The framework incorporates a daily ranking mechanism that prioritizes identified events using quantifiable metrics reflecting timeliness, source credibility, information completeness, and relevance. Experimental evaluation on real-world hacker forum data demonstrates that our method effectively reduces noise and surfaces high-priority threats, enabling security analysts to mount proactive responses. By transforming disparate hacker forum discussions into structured, actionable intelligence, our work addresses fundamental challenges in automated threat detection and analysis.
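The daily ranking step can be pictured as a weighted score over the four criteria the abstract names. The sketch below is only an illustration: the `EventCluster` fields, the equal weights, and the linear combination are assumptions, not the scoring formula used in the paper.

```python
from dataclasses import dataclass


@dataclass
class EventCluster:
    # Hypothetical fields; each metric is assumed to be normalized to [0, 1].
    name: str
    timeliness: float
    credibility: float
    completeness: float
    relevance: float


# Assumed equal weights; the abstract does not state how metrics are combined.
WEIGHTS = {
    "timeliness": 0.25,
    "credibility": 0.25,
    "completeness": 0.25,
    "relevance": 0.25,
}


def score(cluster):
    """Weighted sum of the four ranking metrics for one event cluster."""
    return sum(w * getattr(cluster, field) for field, w in WEIGHTS.items())


def rank_daily(clusters):
    """Return the day's clusters ordered from highest to lowest priority."""
    return sorted(clusters, key=score, reverse=True)


clusters = [
    EventCluster("malware release", 0.5, 0.5, 0.5, 0.5),
    EventCluster("zero-day disclosure", 1.0, 0.75, 0.5, 1.0),
    EventCluster("off-topic chatter", 0.25, 0.25, 0.0, 0.0),
]
ranked = rank_daily(clusters)
for cluster in ranked:
    print(cluster.name, round(score(cluster), 4))
```

In this toy input the zero-day cluster ranks first because it scores highest on three of the four metrics; different weights would shift the order.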
The Rate of Learning in Threat Detection
Historically, threat detection (e.g., rule-based intrusion detection, anti-virus systems, and threat intelligence feeds) has been reactive: it flags digital requests containing known signatures. These signatures are formalized post hoc, derived from a compromise that has already happened and was then shared with others. Organizations have relied heavily on these tools, to their disadvantage. The figures below reflect the traditional threat detection paradigm of learning vicariously from peers and highlight how it is at a disadvantage against new or adaptive adversaries. Some things are worth remembering, and past security events are certainly among them, because mistakes are great teachers.
Cerberus: Exploring Federated Prediction of Security Events
Naseri, Mohammad, Han, Yufei, Mariconti, Enrico, Shen, Yun, Stringhini, Gianluca, De Cristofaro, Emiliano
Modern defenses against cyberattacks increasingly rely on proactive approaches, e.g., to predict the adversary's next actions based on past events. Building accurate prediction models requires knowledge from many organizations; alas, this entails disclosing sensitive information, such as network structures, security postures, and policies, which might often be undesirable or outright impossible. In this paper, we explore the feasibility of using Federated Learning (FL) to predict future security events. To this end, we introduce Cerberus, a system enabling collaborative training of Recurrent Neural Network (RNN) models for participating organizations. The intuition is that FL could potentially offer a middle-ground between the non-private approach where the training data is pooled at a central server and the low-utility alternative of only training local models. We instantiate Cerberus on a dataset obtained from a major security company's intrusion prevention product and evaluate it vis-a-vis utility, robustness, and privacy, as well as how participants contribute to and benefit from the system. Overall, our work sheds light on both the positive aspects and the challenges of using FL for this task and paves the way for deploying federated approaches to predictive security.
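Federated Learning is commonly implemented with federated averaging, where a server combines locally trained parameters weighted by each participant's number of training samples. The abstract does not say which aggregation rule Cerberus uses, so the sketch below assumes plain federated averaging over flat parameter lists; `federated_average` and the example numbers are hypothetical.

```python
def federated_average(updates):
    """Combine local model parameters into one global parameter vector.

    `updates` is a list of (params, n_samples) pairs, one per organization.
    Each organization's parameters are weighted by its sample count, so raw
    training data never leaves the participant (only parameters are shared).
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no training samples reported")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for params, n in updates:
        if len(params) != dim:
            raise ValueError("parameter vectors must have the same length")
        for i, p in enumerate(params):
            averaged[i] += p * n / total
    return averaged


# Two hypothetical organizations with 100 and 300 local samples.
global_params = federated_average([([1.0, 2.0], 100), ([3.0, 4.0], 300)])
print(global_params)
```

The larger participant pulls the average toward its own parameters, which is one reason the paper's question of how participants contribute to and benefit from the system matters.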
What Machine Learning Can Do for Security
Machine learning can be applied in various ways in security, for instance, in malware analysis, to make predictions, and for clustering security events. It can also be used to detect previously unknown attacks with no established signature. Wendy Edwards, a software developer interested in the intersection of cybersecurity and data science, spoke about applying machine learning to security at The Diana Initiative 2021. Artificial Intelligence (AI) can be applied to detect anomalies by finding unusual patterns. But unusual doesn't necessarily mean malicious, as Edwards explained: For example, maybe your web server is experiencing higher than usual traffic because something is trending on social media.
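A minimal sketch of the anomaly idea, assuming hourly request counts for a web server and a simple z-score test (neither comes from the talk; the numbers and the threshold of 3 are illustrative). It flags values that are unusual relative to a baseline, which, as noted above, does not by itself mean malicious.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return indexes of observed values far from the baseline mean.

    A value is flagged when it lies more than `threshold` sample standard
    deviations from the baseline mean. Flagged means unusual, not malicious.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A constant baseline: anything different from it is unusual.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


# Hypothetical requests per hour: a quiet baseline, then a traffic spike.
baseline = [100, 104, 98, 101, 97, 102, 99, 103]
observed = [101, 96, 250, 105]
anomalies = find_anomalies(baseline, observed)
print(anomalies)
```

The spike at index 2 is flagged, but deciding whether it is an attack or a trending post still needs context that the detector does not have.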
Applying The Power Of Deep Learning To Cybersecurity
Deep Instinct applies deep learning to cybersecurity, going beyond what machine learning can accomplish with a neural network designed to emulate the human brain and learn as it goes. Cyber attacks are not a new issue by any stretch of the imagination, but they are a rapidly growing threat. As the volume and types of technologies businesses and consumers use continue to expand, the attack surface (the configuration errors, vulnerabilities, human errors, or other weaknesses that increase the potential for a successful cyber attack) increases exponentially. To keep pace with the threat landscape, organizations need to rethink their approach to security. According to AVTest, there are more than 18,000 new malware and/or potentially unwanted applications identified every hour.
Evaluating Standard Feature Sets Towards Increased Generalisability and Explainability of ML-based Network Intrusion Detection
Sarhan, Mohanad, Layeghy, Siamak, Portmann, Marius
Machine Learning (ML)-based network intrusion detection systems bring many benefits for enhancing the cybersecurity posture of an organisation. Many systems have been designed and developed in the research community, often achieving a close to perfect detection rate when evaluated using synthetic datasets. However, the large volume of academic research has rarely translated into practical deployments. There are several causes contributing towards the wide gap between research and production, such as the limited ability to evaluate ML models comprehensively and a lack of understanding of internal ML operations. This paper narrows the gap by evaluating the generalisability of a common feature set to different network environments and attack scenarios. To that end, two feature sets (NetFlow and CICFlowMeter) have been evaluated in terms of detection accuracy across three key datasets, i.e., CSE-CIC-IDS2018, BoT-IoT, and ToN-IoT. The results show the superiority of the NetFlow feature set in enhancing the ML models' detection accuracy for various network attacks. In addition, due to the complexity of the learning models, SHapley Additive exPlanations (SHAP), an explainable AI methodology, has been adopted to explain and interpret the classification decisions of ML models. The Shapley values of the two common feature sets have been analysed across multiple datasets to determine the influence contributed by each feature towards the final ML prediction.
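To make the notion of a per-feature Shapley value concrete, the sketch below computes exact Shapley values by enumerating every coalition of features for a toy value function. This is not the SHAP library or the paper's pipeline: the feature names, the contribution numbers, and the additive `toy_value` function are all hypothetical, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a coalition value function.

    `value` maps a frozenset of feature names to a number, for example the
    model output when only those features are available.
    """
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        result[f] = total
    return result


# Hypothetical flow features with made-up additive contributions.
CONTRIBUTION = {"bytes": 0.5, "duration": 0.25, "port": 0.125}


def toy_value(coalition):
    return sum(CONTRIBUTION[f] for f in coalition)


phi = shapley_values(list(CONTRIBUTION), toy_value)
print(phi)
```

For an additive value function each feature's Shapley value equals its own contribution, and the values always sum to the difference between the full coalition and the empty one.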
The Problem with Artificial Intelligence in Security
If you believe everything you read, artificial intelligence (AI) is the savior of cybersecurity. According to Capgemini, 80% of companies are counting on AI to help identify threats and thwart attacks. That's a big ask to live up to because, in reality, few nonexperts really understand the value of AI to security or whether the technology can effectively address information security's many potential use cases. A cynic would call out the proliferation of claims about using AI for what it is: marketing hype. Even the use of the term "AI" is misleading.
Visa: Using AI To Separate The Good, Bad From Transactions PYMNTS.com
That's the payments volume running over Visa's global network, a network whose vast global expanse is a tempting playground for cyberthieves. Visa's cybersecurity team, as Chief Information Security Officer Sunil Seshadri told Karen Webster, also logs as many as 8 billion security events every day (that's billion with a "b"). Not all events are intrusions or even attempts; they also include routine security logs and regular everyday network activity. These logs provide deep insight into what is happening in Visa's infrastructure and network on a real-time basis. But finding the signal in this noisy data is a challenge.
Diversifying Database Activity Monitoring with Bandits
Grushka-Cohen, Hagit, Biller, Ofer, Sofer, Oded, Rokach, Lior, Shapira, Bracha
Database activity monitoring (DAM) systems are commonly used by organizations to protect organizational data, knowledge, and intellectual property. To protect an organization's databases, DAM systems have two main roles: monitoring (documenting activity) and alerting on anomalous activity. Due to high-velocity streams and operating costs, such systems are restricted to examining only a sample of the activity. Current solutions use policies, manually crafted by experts, to decide which transactions to monitor and log. This limits the diversity of the data collected. Bandit algorithms, which use reward functions as the basis for optimization while adding diversity to the recommended set, have gained increased attention in recommendation systems for improving diversity. In this work, we redefine the data sampling problem as a special case of the multi-armed bandit (MAB) problem and present a novel algorithm, which combines expert knowledge with random exploration. We analyze the effect of diversity on coverage and downstream event detection tasks using a simulated dataset. In doing so, we find that adding diversity to the sampling using the bandit-based approach works well for this task, maximizing population coverage without decreasing the quality of the alerts issued about events.
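One simple way to combine expert knowledge with random exploration is an epsilon-greedy sampling rule, sketched below. This is not the paper's algorithm: the fixed expert risk score stands in for a learned reward estimate, and `sample_transactions`, the transaction tuples, and the default `epsilon` are hypothetical.

```python
import random


def sample_transactions(transactions, expert_risk, budget, epsilon=0.2, rng=None):
    """Pick up to `budget` transactions to log.

    With probability `epsilon` a random transaction is chosen (exploration,
    which adds diversity); otherwise the one the expert policy scores as
    riskiest is chosen (exploitation of expert knowledge).
    """
    rng = rng or random.Random()
    remaining = list(transactions)
    chosen = []
    while remaining and len(chosen) < budget:
        if rng.random() < epsilon:
            pick = rng.choice(remaining)
        else:
            pick = max(remaining, key=expert_risk)
        remaining.remove(pick)
        chosen.append(pick)
    return chosen


# Hypothetical (user, expert risk score) pairs.
transactions = [("alice", 0.9), ("bob", 0.1), ("carol", 0.7), ("dave", 0.2)]


def risk(transaction):
    return transaction[1]


greedy = sample_transactions(transactions, risk, budget=2, epsilon=0.0)
mixed = sample_transactions(transactions, risk, budget=2, epsilon=0.5,
                            rng=random.Random(0))
print(greedy, mixed)
```

With `epsilon=0.0` the rule reduces to the purely expert-driven policy, so low-risk users such as "bob" are never logged; raising `epsilon` trades some expert-ranked picks for broader coverage of the population.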