cti
Large Language Models for Explainable Threat Intelligence
Dinis, Tiago, Correia, Miguel, Tavares, Roger
As cyber threats continue to grow in complexity, traditional security mechanisms struggle to keep up. Large language models (LLMs) offer significant potential in cybersecurity due to their advanced capabilities in text processing and generation. This paper explores the use of LLMs with retrieval-augmented generation (RAG) to obtain threat intelligence by combining real-time information retrieval with domain-specific data. The proposed system, RAGRecon, uses an LLM with RAG to answer questions about cybersecurity threats. Moreover, it makes this form of Artificial Intelligence (AI) explainable by generating a knowledge graph for every reply and presenting it visually to the user. This increases the transparency and interpretability of the model's reasoning, allowing analysts to better understand the connections made by the system based on the context retrieved by the RAG system. We evaluated RAGRecon experimentally with two datasets and seven different LLMs, and the responses matched the reference responses more than 91% of the time for the best combinations.
- Workflow (0.68)
- Research Report (0.50)
- Overview (0.46)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
FALCON: Autonomous Cyber Threat Intelligence Mining with LLMs for IDS Rule Generation
Mitra, Shaswata, Bazarov, Azim, Duclos, Martin, Mittal, Sudip, Piplai, Aritran, Rahman, Md Rayhanur, Zieglar, Edward, Rahimi, Shahram
Signature-based Intrusion Detection Systems (IDS) detect malicious activities by matching network or host activity against predefined rules. These rules are derived from extensive Cyber Threat Intelligence (CTI), which includes attack signatures and behavioral patterns obtained through automated tools and manual threat analysis, such as sandboxing. The CTI is then transformed into actionable rules for the IDS engine, enabling real-time detection and prevention. However, the constant evolution of cyber threats necessitates frequent rule updates, which delay deployment and weaken overall security readiness. Recent advancements in agentic systems powered by Large Language Models (LLMs) offer the potential for autonomous IDS rule generation with internal evaluation. We introduce FALCON, an autonomous agentic framework that generates deployable IDS rules from CTI data in real time and evaluates them using built-in multi-phased validators. To demonstrate versatility, we target both network (Snort) and host-based (YARA) mediums and construct a comprehensive dataset of IDS rules with their corresponding CTIs. Our evaluations indicate that FALCON excels in automatic rule generation, with an average of 95% accuracy validated by qualitative evaluation with 84% inter-rater agreement among multiple cybersecurity analysts across all metrics. These results underscore the feasibility and effectiveness of LLM-driven data mining for real-time cyber threat mitigation.
- North America > United States > Texas (0.04)
- North America > United States > Mississippi (0.04)
- North America > United States > Alabama (0.04)
- Asia > Middle East > Yemen > Amanat Al Asimah > Sanaa (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.50)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
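The signature matching described in the FALCON abstract above (activity checked against predefined rules) can be illustrated with a minimal sketch. This is not FALCON's method and not Snort or YARA syntax: the `Rule` class, the two example patterns, and `match_rules` are all hypothetical names chosen for illustration, and real rule languages are considerably richer.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical, simplified signature: a name plus a byte pattern."""
    name: str
    pattern: re.Pattern


# Illustrative rules only; these are not real Snort or YARA signatures.
RULES = [
    Rule("suspicious-powershell", re.compile(rb"powershell\s+-enc", re.IGNORECASE)),
    Rule("path-traversal", re.compile(rb"\.\./\.\./")),
]


def match_rules(payload: bytes, rules=RULES) -> list[str]:
    """Return the names of all rules whose pattern occurs in the payload."""
    return [rule.name for rule in rules if rule.pattern.search(payload)]
```

Calling `match_rules(b"GET /../../etc/passwd HTTP/1.1")` returns the names of the rules that fire on that payload; a payload that matches no pattern yields an empty list.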
Cross-Technology Interference: Detection, Avoidance, and Coexistence Mechanisms in the ISM Bands
Kidane, Zegeye Mekasha, Dargie, Waltenegus
A large number of heterogeneous wireless networks share the unlicensed spectrum designated as the ISM (Industrial, Scientific, and Medical) radio band. These networks do not adhere to a common medium access rule and differ considerably in their specifications. As a result, when concurrently active, they cause cross-technology interference (CTI) to one another. The effect of this interference is not reciprocal: networks using high transmission power and advanced transmission schemes often cause disproportionate disruptions to those with modest communication and computation resources. CTI corrupts packets, incurs packet retransmission costs, introduces end-to-end latency and jitter, and makes networks unpredictable. The purpose of this paper is to closely examine the impact of CTI on low-power networks based on the IEEE 802.15.4 standard. It discusses the latest developments in CTI detection, coexistence, and avoidance mechanisms, as well as messaging schemes that attempt to enable heterogeneous networks to communicate directly with one another to coordinate packet transmission and channel assignment.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy (0.04)
- Europe > Germany > Rhineland-Palatinate (0.04)
- Telecommunications > Networks (0.68)
- Information Technology > Security & Privacy (0.46)
Automatic Mapping of Unstructured Cyber Threat Intelligence: An Experimental Study
Orbinato, Vittorio, Barbaraci, Mariarosaria, Natella, Roberto, Cotroneo, Domenico
Proactive approaches to security, such as adversary emulation, leverage information about threat actors and their techniques (Cyber Threat Intelligence, CTI). However, most CTI still comes in unstructured forms (i.e., natural language), such as incident reports and leaked documents. To support proactive security efforts, we present an experimental study on the automatic classification of unstructured CTI into attack techniques using machine learning (ML). We contribute two new datasets for CTI analysis, and we evaluate several ML models, including both traditional and deep learning-based ones. We present several lessons learned about how well ML can perform at this task, which classifiers perform best and under which conditions, what the main causes of classification errors are, and the challenges ahead for CTI analysis.
- Asia > Japan (0.04)
- North America > Panama (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
A Coupling Approach to Analyzing Games with Dynamic Environments
Collins, Brandon C., Xu, Shouhuai, Brown, Philip N.
The theory of learning in games has extensively studied situations where agents respond dynamically to each other by optimizing a fixed utility function. However, in real situations, the strategic environment varies as a result of past agent choices. Unfortunately, the analysis techniques that enabled a rich characterization of the emergent behavior in static environment games fail to cope with dynamic environment games. To address this, we develop a general framework using probabilistic couplings to extend the analysis of static environment games to dynamic ones. Using this approach, we obtain sufficient conditions under which traditional characterizations of Nash equilibria with best response dynamics and stochastic stability with log-linear learning can be extended to dynamic environment games. As case studies, we pose a model of cyber threat intelligence sharing between firms and a simple dynamic game-theoretic model of social precautions in an epidemic, both of which feature dynamic environments. For both examples, we obtain conditions under which the emergent behavior in the dynamic game is characterized by performing the traditional analysis on a reference static environment game.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.05)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.93)
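The two learning rules named in the abstract above, best response dynamics and log-linear learning, can be sketched for a single agent facing a fixed utility vector (that is, the static environment setting). This is only a minimal illustration of the standard rules, not the paper's coupling framework; the function names and the `temperature` parameter name are assumptions made for this sketch.

```python
import math


def log_linear_probs(utilities: list[float], temperature: float) -> list[float]:
    """Probability of each action: proportional to exp(utility / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(utilities)
    weights = [math.exp((u - top) / temperature) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def best_response(utilities: list[float]) -> int:
    """Index of a utility-maximizing action (ties go to the lowest index)."""
    return max(range(len(utilities)), key=lambda i: utilities[i])
```

As the temperature approaches zero, the log-linear probabilities concentrate on the best-response action; at higher temperatures the agent explores suboptimal actions more often.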
What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey
Rahman, Md Rayhanur, Mahdavi-Hezaveh, Rezvan, Williams, Laurie
Cybersecurity researchers have contributed to the automated extraction of CTI from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to help cybersecurity researchers understand the current techniques used for cyberthreat intelligence extraction from text through a survey of relevant studies in the literature. We systematically collect "CTI extraction from text"-related studies from the literature and categorize the CTI extraction purposes. We propose a CTI extraction pipeline abstracted from these studies. We identify the data sources, techniques, and CTI sharing formats utilized in the context of the proposed pipeline. Our work finds ten types of extraction purposes, such as the extraction of indicators of compromise, TTPs (tactics, techniques, and procedures of attack), and cybersecurity keywords. We also identify seven types of textual sources for CTI extraction; textual data obtained from hacker forums, threat reports, social media posts, and online news articles have been used by almost 90% of the studies. Natural language processing, along with both supervised and unsupervised machine learning techniques such as named entity recognition, topic modelling, dependency parsing, supervised classification, and clustering, is used for CTI extraction. We observe technical challenges associated with these studies related to obtaining available clean, labelled data, which could assure replication, validation, and further extension of the studies. As we find that the studies focus on CTI information extraction from text, we advocate building upon the current CTI extraction work to help cybersecurity practitioners with proactive decision making, such as threat prioritization and automated threat modelling that utilize knowledge from past cybersecurity incidents.
- North America > United States > California > Santa Clara County > Palo Alto (0.14)
- North America > United States > Utah (0.04)
- North America > United States > Virginia (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
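One extraction purpose named in the survey abstract above, indicators of compromise, can be illustrated with a deliberately simple regular-expression sketch. Note that this swaps in plain pattern matching for the NLP and machine learning techniques the survey covers; the patterns, the `extract_iocs` function, and the choice of indicator types (IPv4 addresses, MD5 and SHA-256 hashes) are assumptions made for illustration.

```python
import re

# Dotted-quad IPv4 address with each octet limited to 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b")
# Hex digests, recognized by length only (32 hex chars = MD5, 64 = SHA-256).
MD5 = re.compile(r"\b[a-fA-F0-9]{32}\b")
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")


def extract_iocs(text: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted indicators found in free text."""
    return {
        "ipv4": sorted(set(IPV4.findall(text))),
        "md5": sorted({m.lower() for m in MD5.findall(text)}),
        "sha256": sorted({m.lower() for m in SHA256.findall(text)}),
    }
```

Length-based hash matching will also pick up any other 32- or 64-character hex string, so a practical extractor would need context or validation beyond this sketch.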
Integration of Pre-trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding
Seo, Seunghyun, Kwak, Donghyun, Lee, Bowon
Most End-to-End (E2E) spoken language understanding (SLU) networks leverage pre-trained automatic speech recognition (ASR) networks but still lack the capability to understand the semantics of utterances, which is crucial for the SLU task. To solve this, recently proposed studies use pre-trained natural language understanding (NLU) networks. However, it is not trivial to fully utilize both pre-trained networks; many solutions have been proposed, such as Knowledge Distillation, cross-modal shared embedding, and network integration with an interface. We propose a simple and robust integration method for the E2E SLU network with a novel interface, the Continuous Token Interface (CTI), the junctional representation of the ASR and NLU networks when both networks are pre-trained with the same vocabulary. Because the only difference is the noise level, we directly feed the ASR network's output to the NLU network. Thus, we can train our SLU network in an E2E manner without additional modules, such as Gumbel-Softmax. We evaluate our model using SLURP, a challenging SLU dataset, and achieve state-of-the-art scores on both intent classification and slot filling tasks. We also verify that the NLU network, pre-trained with Masked Language Model, can utilize a noisy textual representation of CTI. Moreover, we show that our model can be trained with multi-task learning from heterogeneous data even after integration with CTI.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Pennsylvania (0.04)
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
Ranade, Priyanka, Piplai, Aritran, Mittal, Sudip, Joshi, Anupam, Finin, Tim
Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these systems. Adversaries can use fake CTI examples as training input to subvert cyber-defense systems, forcing the model to learn incorrect inputs to serve their malicious needs. In this paper, we automatically generate fake CTI text descriptions using transformers. We show that, given an initial prompt sentence, a public language model like GPT-2, with fine-tuning, can generate plausible CTI text capable of corrupting cyber-defense systems. We utilize the generated fake CTI text to perform a data poisoning attack on a Cybersecurity Knowledge Graph (CKG) and a cybersecurity corpus. The poisoning attack introduced adverse impacts such as returning incorrect reasoning outputs, representation poisoning, and corruption of other dependent AI-based cyber-defense systems. We evaluate with traditional approaches and conduct a human evaluation study with cybersecurity professionals and threat hunters. Based on the study, professional threat hunters were equally likely to consider our fake generated CTI as true.
- North America > United States > North Carolina > New Hanover County > Wilmington (0.04)
- North America > United States > Maryland > Baltimore County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Research Report (1.00)
- Overview (0.68)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)