Cybersecurity



2 Men Linked to China's Salt Typhoon Hacker Group Likely Trained in a Cisco 'Academy'

WIRED

The names of two partial owners of firms linked to the Salt Typhoon hacker group also appeared in records for a Cisco training program--years before the group targeted Cisco's devices in a spy campaign. Cisco's Networking Academy, a global training program designed to educate IT students in the basics of IT networks and cybersecurity, proudly touts its accessibility to participants around the world: "We believe education can be the ultimate equalizer, enabling anyone, regardless of background, to develop expertise and shape their destiny in a digital era," reads the first line on its website. That laudable statement, however, reads a bit differently when the "destiny" of those students appears to be owning a majority stake in companies linked to one of the most successful Chinese state-sponsored hacking operations ever to target the West--and many of Cisco's own products. That's the surprising conclusion of Dakota Cary, a researcher at cybersecurity firm SentinelOne and the Atlantic Council, who, like many security analysts, has closely tracked the Chinese state-sponsored hacker group known as Salt Typhoon. That cyberespionage group gained notoriety last year when it was revealed that the hackers had penetrated at least nine telecom companies and gained the ability to spy on Americans' real-time calls and texts, specifically targeting then-presidential and vice presidential candidates Donald Trump and JD Vance, among many others.


The Road of Adaptive AI for Precision in Cybersecurity

Garg, Sahil

arXiv.org Artificial Intelligence

Cybersecurity's evolving complexity presents unique challenges and opportunities for AI research and practice. This paper shares key lessons and insights from designing, building, and operating production-grade GenAI pipelines in cybersecurity, with a focus on the continual adaptation required to keep pace with ever-shifting knowledge bases, tooling, and threats. Our goal is to provide an actionable perspective for AI practitioners and industry stakeholders navigating the frontier of GenAI for cybersecurity, with particular attention to how different adaptation mechanisms complement each other in end-to-end systems. We present practical guidance derived from real-world deployments, propose best practices for leveraging retrieval- and model-level adaptation, and highlight open research directions for making GenAI more robust, precise, and auditable in cyber defense. Disclaimer: The ideas and analysis presented here are subjective. We share them based on our experience of establishing robust and efficient pipelines of generative AI for cybersecurity. In the age of generative AI, the objective of this document is not to provide generic descriptions of GenAI techniques, but rather to explain their practical relevance for specific contexts, and to illustrate where particular choices have worked well or poorly in our own deployments.


AgenticCyber: A GenAI-Powered Multi-Agent System for Multimodal Threat Detection and Adaptive Response in Cybersecurity

Roy, Shovan

arXiv.org Artificial Intelligence

The increasing complexity of cyber threats in distributed environments demands advanced frameworks for real-time detection and response across multimodal data streams. This paper introduces AgenticCyber, a generative AI powered multi-agent system that orchestrates specialized agents to monitor cloud logs, surveillance videos, and environmental audio concurrently. The solution achieves 96.2% F1-score in threat detection, reduces response latency to 420 ms, and enables adaptive security posture management using multimodal language models like Google's Gemini coupled with LangChain for agent orchestration. Evaluations on benchmark datasets, such as AWS CloudTrail logs, UCF-Crime video frames, and UrbanSound8K audio clips, show the system outperforming standard intrusion detection systems, reducing mean time to respond (MTTR) by 65% and improving situational awareness. This work introduces a scalable, modular proactive cybersecurity architecture for enterprise networks and IoT ecosystems that overcomes siloed security technologies with cross-modal reasoning and automated remediation.
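The modality-routing idea behind such a system can be sketched in plain Python. This is a minimal, hypothetical illustration only: the class names, the dispatch scheme, and the trivial keyword-based scoring are assumptions for clarity, not the paper's actual LangChain- or Gemini-based implementation.

```python
from dataclasses import dataclass

@dataclass
class Event:
    modality: str   # "log", "video", or "audio"
    payload: str

class Agent:
    """A per-modality analyzer; real agents would call a multimodal LLM."""
    def __init__(self, modality: str):
        self.modality = modality

    def analyze(self, event: Event) -> dict:
        # Placeholder scoring: flag payloads mentioning a known indicator.
        suspicious = "unauthorized" in event.payload.lower()
        return {"modality": self.modality, "threat": suspicious}

class Orchestrator:
    """Routes each incoming event to the agent specialized for its modality."""
    def __init__(self, agents: list[Agent]):
        self.agents = {a.modality: a for a in agents}

    def route(self, event: Event) -> dict:
        return self.agents[event.modality].analyze(event)

orchestrator = Orchestrator([Agent("log"), Agent("video"), Agent("audio")])
verdict = orchestrator.route(Event("log", "Unauthorized API call in CloudTrail"))
print(verdict)  # {'modality': 'log', 'threat': True}
```

In a production system, the orchestrator would also fuse verdicts across modalities (cross-modal reasoning) and trigger remediation actions, which is where most of the reported MTTR reduction would come from.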


Frontier AI's Impact on the Cybersecurity Landscape

Potter, Yujin, Guo, Wenbo, Wang, Zhun, Shi, Tianneng, Li, Hongwei, Zhang, Andy, Kelley, Patrick Gage, Thomas, Kurt, Song, Dawn

arXiv.org Artificial Intelligence

The impact of frontier AI (i.e., AI agents and foundation models) in cybersecurity is rapidly increasing. In this paper, we comprehensively analyze this trend through multiple aspects: quantitative benchmarks, qualitative literature review, empirical evaluation, and expert survey. Our analyses consistently show that AI's capabilities and applications in attacks have exceeded those on the defensive side. Our empirical evaluation of widely used agent systems on cybersecurity benchmarks highlights that current AI agents struggle with flexible workflow planning and using domain-specific tools for complex security analysis -- capabilities particularly critical for defensive applications. Our expert survey of AI and security researchers and practitioners indicates a prevailing view that AI will continue to benefit attackers over defenders, though the gap is expected to narrow over time. These results show the urgent need to evaluate and mitigate frontier AI's risks, steering it towards benefiting cyber defenses. Responding to this need, we provide concrete calls to action regarding: the construction of new cybersecurity benchmarks, the development of AI agents for defense, the design of provably secure AI agents, the improvement of pre-deployment security testing and transparency, and the strengthening of user-oriented education and defenses. Our paper summary and blog are available at https://rdi.berkeley.edu/frontier-ai-impact-on-cybersecurity/.


A Major Leak Spills a Chinese Hacking Contractor's Tools and Targets

WIRED

Plus: State-sponsored AI hacking is here, Google hosts a CBP face recognition app, and more of the week's top security news. The United States issued a seizure warrant to Starlink this week related to satellite internet infrastructure used in a scam compound in Myanmar. The action is part of a larger US law enforcement interagency initiative announced this week called the District of Columbia Scam Center Strike Force. Meanwhile, Google moved this week to sue 25 people that it alleges are behind a "staggering" and "relentless" scam text operation that uses a notorious phishing-as-a-service platform called Lighthouse. WIRED reported this week that the US Department of Homeland Security collected data on Chicago residents accused of gang ties to test if police files could feed an FBI watchlist--and then, crucially, kept the records for months in violation of domestic espionage rules.


Reimagining cybersecurity in the era of AI and quantum

MIT Technology Review

The threat landscape is being shaped by two seismic forces. To future-proof their organizations, security leaders must take a proactive stance with a zero trust approach. AI and quantum technologies are dramatically reconfiguring how cybersecurity functions, redefining the speed and scale with which digital defenders and their adversaries can operate. The weaponization of AI tools for cyberattacks is already proving a worthy opponent to current defenses. This includes using generative AI to create social engineering attacks at scale, churning out tens of thousands of tailored phishing emails in seconds, or accessing widely available voice cloning software capable of bypassing security defenses for as little as a few dollars. And now, agentic AI raises the stakes by introducing autonomous systems that can reason, act, and adapt like human adversaries.


Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation

Borah, Arnabh, Alam, Md Tanvirul, Rastogi, Nidhi

arXiv.org Artificial Intelligence

Security applications are increasingly relying on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge. Because security threats evolve rapidly, LLMs must not only recall historical incidents but also adapt to emerging vulnerabilities and attack patterns. Retrieval-Augmented Generation (RAG) has demonstrated effectiveness in general LLM applications, but its potential for cybersecurity remains underexplored. In this work, we introduce a RAG-based framework designed to contextualize cybersecurity data and enhance LLM accuracy in knowledge retention and temporal reasoning. Using external datasets and the Llama-3-8B-Instruct model, we evaluate baseline RAG, an optimized hybrid retrieval approach, and conduct a comparative analysis across multiple performance metrics. Our findings highlight the promise of hybrid retrieval in strengthening the adaptability and reliability of LLMs for cybersecurity tasks.
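The hybrid-retrieval idea can be illustrated with a toy sketch: blend a sparse (keyword-overlap) score with a dense-style (cosine similarity) score and return the best-matching document to prepend to the LLM prompt. The corpus, scoring functions, and blend weight below are hypothetical stand-ins, not the paper's actual retriever or datasets.

```python
import math
from collections import Counter

# Toy corpus of CVE-style snippets (illustrative only).
docs = [
    "CVE-2024-0001: buffer overflow in network driver allows remote code execution",
    "CVE-2024-0002: phishing campaign abuses OAuth tokens for account takeover",
    "CVE-2024-0003: SQL injection in login form exposes user credentials",
]

def keyword_score(query: str, doc: str) -> float:
    # Sparse signal: raw term overlap between query and document.
    q, d = set(query.lower().split()), Counter(doc.lower().split())
    return float(sum(d[t] for t in q))

def vector_score(query: str, doc: str) -> float:
    # Dense-retrieval stand-in: cosine similarity over bag-of-words vectors
    # (a real system would use learned embeddings instead).
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def hybrid_retrieve(query: str, alpha: float = 0.5) -> str:
    # Blend sparse and dense scores; the winner is given to the LLM as context.
    scored = [(alpha * keyword_score(query, d) + (1 - alpha) * vector_score(query, d), d)
              for d in docs]
    return max(scored)[1]

best = hybrid_retrieve("remote code execution overflow")
print(best)
```

The blend weight `alpha` is the knob a real pipeline would tune: sparse scores reward exact indicator matches (CVE IDs, tool names), while dense scores recover paraphrased descriptions of emerging threats.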


Toward Cybersecurity-Expert Small Language Models

Levi, Matan, Ohayon, Daniel, Blobstein, Ariel, Sagi, Ravid, Molloy, Ian, Allouche, Yair

arXiv.org Artificial Intelligence

Large language models (LLMs) are transforming everyday applications, yet deployment in cybersecurity lags due to a lack of high-quality, domain-specific models and training datasets. To address this gap, we present CyberPal 2.0, a family of cybersecurity-expert small language models (SLMs) ranging from 4B-20B parameters. To train CyberPal 2.0, we generate an enriched chain-of-thought cybersecurity instruction dataset built with our data enrichment and formatting pipeline, SecKnowledge 2.0, which integrates expert-in-the-loop steering of reasoning formats alongside LLM-driven multi-step grounding, yielding higher-fidelity, task-grounded reasoning traces for security tasks. Across diverse cybersecurity benchmarks, CyberPal 2.0 consistently outperforms its baselines and matches or surpasses various open and closed-source frontier models, while remaining a fraction of their size. On core cyber threat intelligence knowledge tasks, our models outperform almost all tested frontier models, ranking second only to Sec-Gemini v1. On core threat-investigation tasks, such as correlating vulnerabilities and bug tickets with weaknesses, our best 20B-parameter model outperforms GPT-4o, o1, o3-mini, and Sec-Gemini v1, ranking first, while our smallest 4B-parameter model ranks second.


SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence

Aghaei, Ehsan, Jain, Sarthak, Arun, Prashanth, Sambamoorthy, Arjun

arXiv.org Artificial Intelligence

Effective analysis of cybersecurity and threat intelligence data demands language models that can interpret specialized terminology, complex document structures, and the interdependence of natural language and source code. Encoder-only transformer architectures provide efficient and robust representations that support critical tasks such as semantic search, technical entity extraction, and semantic analysis, which are key to automated threat detection, incident triage, and vulnerability assessment. However, general-purpose language models often lack the domain-specific adaptation required for high precision. We present SecureBERT 2.0, an enhanced encoder-only language model purpose-built for cybersecurity applications. Leveraging the ModernBERT architecture, SecureBERT 2.0 introduces improved long-context modeling and hierarchical encoding, enabling effective processing of extended and heterogeneous documents, including threat reports and source code artifacts. Pretrained on a domain-specific corpus more than thirteen times larger than its predecessor, comprising over 13 billion text tokens and 53 million code tokens from diverse real-world sources, SecureBERT 2.0 achieves state-of-the-art performance on multiple cybersecurity benchmarks. Experimental results demonstrate substantial improvements in semantic search for threat intelligence, semantic analysis, cybersecurity-specific named entity recognition, and automated vulnerability detection in code within the cybersecurity domain.