Notepad Users, You May Have Been Hacked by China

WIRED

Suspected Chinese state-backed hackers hijacked the Notepad++ update infrastructure to deliver a backdoored version of the popular free source code editor and note-taking app for Windows. Infrastructure delivering updates for Notepad++, a widely used text editor for Windows, was compromised for six months by suspected China-state hackers who used their control to deliver backdoored versions of the app to select targets, developers said Monday. "I deeply apologize to all users affected by this hijacking," wrote the author of a post published to the official notepad-plus-plus.org website. The post said that the attack began last June with an "infrastructure-level compromise that allowed malicious actors to intercept and redirect update traffic destined for notepad-plus-plus.org." The attackers, whom multiple investigators tied to the Chinese government, then selectively redirected certain targeted users to malicious update servers, where they received backdoored updates.



AthenaBench: A Dynamic Benchmark for Evaluating LLMs in Cyber Threat Intelligence

Alam, Md Tanvirul, Bhusal, Dipkamal, Ahmad, Salman, Rastogi, Nidhi, Worth, Peter

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated strong capabilities in natural language reasoning, yet their application to Cyber Threat Intelligence (CTI) remains limited. CTI analysis involves distilling large volumes of unstructured reports into actionable knowledge, a process where LLMs could substantially reduce analyst workload. CTIBench introduced a comprehensive benchmark for evaluating LLMs across multiple CTI tasks. In this work, we extend CTIBench by developing AthenaBench, an enhanced benchmark that includes an improved dataset creation pipeline, duplicate removal, refined evaluation metrics, and a new task focused on risk mitigation strategies. We evaluate twelve LLMs, including state-of-the-art proprietary models such as GPT-5 and Gemini-2.5 Pro, alongside seven open-source models from the LLaMA and Qwen families. While proprietary LLMs achieve stronger results overall, their performance remains subpar on reasoning-intensive tasks, such as threat actor attribution and risk mitigation, with open-source models trailing even further behind. These findings highlight fundamental limitations in the reasoning capabilities of current LLMs and underscore the need for models explicitly tailored to CTI workflows and automation.
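
The abstract does not spell out the evaluation harness, but the task format it describes can be sketched. Below is a minimal, hypothetical illustration of scoring an LLM on a CTI task such as threat actor attribution; the dataset field names and the ask_model stub are assumptions for illustration, not AthenaBench's actual interface.

```python
# Hypothetical sketch, not AthenaBench's code: the dataset fields
# ("instruction", "report", "answer") and the ask_model callable are assumed.
from typing import Callable

def evaluate_task(examples: list[dict], ask_model: Callable[[str], str]) -> float:
    """Score one CTI task by exact match between model output and gold answer."""
    correct = 0
    for ex in examples:
        prompt = f"{ex['instruction']}\n\nReport:\n{ex['report']}\nAnswer:"
        if ask_model(prompt).strip().lower() == ex["answer"].strip().lower():
            correct += 1
    return correct / len(examples)

# Toy run with a stand-in model; a real harness would call an LLM API here.
examples = [{
    "instruction": "Name the threat actor most consistent with these TTPs.",
    "report": "Spearphishing (T1566), DLL side-loading (T1574.002).",
    "answer": "APT41",
}]
print(f"accuracy = {evaluate_task(examples, lambda p: 'APT41'):.2f}")
```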


Living Off the LLM: How LLMs Will Change Adversary Tactics

Oesch, Sean, Hutchins, Jack, Koch, Luke, Kurian, Kevin

arXiv.org Artificial Intelligence

In living off the land (LOTL) attacks, malicious actors use legitimate tools and processes already present on a system, often referred to as living off the land binaries or LOLBins, to avoid detection. In this paper, we explore how the on-device LLMs of the future will become a security concern as threat actors integrate LLMs into their LOTL attack pipelines, and ways the security community may mitigate this threat. These techniques allow threat actors to blend in with normal system activity, making their actions difficult to detect and potentially bypassing basic security measures. LOTL attacks leverage legitimate system tools, such as WMI and PowerShell, that are typically allowlisted, making them difficult to detect and attribute since they leave no malware signatures. These attacks give adversaries extended dwell time to execute sophisticated operations, while the lack of malicious signatures enables repeated use of the same tactics and complicates both prevention and incident response.
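
Because LOLBin abuse leaves no malware signature, defenders typically fall back on behavioral heuristics over process telemetry. A minimal sketch follows; the process names and suspicious-argument patterns are illustrative assumptions, not taken from the paper.

```python
# Hedged illustration: flagging allowlisted binaries invoked with LOTL-style
# arguments. Patterns here are toy examples, not a production detection ruleset.
import re

SUSPICIOUS = {
    "powershell.exe": [r"-enc(odedcommand)?\b", r"downloadstring", r"-nop\b"],
    "wmic.exe": [r"process\s+call\s+create"],
}

def flag_lotl(process: str, cmdline: str) -> bool:
    """Return True when an allowlisted binary shows LOTL-style arguments."""
    patterns = SUSPICIOUS.get(process.lower(), [])
    return any(re.search(p, cmdline, re.IGNORECASE) for p in patterns)

events = [
    ("powershell.exe", "powershell -nop -enc SQBFAFgA..."),
    ("powershell.exe", "powershell Get-ChildItem C:\\Reports"),
]
for proc, cmd in events:
    print(proc, "->", "FLAG" if flag_lotl(proc, cmd) else "ok")
```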



FRAME : Comprehensive Risk Assessment Framework for Adversarial Machine Learning Threats

Shapira, Avishag, Shigol, Simon, Shabtai, Asaf

arXiv.org Artificial Intelligence

The widespread adoption of machine learning (ML) systems has increased attention to their security and to the emergence of adversarial machine learning (AML) techniques that exploit fundamental vulnerabilities in ML systems, creating an urgent need for comprehensive risk assessment of ML-based systems. While traditional risk assessment frameworks evaluate conventional cybersecurity risks, they lack the ability to address the unique challenges posed by AML threats. Existing AML threat evaluation approaches focus primarily on technical attack robustness, overlooking crucial real-world factors such as deployment environments, system dependencies, and attack feasibility. Attempts at comprehensive AML risk assessment have been limited to domain-specific solutions, preventing application across diverse systems. Addressing these limitations, we present FRAME, the first comprehensive and automated framework for assessing AML risks across diverse ML-based systems. FRAME includes a novel risk assessment method that quantifies AML risks by systematically evaluating three key dimensions: the target system's deployment environment, the characteristics of diverse AML techniques, and empirical insights from prior research. FRAME incorporates a feasibility scoring mechanism and LLM-based customization for system-specific assessments. Additionally, we developed a comprehensive structured dataset of AML attacks, enabling context-aware risk assessment. From an engineering application perspective, FRAME delivers actionable results designed for direct use by system owners with only technical knowledge of their systems, without expertise in AML. We validated it across six diverse real-world applications. Our evaluation demonstrated exceptional accuracy and strong alignment with analyses by AML experts. FRAME enables organizations to prioritize AML risks, supporting secure AI deployment in real-world environments.
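
The abstract names three scoring dimensions plus a feasibility mechanism but gives no formula. One plausible reading, sketched below with assumed weights and 0-to-1 scales; FRAME's actual scoring method may differ.

```python
# Illustrative sketch only, not FRAME's published method: one way to combine
# the three dimensions the abstract names (deployment environment, attack
# characteristics, empirical evidence) into a feasibility-weighted risk score.
from dataclasses import dataclass

@dataclass
class AMLAttackProfile:
    name: str
    environment_exposure: float  # 0-1: how reachable the attack surface is
    attack_severity: float       # 0-1: impact if the attack succeeds
    empirical_success: float     # 0-1: success rate reported in prior research
    feasibility: float           # 0-1: practicality in the target deployment

def risk_score(p: AMLAttackProfile) -> float:
    """Average the three evidence dimensions, then discount by feasibility."""
    base = (p.environment_exposure + p.attack_severity + p.empirical_success) / 3
    return base * p.feasibility

attacks = [
    AMLAttackProfile("model evasion (adversarial examples)", 0.8, 0.7, 0.9, 0.6),
    AMLAttackProfile("training-data poisoning", 0.3, 0.9, 0.5, 0.2),
]
for a in sorted(attacks, key=risk_score, reverse=True):
    print(f"{a.name}: risk = {risk_score(a):.2f}")
```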


On Technique Identification and Threat-Actor Attribution using LLMs and Embedding Models

Guru, Kyla, Moss, Robert J., Kochenderfer, Mykel J.

arXiv.org Artificial Intelligence

Attribution of cyber-attacks remains a complex but critical challenge for cyber defenders. Currently, manual extraction of behavioral indicators from dense forensic documentation causes significant attribution delays, especially following major incidents at the international scale. This research evaluates large language models (LLMs) for cyber-attack attribution based on behavioral indicators extracted from forensic documentation. We test OpenAI's GPT-4 and text-embedding-3-large for identifying threat actors' tactics, techniques, and procedures (TTPs) by comparing LLM-generated TTPs against human-generated data from MITRE ATT&CK Groups. Our framework then identifies TTPs from text using vector-embedding search and builds threat-actor profiles on which a machine learning model is trained to attribute new attacks. Key contributions include: (1) assessing off-the-shelf LLMs for TTP extraction and attribution, and (2) developing an end-to-end pipeline from raw CTI documents to threat-actor prediction. This research finds that standard LLMs generate TTP datasets with noise, resulting in low similarity to human-generated datasets. However, the generated TTPs are similar in frequency to those within the existing MITRE datasets. Additionally, although these TTPs differ from the human-generated datasets, our work demonstrates that they still prove useful for training a model that performs above baseline on attribution. Project code and files are available at: https://github.com/kylag/ttp_attribution.
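
As a rough illustration of the embedding-search step, the sketch below matches report sentences to MITRE ATT&CK technique descriptions by cosine similarity. The embed() stub stands in for a real embedding model such as text-embedding-3-large, so its similarities are semantically meaningless; the technique entries and sentence are toy data, not the paper's dataset.

```python
# Minimal sketch of embedding-based TTP identification: nearest-neighbor
# matching of report sentences against technique descriptions. The embed()
# stub produces deterministic pseudo-embeddings so the sketch runs offline;
# a real pipeline would call an embedding model here.
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    return np.stack([np.random.default_rng(sum(map(ord, t))).normal(size=64)
                     for t in texts])

def cosine_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

techniques = {
    "T1566": "Phishing: adversaries send spearphishing messages to gain access.",
    "T1059.001": "Command and Scripting Interpreter: PowerShell abuse.",
}
report_sentences = ["The intrusion began with a spearphishing email."]

sims = cosine_matrix(embed(report_sentences), embed(list(techniques.values())))
for i, sent in enumerate(report_sentences):
    best = int(np.argmax(sims[i]))
    print(f"{sent!r} -> {list(techniques)[best]} (sim={sims[i, best]:.2f})")
```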


Fox News AI Newsletter: Meet the AI real estate agent making millions

FOX News

Kevin O'Leary joined The Brian Kilmeade Show to discuss working with Frank McCourt to buy TikTok and the dangers of DeepSeek. ALWAYS CLOSING: Artificial intelligence is taking the world by storm, and the real estate industry is no exception. Israeli startup eSelf AI is making it possible for customers to get their questions answered whether it's 3:00 in the afternoon or 3:00 in the morning. ESSENTIAL AI: NEAR A.I. co-founder and CEO Illia Polosukhin says A.I. is starting to become a fundamental part of people's digital lives on 'The Claman Countdown.' VOICE RECOGNITION CONTROVERSY: When one says "racist" into an iPhone, the voice-to-text feature initially typed "Trump" before quickly correcting it to "racist." MOVING TO A FAKE CITY: There is a futuristic city in Japan, designed and built from the ground up to test the latest technologies.


China, Iran-based threat actors have found new ways to use American AI models for covert influence: Report

FOX News

Threat actors, some likely based in China and Iran, are formulating new ways to hijack and utilize American artificial intelligence (AI) models for malicious intent, including covert influence operations, according to a new report from OpenAI. The February report includes two disruptions involving threat actors that appear to have originated from China. According to the report, these actors have used, or at least attempted to use, models built by OpenAI and Meta. In one example, OpenAI banned a ChatGPT account that generated comments critical of Chinese dissident Cai Xia. The comments were posted on social media by accounts that claimed to be people based in India and the U.S.


LLM Cyber Evaluations Don't Capture Real-World Risk

Lukošiūtė, Kamilė, Swanda, Adam

arXiv.org Artificial Intelligence

Large language models (LLMs) are demonstrating increasing prowess in cybersecurity applications, creating inherent risks alongside their potential for strengthening defenses. In this position paper, we argue that current efforts to evaluate the risks posed by these capabilities are misaligned with the goal of understanding real-world impact. Evaluating LLM cybersecurity risk requires more than just measuring model capabilities; it demands a comprehensive risk assessment that incorporates analysis of threat actor adoption behavior and potential for impact. We propose a risk assessment framework for LLM cyber capabilities and apply it to a case study of language models used as cybersecurity assistants. Our evaluation of frontier models reveals high compliance rates but moderate accuracy on realistic cyber assistance tasks. However, our framework suggests that this particular use case presents only moderate risk due to limited operational advantages and impact potential. Based on these findings, we recommend several improvements to align research priorities with real-world impact assessment, including closer academia-industry collaboration, more realistic modeling of attacker behavior, and the inclusion of economic metrics in evaluations. This work represents an important step toward more effective assessment and mitigation of LLM-enabled cybersecurity risks.