Generative AI
STACK: Adversarial Attacks on LLM Safeguard Pipelines
McKenzie, Ian R., Hollinsworth, Oskar J., Tseng, Tom, Davies, Xander, Casper, Stephen, Tucker, Aaron D., Kirk, Robert, Gleave, Adam
Frontier AI developers are relying on layers of safeguards to protect against catastrophic misuse of AI systems. Anthropic guards their latest Claude 4 Opus model using one such defense pipeline, and other frontier developers including Google DeepMind and OpenAI pledge to soon deploy similar defenses. However, the security of such pipelines is unclear, with limited prior work evaluating or attacking these pipelines. We address this gap by developing and red-teaming an open-source defense pipeline. First, we find that a novel few-shot-prompted input and output classifier outperforms state-of-the-art open-weight safeguard model ShieldGemma across three attacks and two datasets, reducing the attack success rate (ASR) to 0% on the catastrophic misuse dataset ClearHarm. Second, we introduce a STaged AttaCK (STACK) procedure that achieves 71% ASR on ClearHarm in a black-box attack against the few-shot-prompted classifier pipeline. Finally, we also evaluate STACK in a transfer setting, achieving 33% ASR, providing initial evidence that it is feasible to design attacks with no access to the target pipeline. We conclude by suggesting specific mitigations that developers could use to thwart staged attacks.
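The layered defense and the ASR metric described above can be illustrated with a minimal sketch. It assumes a pipeline of an input classifier, a model, and an output classifier. All names here (`SafeguardPipeline`, `attack_success_rate`, the toy keyword filters) are hypothetical and are not taken from the paper's code; the real classifiers are few-shot-prompted LLMs, not keyword checks.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical interfaces: a classifier returns True when it flags text as harmful.
Classifier = Callable[[str], bool]
Model = Callable[[str], str]

REFUSAL = "[blocked]"


@dataclass
class SafeguardPipeline:
    input_filter: Classifier
    model: Model
    output_filter: Classifier

    def respond(self, prompt: str) -> str:
        # Stage 1: reject the prompt before it reaches the model.
        if self.input_filter(prompt):
            return REFUSAL
        answer = self.model(prompt)
        # Stage 2: reject the model's answer before it reaches the user.
        if self.output_filter(answer):
            return REFUSAL
        return answer


def attack_success_rate(
    pipeline: SafeguardPipeline,
    attack_prompts: Sequence[str],
    is_harmful: Callable[[str], bool],
) -> float:
    """Fraction of attack prompts whose final response is judged harmful."""
    if not attack_prompts:
        return 0.0
    successes = sum(is_harmful(pipeline.respond(p)) for p in attack_prompts)
    return successes / len(attack_prompts)


# Toy stand-ins (assumptions for illustration only).
pipeline = SafeguardPipeline(
    input_filter=lambda p: "forbidden" in p.lower(),
    model=lambda p: p.upper(),
    output_filter=lambda a: "SECRET" in a,
)
prompts = ["forbidden payload", "secret payload", "payload", "hello"]
asr = attack_success_rate(pipeline, prompts, lambda a: "PAYLOAD" in a)
print(f"ASR = {asr:.2f}")
```

In this toy setup an attack counts as successful only if it gets past both filters, which is why a staged attack would need to defeat each component in turn.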
Meta Swears This Time Is Different
Mark Zuckerberg was supposed to win the AI race. Eons before ChatGPT and AlphaGo, when OpenAI did not exist and Google had not yet purchased DeepMind, there was FAIR: Facebook AI Research. In 2013, Facebook tapped one of the "godfathers" of AI, the legendary computer scientist Yann LeCun, to lead its new division. That year, Zuckerberg personally traveled to one of the world's most prestigious AI conferences to announce FAIR and recruit top scientists to the lab. FAIR has since made a number of significant contributions to AI research, including in the field of computer vision.
OpenAI might start watermarking images generated by ChatGPT
Android Authority has been digging around in the files of the latest ChatGPT app (beta version 1.2025.196). When generating an image with ChatGPT, you will soon be able to select "Save without watermark" in the menu behind the three dots in the top-right corner of the app. Obviously, this feature would be rather useless if images weren't going to be watermarked. Will all users be able to save images without watermarks? Android Authority speculates that the feature may sit behind a paywall and only be available to paid ChatGPT subscribers.
Mizuho partners with SoftBank on AI to boost efficiency
Mizuho Financial Group said Friday that it has signed a strategic partnership agreement with SoftBank to introduce cutting-edge artificial intelligence to streamline operations and improve customer service. Mizuho will be the first in the financial sector to introduce "Cristal intelligence," which is being developed jointly by SoftBank and OpenAI, the U.S. developer of the ChatGPT generative AI tool. Mizuho expects the latest AI technology, which optimizes corporate tasks, to help the company increase revenue and cut costs, resulting in positive effects totaling ¥300 billion by fiscal 2030. Using the technology, Mizuho plans to analyze transaction data and market trends to quickly provide corporate customers with management advice. The financial group also expects the technology to help boost productivity in its sales activities more than twofold and reduce low-value operations by up to 50%.
Netflix uses generative AI in one of its shows for first time
Netflix has used artificial intelligence in one of its TV shows for the first time, in a move the streaming company's boss said would make films and programmes cheaper and of better quality. Ted Sarandos, a co-chief executive of Netflix, said the Argentinian science fiction series El Eternauta (The Eternaut) was the first it had made that involved using generative AI footage. "We remain convinced that AI represents an incredible opportunity to help creators make films and series better, not just cheaper," he told analysts on Thursday after Netflix reported its second-quarter results. He said the series, which follows survivors of a rapid and devastating toxic snowfall, involved Netflix and visual effects (VFX) artists using AI to show a building collapsing in Buenos Aires. "Using AI-powered tools, they were able to achieve an amazing result with remarkable speed and, in fact, that VFX sequence was completed 10 times faster than it could have been completed with traditional VFX tools and workflows," he said.
Netflix boss says AI effects used in show for first time
Netflix says it has used visual effects created by generative artificial intelligence (AI) on screen for the first time in one of its original TV shows. The streaming giant's co-CEO Ted Sarandos said AI, which produces videos and images based on prompts, was used to create a scene of a building collapsing in the Argentine science fiction show The Eternaut. He praised the technology as an "incredible opportunity to help creators make films and series better, not just cheaper." The use of generative AI is controversial in the entertainment industry and has sparked fears that it will replace the work of humans.
Fairness Is Not Enough: Auditing Competence and Intersectional Bias in AI-powered Resume Screening
The increasing use of generative AI for resume screening is predicated on the assumption that it offers an unbiased alternative to biased human decision-making. However, this belief fails to address a critical question: are these AI systems fundamentally competent at the evaluative tasks they are meant to perform? This study investigates the question of competence through a two-part audit of eight major AI platforms. Experiment 1 confirmed complex, contextual racial and gender biases, with some models penalizing candidates merely for the presence of demographic signals. Experiment 2, which evaluated core competence, provided a critical insight: some models that appeared unbiased were, in fact, incapable of performing a substantive evaluation, relying instead on superficial keyword matching. This paper introduces the "Illusion of Neutrality" to describe this phenomenon, where an apparent lack of bias is merely a symptom of a model's inability to make meaningful judgments. This study recommends that organizations and regulators adopt a dual-validation framework, auditing AI hiring tools for both demographic bias and demonstrable competence to ensure they are both equitable and effective.
Latent Diffusion Model Based Denoising Receiver for 6G Semantic Communication: From Stochastic Differential Theory to Application
Wang, Xiucheng, Jia, Honggang, Cheng, Nan
In this paper, a novel semantic communication framework empowered by generative artificial intelligence (GAI) is proposed to enhance robustness against both channel noise and shifts in the distribution of transmitted data. A theoretical foundation is established using stochastic differential equations (SDEs), from which a closed-form mapping between any signal-to-noise ratio (SNR) and the optimal denoising timestep is derived. Moreover, to address distribution mismatch, a mathematical scaling method is introduced to align received semantic features with the training distribution of the GAI. Built on this theoretical foundation, a latent diffusion model (LDM)-based semantic communication framework is proposed that combines a variational autoencoder for semantic feature extraction with a pretrained diffusion model for denoising. The proposed system is a training-free framework that supports zero-shot generalization and achieves superior performance under low-SNR and out-of-distribution conditions, offering a scalable and robust solution for future 6G semantic communication systems. Experimental results demonstrate that the proposed semantic communication framework achieves state-of-the-art performance in both pixel-level accuracy and semantic perceptual quality, consistently outperforming baselines across a wide range of SNRs and data distributions without any fine-tuning or post-training.
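The abstract does not give the closed-form SNR-to-timestep mapping, so the following is only an illustrative sketch of the general idea, not the paper's derivation. It assumes a standard variance-preserving (DDPM-style) forward process with a linear beta schedule, unit-power semantic features, and an additive white Gaussian noise channel. Under those assumptions a noisy feature with linear SNR s matches the diffusion state whose cumulative alpha equals s / (1 + s), after rescaling the received feature by the square root of that value. The function names and schedule parameters are hypothetical.

```python
import math
from typing import List, Tuple


def linear_beta_schedule(num_steps: int = 1000,
                         beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> List[float]:
    """Assumed DDPM-style linear noise schedule (not taken from the paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def snr_db_to_timestep(snr_db: float,
                       alpha_bars: List[float]) -> Tuple[int, float]:
    """Return (timestep index, scale factor) matching a channel SNR.

    For x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps the SNR is a / (1 - a),
    so a channel with linear SNR s corresponds to a = s / (1 + s).
    """
    snr = 10.0 ** (snr_db / 10.0)
    target = snr / (1.0 + snr)
    # Pick the discrete timestep whose alpha_bar is closest to the target.
    t = min(range(len(alpha_bars)), key=lambda i: abs(alpha_bars[i] - target))
    # Multiplying the received feature by this factor gives it the same
    # signal and noise weights as the diffusion state at that timestep.
    scale = math.sqrt(target)
    return t, scale


alpha_bars = cumulative_alphas(linear_beta_schedule())
for snr_db in (0.0, 10.0, 20.0):
    t, scale = snr_db_to_timestep(snr_db, alpha_bars)
    print(f"SNR {snr_db:>4.1f} dB -> start denoising at t={t}, scale={scale:.3f}")
```

A noisier channel (lower SNR) maps to a later timestep, so the pretrained denoiser runs more reverse steps; no retraining is involved, which is consistent with the training-free claim above.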
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection
Zhao, Hongyang, Liang, Tianyu, Davari, Sina, Kim, Daeho
While recent advancements in deep neural networks (DNNs) have substantially enhanced visual AI's capabilities, the challenge of inadequate data diversity and volume remains, particularly in the construction domain. This study presents a novel image synthesis methodology tailored for construction worker detection, leveraging the generative-AI platform Midjourney. The approach entails generating a collection of 12,000 synthetic images by formulating 3,000 different prompts, with an emphasis on image realism and diversity. These images, after manual labeling, serve as a dataset for DNN training. Evaluation on a real construction image dataset yielded promising results, with the model attaining average precisions (APs) of 0.937 and 0.642 at intersection-over-union (IoU) thresholds of 0.5 and 0.5 to 0.95, respectively. Notably, the model demonstrated near-perfect performance on the synthetic dataset, achieving APs of 0.994 and 0.919 at the two mentioned thresholds. These findings reveal both the potential and the weakness of generative AI in addressing DNN training data scarcity.
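The IoU thresholds behind the reported AP figures can be illustrated with a short sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) and reads "0.5 to 0.95" as the common COCO-style sweep in steps of 0.05; the abstract does not state the step size, so that is an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style sweep: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS: List[float] = [t / 100 for t in range(50, 100, 5)]


def thresholds_met(pred: Box, truth: Box) -> List[float]:
    """IoU thresholds at which this prediction counts as a true positive."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


pred = (0.0, 0.0, 8.0, 8.0)
truth = (0.0, 0.0, 10.0, 10.0)
print(f"IoU = {iou(pred, truth):.2f}")
print("counts as a hit at thresholds:", thresholds_met(pred, truth))
```

A detection that is loosely localized passes the 0.5 threshold but fails the stricter ones, which is why AP averaged over 0.5 to 0.95 (0.642 on real images) sits well below AP at 0.5 alone (0.937).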
OpenAI launches personal assistant capable of controlling files and web browsers
Users of ChatGPT will be able to ask an AI agent to find restaurant reservations, go shopping for them and even draw up lists of candidates for job vacancies, as the chatbot gains the powers of a personal assistant from Thursday. ChatGPT agent, launched by OpenAI everywhere apart from the EU, not only "thinks" but also acts, the US company said. The agent combines the powers of AI research tools with the ability to take control of web browsers, computer files and software such as spreadsheets and slide decks. It follows the launch of similar "agents" by Google and Anthropic as interest grows in AI models that can handle computer-based tasks by judging which software is best to use and toggling between systems to autonomously complete assignments like drafting travel itineraries or carrying out work research. "The hope is that agents are able to bring some real utility to users – to actually do things for them rather than just outputting polished text and sounding impressive," said Niamh Burns, senior media analyst at Enders Analysis.