Generative AI
The 2025 OpenAI Preparedness Framework does not guarantee any AI risk mitigation practices: a proof-of-concept for affordance analyses of AI safety policies
Coggins, Sam, Saeri, Alexander K., Daniell, Katherine A., Ruster, Lorenn P., Liu, Jessie, Davis, Jenny L.
Prominent AI companies are producing 'safety frameworks' as a type of voluntary self-governance. These statements purport to establish risk thresholds and safety procedures for the development and deployment of highly capable AI. Understanding which AI risks are covered and what actions are allowed, refused, demanded, encouraged, or discouraged by these statements is vital for assessing how these frameworks actually govern AI development and deployment. We draw on affordance theory to analyse the OpenAI 'Preparedness Framework Version 2' (April 2025) using the Mechanisms & Conditions model of affordances and the MIT AI Risk Repository. We find that this safety policy requests evaluation of a small minority of AI risks, encourages deployment of systems with 'Medium' capabilities for unintentionally enabling 'severe harm' (which OpenAI defines as >1000 deaths or >$100B in damages), and allows OpenAI's CEO to deploy even more dangerous capabilities. These findings suggest that effective mitigation of AI risks requires more robust governance interventions beyond current industry self-regulation. Our affordance analysis provides a replicable method for evaluating what safety frameworks actually permit versus what they claim.
Safety-Aligned Weights Are Not Enough: Refusal-Teacher-Guided Finetuning Enhances Safety and Downstream Performance under Harmful Finetuning Attacks
Ham, Seokil, Choi, Yubin, Yang, Yujin, Cho, Seungju, Kim, Younghun, Kim, Changick
Recently, major AI providers such as Google and OpenAI have introduced Finetuning-as-a-Service (FaaS), which allows users to customize Large Language Models (LLMs) using their own data. However, this service is vulnerable to safety degradation when user data includes harmful prompts, a threat known as harmful finetuning attacks. Prior works attempt to mitigate this issue by first constructing a safety-aligned model and then finetuning the model on user data. However, we observe that the safety-aligned weights provide weak initialization for downstream task learning, leading to suboptimal safety-alignment and downstream task performance. To address this, we propose a Refusal-Teacher (Ref-Teacher)-guided finetuning framework. Instead of finetuning a safety-aligned model on user data, our approach directly finetunes the base model under the guidance of a safety-aligned Ref-Teacher, which filters harmful prompts from user data and distills safety-alignment knowledge into the base model. Extensive experiments demonstrate that our Ref-Teacher-guided finetuning strategy effectively minimizes harmful outputs and enhances finetuning accuracy for user-specific tasks, offering a practical solution for secure and reliable deployment of LLMs in FaaS. Recent advancements in Large Language Models (LLMs) (Touvron et al. (2023); Jiang et al. (2023); Team et al. (2024); Team (2024); Hurst et al. (2024); Guo et al. (2025); Research et al. (2025)) have achieved remarkable performance across a wide range of natural language processing tasks. LLMs are typically pretrained on massive and diverse corpora, resulting in strong generalization ability and broad applicability across domains. To further facilitate LLMs for individual and domain-specific purposes, major AI service providers such as Google and OpenAI offer not only access to pretrained LLMs but also Finetuning-as-a-Service (FaaS).
This service enables users to upload custom datasets and adapt LLMs to more specific tasks and domains depending on their unique requirements. However, FaaS must prevent the malicious use of LLMs through safety-alignment, even when users attempt to jailbreak the models via customization. These types of attacks, which inject harmful prompts into user data for finetuning, are called harmful finetuning attacks. Several studies (Qi et al. (2023); Lermen et al. (2023); Rosati et al. (2024); Huang et al. (2024b;c;d); Li et al. (2025); Huang et al. (2025)) have shown that finetuning on user data containing harmful content compromises the safety-alignment, despite the LLMs being safety-aligned before finetuning.
SynthID-Image: Image watermarking at internet scale
Gowal, Sven, Bunel, Rudy, Stimberg, Florian, Stutz, David, Ortiz-Jimenez, Guillermo, Kouridi, Christina, Vecerik, Mel, Hayes, Jamie, Rebuffi, Sylvestre-Alvise, Bernard, Paul, Gamble, Chris, Horváth, Miklós Z., Kaczmarczyck, Fabian, Kaskasoli, Alex, Petrov, Aleksandar, Shumailov, Ilia, Thotakuri, Meghana, Wiles, Olivia, Yung, Jessica, Ahmed, Zahra, Martin, Victor, Rosen, Simon, Savčak, Christopher, Senoner, Armin, Vyas, Nidhi, Kohli, Pushmeet
We introduce SynthID-Image, a deep learning-based system for invisibly watermarking AI-generated imagery. This paper documents the technical desiderata, threat models, and practical challenges of deploying such a system at internet scale, addressing key requirements of effectiveness, fidelity, robustness, and security. SynthID-Image has been used to watermark over ten billion images and video frames across Google's services, and its corresponding verification service is available to trusted testers. For completeness, we present an experimental evaluation of an external model variant, SynthID-O, which is available through partnerships. We benchmark SynthID-O against other post-hoc watermarking methods from the literature, demonstrating state-of-the-art performance in both visual quality and robustness to common image perturbations. While this work centers on visual media, the conclusions on deployment, constraints, and threat modeling generalize to other modalities, including audio. This paper provides comprehensive documentation for the large-scale deployment of deep learning-based media provenance systems.
MCMC: Bridging Rendering, Optimization and Generative AI
Generative artificial intelligence (AI) has made unprecedented advances in vision-language models over the past two years. During the generative process, new samples (images) are generated from an unknown high-dimensional distribution. Markov Chain Monte Carlo (MCMC) methods are particularly effective in drawing samples from such complex, high-dimensional distributions. This makes MCMC methods an integral component of models such as energy-based models (EBMs), ensuring accurate sample generation. Gradient-based optimization is at the core of modern generative models. The update step during the optimization forms a Markov chain where the new update depends only on the current state. This allows exploration of the parameter space in a memoryless manner, thus combining the benefits of gradient-based optimization and MCMC sampling. MCMC methods play an equally important role in physically based rendering, where complex light paths are otherwise quite challenging to sample with simple importance sampling techniques. A lot of research is dedicated to bringing physical realism to samples (images) generated from diffusion-based generative models in a data-driven manner; however, a unified framework connecting these techniques is still missing. In this course, we take the first steps toward understanding each of these components and exploring how MCMC could potentially serve as a bridge, linking these closely related areas of research. Our course aims to provide the necessary theoretical and practical tools to guide students, researchers and practitioners towards the common goal of generative physically based rendering. All Jupyter notebooks with demonstrations associated with this tutorial can be found on the project webpage: https://sinbag.github.io/mcmc/
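To make the Markov-chain idea in the abstract concrete, here is a minimal sketch of a random-walk Metropolis-Hastings sampler. It is not taken from the course notebooks: the two-mode target density, the function names `log_target` and `metropolis_hastings`, and the step size are all illustrative assumptions. Each new state depends only on the current one, which is the memoryless property the abstract describes.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of an illustrative two-mode mixture (modes near -2 and +2)."""
    a = -0.5 * (x - 2.0) ** 2
    b = -0.5 * (x + 2.0) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis_hastings(log_p, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, lp = x0, log_p(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        # Propose a move that depends only on the current state.
        proposal = x + rng.gauss(0.0, step_size)
        lp_prop = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, lp_prop - lp))
        if rng.random() < accept_prob:
            x, lp = proposal, lp_prop
            accepted += 1
        samples.append(x)  # a rejected move repeats the current state
    return samples, accepted / n_steps


if __name__ == "__main__":
    chain, rate = metropolis_hastings(log_target, x0=0.0, n_steps=20000)
    print(f"acceptance rate: {rate:.2f}")
    print(f"sample mean: {sum(chain) / len(chain):.2f}")
```

Only the unnormalized density is needed, which is why this family of samplers suits energy-based models whose normalizing constant is intractable. The step size trades off acceptance rate against how quickly the chain moves between modes.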
Evolutionary Computation as Natural Generative AI
Shi, Yaxin, Gupta, Abhishek, Wu, Ying, Wong, Melvin, Tsang, Ivor, Rios, Thiago, Menzel, Stefan, Sendhoff, Bernhard, Hou, Yaqing, Ong, Yew-Soon
Generative AI (GenAI) has achieved remarkable success across a range of domains, but its capabilities remain constrained to statistical models of finite training sets and learning based on local gradient signals. This often results in artifacts that are more derivative than genuinely generative. In contrast, Evolutionary Computation (EC) offers a search-driven pathway to greater diversity and creativity, expanding generative capabilities by exploring uncharted solution spaces beyond the limits of available data. This work establishes a fundamental connection between EC and GenAI, redefining EC as Natural Generative AI (NatGenAI) -- a generative paradigm governed by exploratory search under natural selection. We demonstrate that classical EC with parent-centric operators mirrors conventional GenAI, while disruptive operators enable structured evolutionary leaps, often within just a few generations, to generate out-of-distribution artifacts. Moreover, the methods of evolutionary multitasking provide an unparalleled means of integrating disruptive EC (with cross-domain recombination of evolved features) and moderated selection mechanisms (allowing novel solutions to survive), thereby fostering sustained innovation. By reframing EC as NatGenAI, we emphasize structured disruption and selection pressure moderation as essential drivers of creativity. This perspective extends the generative paradigm beyond conventional boundaries and positions EC as crucial to advancing exploratory design, innovation, scientific discovery, and open-ended generation in the GenAI era.
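The contrast between parent-centric and disruptive variation can be illustrated with a toy sketch. This is not the authors' method: it uses a simple (mu + lambda) evolution strategy in one dimension, and stands in for "disruptive operators" with nothing more than a larger mutation step. The fitness function, the `evolve` helper, and all parameter values are hypothetical.

```python
import random


def evolve(fitness, init_pop, n_gen, sigma, n_offspring=20, seed=0):
    """(mu + lambda) evolution strategy: parents and offspring compete for survival."""
    rng = random.Random(seed)
    pop = list(init_pop)
    mu = len(pop)
    best_history = []
    for _ in range(n_gen):
        # Each offspring is a Gaussian perturbation of a randomly chosen parent.
        offspring = [rng.choice(pop) + rng.gauss(0.0, sigma) for _ in range(n_offspring)]
        # Elitist selection: keep the mu fittest of parents plus offspring.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:mu]
        best_history.append(fitness(pop[0]))
    return pop, best_history


def fitness(x):
    """Illustrative objective whose optimum (x = 10) lies far outside the initial population."""
    return -(x - 10.0) ** 2


if __name__ == "__main__":
    init = [-1.0, -0.5, 0.0, 0.5, 1.0]  # the "training distribution" in this toy setting
    near_pop, _ = evolve(fitness, init, n_gen=10, sigma=0.05)  # parent-centric: small steps
    far_pop, _ = evolve(fitness, init, n_gen=10, sigma=2.0)    # stand-in for disruptive steps
    print("parent-centric best x:", round(near_pop[0], 2))
    print("large-step best x:", round(far_pop[0], 2))
```

In this toy setting, small-step variation keeps offspring close to the initial population, while large-step variation reaches regions the initial population never covered within a few generations. The cross-domain recombination and moderated selection described in the abstract are not modeled here.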
'It's going to be really bad': Fears over AI bubble bursting grow in Silicon Valley
At OpenAI's DevDay this week, OpenAI boss Sam Altman did what American tech bosses rarely do these days: he actually answered questions from reporters. "I know it's tempting to write the bubble story," Mr Altman told me as he sat flanked by his top lieutenants. "In fact, there are many parts of AI that I think are kind of bubbly right now." In Silicon Valley, the debate over whether AI companies are overvalued has taken on a new urgency. Sceptics are privately - and some now publicly - asking whether the rapid rise in the value of AI tech companies may be, at least in part, the result of what they call financial engineering.
WIRED Roundup: Are We In An AI Bubble?
In this episode, we talk about what you need to know this week, from one Antifa author's journey to flee the US to a recent OpenAI announcement that rippled across the market. In today's episode, Zoë Schiffer is joined by senior politics editor Leah Feiger to run through five stories that you need to know about this week--from the Antifa professor who's fleeing to Europe for safety, to how some chatbots are manipulating users to avoid saying goodbye. Then, Zoë and Leah break down why a recent announcement from OpenAI rattled the markets and answer the question everyone is wondering--are we in an AI bubble? Write to us at uncannyvalley@wired.com. Today on the show, we're bringing you five stories that you need to know about this week, including why a seemingly minor announcement from OpenAI ended up rippling across several companies and what it says about the current state of the technology industry. I'm joined today by our senior politics editor, Leah Feiger.
Reports of the Association for the Advancement of Artificial Intelligence's 2025 Spring Symposium Series
The Association for the Advancement of Artificial Intelligence's 2025 Spring Symposium Series was held in Burlingame, California, March 31-April 2, 2025. There were eight symposia in the spring program: AI for Engineering and Scientific Discoveries, AI for Health Symposium: Leveraging Artificial Intelligence to Revolutionize Healthcare, Current and Future Varieties of Human-AI Collaboration, GenAI@Edge: Empowering Generative AI at the Edge, Human-Compatible AI for Well-being: Harnessing Potential of GenAI for AI-Powered Science, Machine Learning and Knowledge Engineering for Trustworthy Multimodal and Generative AI, Symposium on Child-AI Interaction in the Era of Foundation Models, Towards Agentic AI for Science: Hypothesis Generation, Comprehension, Quantification, and Validation. This report contains summaries of the workshops, which were submitted by some, but not all, of the workshop chairs. This symposium aims to advance and diversify the application of AI in emerging engineering and scientific discovery domains. Inspired by progress in large language models, generative AI, and AI-assisted scientific computing, we seek to foster new collaborations between industry and academia to tackle challenging problems in materials, manufacturing, and life sciences. We also plan to explore new directions in human-machine interaction for accelerating knowledge discovery and address related ethical considerations. Through invited speakers, panel discussions, and contributions from researchers with cross-disciplinary expertise, we hope to cultivate partnerships that drive transformative advances in both AI and scientific research. No formal report was filed by the organizers for this symposium.