Generative AI
Crimson Desert developer apologizes and promises to replace AI-generated art
Pearl Abyss, the game's developer, issued a lengthy apology on X and detailed its corrective actions. The developer behind the open-world RPG Crimson Desert has issued an official apology after players discovered several instances of AI-generated art in the game. Pearl Abyss posted on X that it released the game with some 2D visual props that were made with experimental AI generative tools and forgot to replace them before launch. "We would like to address questions regarding the use of AI in Crimson Desert. During development, some 2D visual props were created as part of early-stage iteration using experimental AI generative tools."
Query-Based Adversarial Prompt Generation
Recent work has shown it is possible to construct adversarial examples that cause aligned language models to emit harmful strings or perform harmful behavior. Existing attacks work either in the white-box setting (with full access to the model weights), or through transferability: the phenomenon that adversarial examples crafted on one model often remain effective on other models. We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. We validate our attack on GPT-3.5 and OpenAI's safety classifier; we can cause GPT-3.5 to emit harmful strings that current transfer attacks fail at, and we can evade the OpenAI and Llama Guard safety classifiers with nearly 100% probability.
Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability
Co-crystallization is an accessible way to control physicochemical characteristics of organic crystals, which finds many biomedical applications. In this work, we present Generative Method for Co-crystal Design (GEMCODE), a novel pipeline for automated co-crystal screening based on the hybridization of deep generative models and evolutionary optimization for broader exploration of the target chemical space. GEMCODE enables fast co-crystal design with target tabletability profiles, which is crucial for the development of pharmaceuticals. With a series of experimental studies highlighting validation and discovery cases, we show that GEMCODE is effective even under realistic computational constraints. Furthermore, we explore the potential of language models in generating co-crystals. Finally, we present numerous previously unknown co-crystals predicted by GEMCODE and discuss its potential in accelerating drug development.
Pricing and Competition for Generative AI
Compared to classical machine learning (ML) models, generative models offer a new usage paradigm where (i) a single model can be used for many different tasks out-of-the-box; (ii) users interact with this model over a series of natural language prompts; and (iii) the model is ideally evaluated on binary user satisfaction with respect to model outputs. Given these characteristics, we explore the problem of how developers of new generative AI software can release and price their technology. We first develop a comparison of two different models for a specific task with respect to user cost-effectiveness. We then model the pricing problem of generative AI software as a game between two different companies who sequentially release their models before users choose their preferred model for each task. Here, the price optimization problem becomes piecewise continuous where the companies must choose a subset of the tasks on which to be cost-effective and forgo revenue for the remaining tasks. In particular, we reveal the value of market information by showing that a company who deploys later after knowing their competitor's price can always secure cost-effectiveness on at least one task, whereas the company who is the first-to-market must price their model in a way that incentivizes higher prices from the latecomer in order to gain revenue. Most importantly, we find that if the different tasks are sufficiently similar, the first-to-market model may become cost-ineffective on all tasks regardless of how this technology is priced.
Secret Collusion among AI Agents: Multi-Agent Deception via Steganography
Recent advancements in generative AI suggest the potential for large-scale interaction between autonomous agents and humans across platforms such as the internet. While such interactions could foster productive cooperation, the ability of AI agents to circumvent security oversight raises critical multi-agent security problems, particularly in the form of unintended information sharing or undesirable coordination. In our work, we establish the subfield of secret collusion, a form of multi-agent deception, in which two or more agents employ steganographic methods to conceal the true nature of their interactions, be it communicative or otherwise, from oversight. We propose a formal threat model for AI agents communicating steganographically and derive rigorous theoretical insights about the capacity and incentives of large language models (LLMs) to perform secret collusion, in addition to the limitations of threat mitigation measures. We complement our findings with empirical evaluations demonstrating rising steganographic capabilities in frontier single and multi-agent LLM setups and examining potential scenarios where collusion may emerge, revealing limitations in countermeasures such as monitoring, paraphrasing, and parameter optimization. Our work is the first to formalize and investigate secret collusion among frontier foundation models, identifying it as a critical area in AI Safety and outlining a comprehensive research agenda to mitigate future risks of collusion between generative AI systems.
ColJailBreak: Collaborative Generation and Editing for Jailbreaking Text-to-Image Deep Generation
Commercial text-to-image deep generation models (e.g., DALL·E) can produce high-quality images based on input language descriptions. These models incorporate a black-box safety filter to prevent the generation of unsafe or unethical content, such as violent, criminal, or hateful imagery. Recent jailbreaking methods generate adversarial prompts capable of bypassing safety filters and producing unsafe content, exposing vulnerabilities in influential commercial models. However, once these adversarial prompts are identified, the safety filter can be updated to prevent the generation of unsafe images. In this work, we propose an effective, simple, and difficult-to-detect jailbreaking solution: generating safe content initially with normal text prompts and then editing the generations to embed unsafe content.
Anthropic Denies It Could Sabotage AI Tools During War
The Department of Defense alleges the AI developer could manipulate models in the middle of war. Company executives argue that's impossible. Anthropic cannot manipulate its generative AI model Claude once the US military has it running, an executive wrote in a court filing on Friday. The statement was made in response to accusations from the Trump administration about the company potentially tampering with its AI tools during war. "Anthropic has never had the ability to cause Claude to stop working, alter its functionality, shut off access, or otherwise influence or imperil military operations," Thiyagu Ramasamy, Anthropic's head of public sector, wrote.
OpenAI is developing a unified AI 'superapp' for desktop users
OpenAI is developing a unified desktop superapp that will integrate ChatGPT, Codex, and Atlas into a single application, according to PCWorld's coverage of The Wall Street Journal report. This consolidation aims to reduce service fragmentation and improve overall quality for users accessing OpenAI's various AI tools. The superapp represents a significant shift toward streamlined AI services, potentially making OpenAI's offerings more accessible and efficient for desktop users. It seems you'll soon be able to access most of OpenAI's services in one place on your computer.
The Download: OpenAI is building a fully automated researcher, and a psychedelic trial blind spot
Plus: OpenAI is also creating a super app. OpenAI has a new grand challenge: building an AI researcher--a fully automated agent-based system capable of tackling large, complex problems by itself. The San Francisco firm said the new goal will be its "north star" for the next few years. By September, the company plans to build "an autonomous AI research intern" that can take on a small number of specific research problems. The intern will be the precursor to the fully automated multi-agent system, which is slated to debut in 2028. In an exclusive interview this week, OpenAI's chief scientist, Jakub Pachocki, talked me through the plans.
OpenAI is throwing everything into building a fully automated researcher
OpenAI is refocusing its research efforts and throwing its resources into a new grand challenge. The San Francisco firm has set its sights on building what it calls an AI researcher, a fully automated agent-based system that will be able to go off and tackle large, complex problems by itself. OpenAI says that this new research goal will be its "North Star" for the next few years, pulling together multiple research strands, including work on reasoning models, agents, and interpretability.