Generative AI
Boost-and-Skip: A Simple Guidance-Free Diffusion for Minority Generation
Um, Soobin, Kim, Beomsu, Ye, Jong Chul
Minority samples are underrepresented instances located in low-density regions of a data manifold, and are valuable in many generative AI applications, such as data augmentation and creative content generation. Unfortunately, existing diffusion-based minority generators often rely on computationally expensive guidance dedicated to minority generation. To address this, here we present a simple yet powerful guidance-free approach called Boost-and-Skip for generating minority samples using diffusion models. The key advantage of our framework is that it requires only two minimal changes to standard generative processes: (i) variance-boosted initialization and (ii) timestep skipping. We highlight that these seemingly trivial modifications are supported by solid theoretical and empirical evidence, thereby effectively promoting the emergence of underrepresented minority features. Our comprehensive experiments demonstrate that Boost-and-Skip greatly enhances the capability of generating minority samples, even rivaling guidance-based state-of-the-art approaches while requiring significantly fewer computations.
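The two changes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `boosted_init`, `skipped_timesteps`, `gamma`, and `skip` are hypothetical, the denoising step is a toy stand-in for a trained diffusion model, and the sketch assumes that "variance-boosted initialization" means drawing the initial noise with a standard deviation larger than 1 and that "timestep skipping" means starting the reverse process a few steps below the final timestep.

```python
import random


def boosted_init(dim, gamma, rng):
    """Draw the initial latent from N(0, gamma^2 I).

    gamma > 1 boosts the variance relative to standard sampling
    (assumed reading of "variance-boosted initialization").
    """
    return [rng.gauss(0.0, gamma) for _ in range(dim)]


def skipped_timesteps(num_steps, skip):
    """Reverse-time schedule that omits the `skip` noisiest timesteps."""
    if not 0 <= skip < num_steps:
        raise ValueError("skip must be in [0, num_steps)")
    return list(range(num_steps - 1 - skip, -1, -1))


def sample(denoise_step, dim, num_steps, gamma, skip, seed=0):
    """Run a reverse process with boosted initial noise and a shortened schedule."""
    rng = random.Random(seed)
    x = boosted_init(dim, gamma, rng)
    for t in skipped_timesteps(num_steps, skip):
        x = denoise_step(x, t)
    return x


def toy_denoise_step(x, t):
    # Hypothetical placeholder for a trained model's reverse step:
    # it simply shrinks the sample toward the origin.
    return [0.9 * v for v in x]


if __name__ == "__main__":
    print(skipped_timesteps(10, 3))
    print(sample(toy_denoise_step, dim=4, num_steps=10, gamma=2.0, skip=3))
```

Setting `gamma=1.0` and `skip=0` recovers an ordinary sampling loop in this sketch, which is one way to see that no extra guidance term is involved.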
Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning
In this paper, we experiment with novelty-based variants of OpenAI-ES, the NS-ES and NSR-ES algorithms, and evaluate their effectiveness in training complex, transformer-based architectures designed for the problem of reinforcement learning, such as Decision Transformers. We also test whether we can accelerate the novelty-based training of these larger models by seeding the training with a pretrained model. With this, we build on our previous work, where we tested the ability of evolution strategies - specifically the aforementioned OpenAI-ES - to train the Decision Transformer architecture. The results were mixed. NS-ES showed progress, but it would clearly need many more iterations to yield interesting results. NSR-ES, on the other hand, proved quite capable of being straightforwardly used on larger models, since its performance appears as similar between the feed-forward model and the Decision Transformer as it was for OpenAI-ES in our previous work.
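For readers unfamiliar with novelty-based evolution strategies, the following sketch shows one common way a novelty score is computed and blended with reward. It is an illustration under assumptions, not the paper's code: the function names are hypothetical, novelty is taken as the mean distance to the k nearest behaviors in an archive, and the equal-weight blend stands in for the reward-plus-novelty signal of NSR-ES (which in practice is usually applied to rank-normalized values).

```python
import math


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries."""
    if not archive:
        return 0.0
    dists = sorted(math.dist(behavior, other) for other in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def nsr_score(reward, novelty_value, weight=0.5):
    """Blend reward and novelty; weight=0.5 averages the two equally."""
    return weight * reward + (1.0 - weight) * novelty_value


if __name__ == "__main__":
    archive = [(3.0, 4.0), (0.0, 1.0), (6.0, 8.0)]
    n = novelty((0.0, 0.0), archive, k=2)
    print(n, nsr_score(2.0, n))
```

With `weight=0.0` the score depends on novelty alone, which corresponds to the pure novelty-search setting in this simplified picture.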
Position: It's Time to Act on the Risk of Efficient Personalized Text Generation
Iofinova, Eugenia, Jovanovic, Andrej, Alistarh, Dan
The recent surge in high-quality open-sourced Generative AI text models (colloquially: LLMs), as well as efficient finetuning techniques, has opened the possibility of creating high-quality personalized models, i.e., models generating text attuned to a specific individual's needs and capable of credibly imitating their writing style by leveraging that person's own data to refine an open-source model. The technology to create such models is accessible to private individuals, and training and running such models can be done cheaply on consumer-grade hardware. These advancements are a huge gain for usability and privacy. This position paper argues, however, that these advancements also introduce new safety risks by making it practically feasible for malicious actors to impersonate specific individuals at scale, for instance for the purpose of phishing emails, based on small amounts of publicly available text. We further argue that these risks are complementary to - and distinct from - the much-discussed risks of other impersonation attacks such as image, voice, or video deepfakes, and are not adequately addressed by the larger research community, or the current generation of open- and closed-source models.
Free Agent in Agent-Based Mixture-of-Experts Generative AI Framework
Multi-agent systems commonly distribute tasks among specialized, autonomous agents, yet they often lack mechanisms to replace or reassign underperforming agents in real time. Inspired by the free-agency model of Major League Baseball, the Reinforcement Learning Free Agent (RLFA) algorithm introduces a reward-based mechanism to detect and remove agents exhibiting persistent underperformance and seamlessly insert more capable ones. Each agent internally uses a mixture-of-experts (MoE) approach, delegating incoming tasks to specialized sub-models under the guidance of a gating function. A primary use case is fraud detection, where RLFA promptly swaps out an agent whose detection accuracy dips below a preset threshold. A new agent is tested in a probationary mode, and upon demonstrating superior performance, fully replaces the underperformer. This dynamic, free-agency cycle ensures sustained accuracy, quicker adaptation to emerging threats, and minimal disruption to ongoing operations. By continually refreshing its roster of agents, the system fosters ongoing improvements and more resilient collaboration in multi-agent Generative AI environments.
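The replacement cycle described above can be sketched as follows. This is a minimal illustration, not the RLFA algorithm itself: the `Agent` class, the `review_roster` function, the rolling-window accuracy, and the default threshold of 0.8 are all assumptions introduced here, and the mixture-of-experts gating inside each agent is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    outcomes: list = field(default_factory=list)  # 1 = correct, 0 = wrong

    def record(self, correct):
        """Log the outcome of one handled task."""
        self.outcomes.append(1 if correct else 0)

    def accuracy(self, window=50):
        """Accuracy over the most recent `window` tasks (1.0 if none yet)."""
        recent = self.outcomes[-window:]
        return sum(recent) / len(recent) if recent else 1.0


def review_roster(incumbent, candidate, threshold=0.8, window=50):
    """Return the agent that should hold the slot after a review.

    The incumbent keeps the slot while its accuracy stays at or above the
    threshold. Otherwise a candidate that has been evaluated in probation
    takes over only if it has outperformed the incumbent.
    """
    if incumbent.accuracy(window) >= threshold:
        return incumbent
    if candidate.outcomes and candidate.accuracy(window) > incumbent.accuracy(window):
        return candidate
    return incumbent


if __name__ == "__main__":
    old = Agent("detector-a", [1, 0, 0, 0])
    new = Agent("detector-b", [1, 1, 1, 0])
    print(review_roster(old, new).name)
```

Requiring probationary outcomes before a swap keeps an untested agent from displacing the incumbent, which matches the abstract's description of testing the new agent before it fully replaces the underperformer.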
Semi-supervised Learning with Deep Generative Models
Durk P. Kingma, Shakir Mohamed, Danilo Jimenez Rezende, Max Welling
The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
How to use tasks and reminders inside ChatGPT
We've seen numerous new features added to ChatGPT in recent months, including updated models, web search capabilities, and the ability to remember what you say to it--and the latest software upgrade added to the AI bot by OpenAI makes it more useful as a general-purpose digital assistant. Beginning in beta form, and available initially to paying subscribers--the feature will reach everyone eventually, OpenAI says--ChatGPT Tasks lets you request the AI chatbot perform actions regularly on an automated schedule, or remind you about something in the future. Here's everything you need to know about it. "In this early beta, you can create scheduled tasks that enable ChatGPT to run automated prompts and proactively reach out to you on a scheduled basis," explains OpenAI. Tasks are available on the web, in the mobile apps, and in the macOS desktop app; OpenAI says the feature will make it to the Windows desktop app soon.
How Sam Altman sidestepped Elon Musk to win over Donald Trump
At President Donald Trump's inauguration, Sam Altman, CEO of OpenAI, was relegated to the overflow room while other tech billionaires like Elon Musk and Mark Zuckerberg took prime spots on the dais under the Capitol rotunda. But days earlier, before flying to Washington, Altman was on the phone with Trump, preparing an announcement that would outflank Musk and put Altman's company at the center of the new administration's agenda for artificial intelligence. On the 25-minute call, Altman appealed to Trump's love of a big story and of a big deal. He told the president-elect that the tech industry would achieve artificial general intelligence -- the hypothetical moment when technology matches human intelligence -- during the Trump administration, according to three people familiar with the call. And to get there before Chinese competitors, OpenAI, Oracle and SoftBank had completed a $100 billion deal to build data centers across the country.
PyPotteryInk: One-Step Diffusion Model for Sketch to Publication-ready Archaeological Drawings
Archaeological ceramics are a valuable source of information for reconstructing the customs, exchanges and social relationships of ancient populations, as well as for dating archaeological contexts (Sinopoli 1991; Peroni 1994; Steiner and Allason-Jones 2005; Vidale 2007; Orton and Hughes 2013; Hunt 2016). However, in order to turn a ceramic fragment into a rich source of scientific information, a long process of study and elaboration is required: once recovered in an excavation, the ceramic fragment is washed, catalogued, drawn and made ready for publication through the preparation of tables and figures that allow its correct interpretation and comparison with other archaeological contexts. Archaeological drawing is a fundamental and well-established tool in archaeological practice, and new technologies and methods are emerging to automate, standardise and speed up this process as much as possible. An example of this is the LAD (Laser Aided Profiler - Demján, Pavúk, and Roosevelt 2023), a tool that allows ceramic fragments to be 'drawn' quickly and accurately using a laser beam. Over time, however, many drawings were made by hand using traditional tools such as pencils and then had to be 'inked' and made ready for publication. Traditionally, this post-processing was done by hand with Indian ink, and nowadays digital drawing programmes are used. This process is, however, extremely time-consuming and can often discourage the publication of new contexts because of the time and resources needed for inking. Generative AI can help to achieve this task, using complex image translation operations. Today, AI is permeating business, creativity and everyday life (Elliott 2019; Le et al. 2020; Varghese, Raj, and Venkatesh 2022; Azatbekova
Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries
Huang, Jen-tse, Yan, Yuhang, Liu, Linqi, Wan, Yixin, Wang, Wenxuan, Chang, Kai-Wei, Lyu, Michael R.
The generation of incorrect images, such as depictions of people of color in Nazi-era uniforms by Gemini, frustrated users and harmed Google's reputation, motivating us to investigate the relationship between accurately reflecting factuality and promoting diversity and equity. In this study, we focus on 19 real-world statistics collected from authoritative sources. Using these statistics, we develop a checklist comprising objective and subjective queries to analyze behavior of large language models (LLMs) and text-to-image (T2I) models. Objective queries assess the models' ability to provide accurate world knowledge. In contrast, the design of subjective queries follows a key principle: statistical or experiential priors should not be overgeneralized to individuals, ensuring that models uphold diversity. These subjective queries are based on three common human cognitive errors that often result in social biases. We propose metrics to assess factuality and fairness, and formally prove the inherent trade-off between these two aspects. Results show that GPT-4o and DALL-E 3 perform notably well among six LLMs and four T2I models. Our code is publicly available at https://github.com/uclanlp/Fact-or-Fair.
Generating 3D Binding Molecules Using Shape-Conditioned Diffusion Models with Guidance
Chen, Ziqi, Peng, Bo, Zhai, Tianhua, Adu-Ampratwum, Daniel, Ning, Xia
Drug development is a critical but notoriously resource- and time-consuming process. In this manuscript, we develop a novel generative artificial intelligence (genAI) method, DiffSMol, to facilitate drug development. DiffSMol generates 3D binding molecules based on the shapes of known ligands. DiffSMol encapsulates geometric details of ligand shapes within pre-trained, expressive shape embeddings and then generates new binding molecules through a diffusion model. DiffSMol further modifies the generated 3D structures iteratively via shape guidance to better resemble the ligand shapes. It also tailors the generated molecules toward optimal binding affinities under the guidance of protein pockets. Here, we show that DiffSMol outperforms the state-of-the-art methods on benchmark datasets. When generating binding molecules resembling ligand shapes, DiffSMol with shape guidance achieves a success rate of 61.4%, substantially outperforming the best baseline (11.2%), while producing molecules with novel molecular graph structures. DiffSMol with pocket guidance also outperforms the best baseline in binding affinities by 13.2%, and even by 17.7% when combined with shape guidance. Case studies for two critical drug targets demonstrate very favorable physicochemical and pharmacokinetic properties of the generated molecules, and thus the potential of DiffSMol in developing promising drug candidates.