automated discovery
X-Teaming Evolutionary M2S: Automated Discovery of Multi-turn to Single-turn Jailbreak Templates
Kim, Hyunjun, Ha, Junwoo, Yu, Sangyoon, Park, Haon
Multi-turn-to-single-turn (M2S) compresses iterative red-teaming into one structured prompt, but prior work relied on a handful of manually written templates. We present X-Teaming Evolutionary M2S, an automated framework that discovers and optimizes M2S templates through language-model-guided evolution. The system pairs smart sampling from 12 sources with an LLM-as-judge inspired by StrongREJECT and records fully auditable logs. Maintaining selection pressure by setting the success threshold to $θ= 0.70$, we obtain five evolutionary generations, two new template families, and 44.8% overall success (103/230) on GPT-4.1. A balanced cross-model panel of 2,500 trials (judge fixed) shows that structural gains transfer but vary by target; two models score zero at the same threshold. We also find a positive coupling between prompt length and score, motivating length-aware judging. Our results demonstrate that structure-level search is a reproducible route to stronger single-turn probes and underscore the importance of threshold calibration and cross-model evaluation. Code, configurations, and artifacts are available at https://github.com/hyunjun1121/M2S-x-teaming.
Automated Discovery of Adaptive Attacks on Adversarial Defenses
Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense's inner workings, or to approaches based on ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that adaptive attacks are composed from a set of reusable building blocks that can be formalized in a search space and used to automatically discover attacks for unknown defenses. We evaluated our approach on 24 adversarial defenses and show that it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses: our tool discovered significantly stronger attacks by producing 3.0%-50.8%
Automated Discovery of Functional Actual Causes in Complex Environments
Chuck, Caleb, Vaidyanathan, Sankaran, Giguere, Stephen, Zhang, Amy, Jensen, David, Niekum, Scott
Reinforcement learning (RL) algorithms often struggle to learn policies that generalize to novel situations due to issues such as causal confusion, overfitting to irrelevant factors, and failure to isolate control of state factors. These issues stem from a common source: a failure to accurately identify and exploit state-specific causal relationships in the environment. While some prior works in RL aim to identify these relationships explicitly, they rely on informal domain-specific heuristics such as spatial and temporal proximity. Actual causality offers a principled and general framework for determining the causes of particular events. However, existing definitions of actual cause often attribute causality to a large number of events, even if many of them rarely influence the outcome. Prior work on actual causality proposes normality as a solution to this problem, but its existing implementations are challenging to scale to complex and continuous-valued RL environments. This paper introduces functional actual cause (FAC), a framework that uses context-specific independencies in the environment to restrict the set of actual causes. We additionally introduce Joint Optimization for Actual Cause Inference (JACI), an algorithm that learns from observational data to infer functional actual causes. We demonstrate empirically that FAC agrees with known results on a suite of examples from the actual causality literature, and JACI identifies actual causes with significantly higher accuracy than existing heuristic methods in a set of complex, continuous-valued environments.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York (0.04)
- (3 more...)
- Law (0.67)
- Government (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
- (3 more...)
Automated discovery of interpretable gravitational-wave population models
Wong, Kaze W. K, Cranmer, Miles
We present an automatic approach to discover analytic population models for gravitational-wave (GW) events from data. As more gravitational-wave (GW) events are detected, flexible models such as Gaussian Mixture Models have become more important in fitting the distribution of GW properties due to their expressivity. However, flexible models come with many parameters that lack physical motivation, making interpreting the implication of these models challenging. In this work, we demonstrate symbolic regression can complement flexible models by distilling the posterior predictive distribution of such flexible models into interpretable analytic expressions. We recover common GW population models such as a power-law-plus-Gaussian, and find a new empirical population model which combines accuracy and simplicity. This demonstrates a strategy to automatically discover interpretable population models in the ever-growing GW catalog, which can potentially be applied to other astrophysical phenomena.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey (0.04)
- North America > United States > Maryland > Baltimore (0.04)
Automated Discovery of Adaptive Attacks on Adversarial Defenses
Yao, Chengyuan, Bielik, Pavol, Tsankov, Petar, Vechev, Martin
To address this challenge, two recent works approach the problem from different perspectives. Tramer et al. (2020) Reliable evaluation of adversarial defenses is a outlines an approach for manually crafting adaptive attacks challenging task, currently limited to an expert that exploit the weak points of each defense. Here, a domain who manually crafts attacks that exploit the defense's expert starts with an existing attack, such as PGD (Madry inner workings, or to approaches based et al., 2018) (denoted as - in Figure 1), and adapts it based on on ensemble of fixed attacks, none of which may knowledge of the defense's inner workings. Common modifications be effective for the specific defense at hand. Our include: (i) tuning attack parameters (e.g., number key observation is that custom attacks are composed of steps), (ii) replacing network components to simplify the from a set of reusable building blocks, attack (e.g., removing randomization or non-differentiable such as fine-tuning relevant attack parameters, network components), and (iii) replacing the loss function optimized transformations, and custom loss functions.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (11 more...)
Toward Automated Discovery in the Biological Sciences
Knowledge discovery programs in the biological sciences require flexibility in the use of symbolic data and semantic information. Because of the volume of nonnumeric, as well as numeric, data, the programs must be able to explore a large space of possibly interesting relationships to discover those that are novel and interesting. Thus, the framework for the discovery program must facilitate proposing and selecting the next task to perform and performing the selected tasks. The framework we describe, called the agenda- and justificationbased framework, has several properties that are desirable in semiautonomous discovery systems: It provides a mechanism for estimating the plausibility of tasks, it uses heuristics to propose and perform tasks, and it facilitates the encoding of general discovery strategies and the use of background knowledge. We have implemented the framework and our heuristics in a prototype program, HAMB, and have evaluated them in the domain of protein crystallization.
Towards Automated Discovery of Geometrical Theorems in GeoGebra
Kovács, Zoltán, Yu, Jonathan H.
We describe a prototype of a new experimental GeoGebra command and tool Discover that analyzes geometric figures for salient patterns, properties, and theorems. This tool is a basic implementation of automated discovery in elementary planar geometry. The paper focuses on the mathematical background of the implementation, as well as methods to avoid combinatorial explosion when storing the interesting properties of a geometric figure.
- North America > United States > Maryland > Baltimore (0.04)
- Europe > United Kingdom (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- (4 more...)
Intrinsically Motivated Exploration for Automated Discovery of Patterns in Morphogenetic Systems
Reinke, Chris, Etcheverry, Mayalen, Oudeyer, Pierre-Yves
Exploration is a cornerstone both for machine learning algorithms and for science in general to discover novel solutions, phenomena and behaviors. Intrinsically motivated goal exploration processes (IMGEPs) were shown to enable autonomous agents to efficiently explore the diversity of effects they can produce on their environment. With IMGEPs, agents self-define their own experiments by imagining goals, then try to achieve them by leveraging their past discoveries. Progressively they learn which goals are achievable. IMGEPs were shown to enable efficient discovery and learning of diverse repertoires of skills in high-dimensional robots. In this article, we show that the IMGEP framework can also be used in an entirely different application area: automated discovery of self-organized patterns in complex morphogenetic systems. We also introduce a new IMGEP algorithm where goal representations are learned online and incrementally (past approaches used precollected training data with batch learning). For experimentation, we use Lenia, a continuous game-of-life cellular automaton. We study how IMGEPs enable to discover a variety of complex self-organized visual patterns. We compare random search and goal exploration methods with hand-defined, pretrained and online learned goal spaces. The results show that goal exploration methods identify more diverse patterns compared to random explorations. Moreover, the online learned goal spaces allow to successfully discover interesting patterns similar to the ones manually identified by human experts. Our results exemplify the ability of IMGEPs to discover novel structures and patterns in complex systems. We are optimistic that their application will aid the understanding and discovery of new knowledge in various domains of science and engineering.