scenario generation
Dark Speculation: Combining Qualitative and Quantitative Understanding in Frontier AI Risk Analysis
Carpenter, Daniel, Ezell, Carson, Mallick, Pratyush, Westray, Alexandria
Estimating catastrophic harms from frontier AI is hindered by deep ambiguity: many of its risks are not only unobserved but unanticipated by analysts. The central limitation of current risk analysis is the inability to populate the $\textit{catastrophic event space}$, or the set of potential large-scale harms to which probabilities might be assigned. This intractability is worsened by the $\textit{Lucretius problem}$, or the tendency to infer future risks only from past experience. We propose a process of $\textit{dark speculation}$, in which systematically generating and refining catastrophic scenarios ("qualitative" work) is coupled with estimating their likelihoods and associated damages (quantitative underwriting analysis). The idea is neither to predict the future nor to enable insurance for its own sake, but to use narrative and underwriting tools together to generate probability distributions over outcomes. We formalize this process using a simplified catastrophic Lévy stochastic framework and propose an iterative institutional design in which (1) speculation (including scenario planning) generates detailed catastrophic event narratives, (2) insurance underwriters assign probabilistic and financial parameters to these narratives, and (3) decision-makers synthesize the results into summary statistics to inform judgment. Analysis of the model reveals the value of (a) maintaining independence between speculation and underwriting, (b) analyzing multiple risk categories in parallel, and (c) generating "thick" catastrophic narratives rich in causal (counterfactual) and mitigative detail. While the approach cannot eliminate deep ambiguity, it offers a systematic way to reason about extreme, low-probability events in frontier AI, tempering complacency and overreaction. The framework is adaptable for iterative use and can be further augmented with AI systems.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > Alabama (0.04)
- (2 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Banking & Finance > Insurance (1.00)
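The speculation → underwriting → synthesis loop formalized in the abstract can be sketched as a toy compound-Poisson Monte Carlo. This is a minimal illustration, not the paper's model: the scenario names, annual frequencies (Poisson rates), and lognormal severity parameters below are invented stand-ins for what underwriters would supply in stage (2).

```python
import random
import statistics

# Stage 1 (speculation): each narrative becomes a named scenario class.
# Stage 2 (underwriting): each class receives a hypothetical annual
# frequency (Poisson rate) and a heavy-tailed lognormal severity.
SCENARIOS = {
    "model-enabled biorisk":  {"rate": 0.02, "mu": 24.0, "sigma": 1.5},
    "critical-infra failure": {"rate": 0.05, "mu": 22.0, "sigma": 1.2},
    "large-scale fraud wave": {"rate": 0.10, "mu": 20.0, "sigma": 1.0},
}

def simulate_annual_loss(rng: random.Random) -> float:
    """One draw of aggregate annual loss across all scenario classes."""
    total = 0.0
    for params in SCENARIOS.values():
        # Poisson event count via exponential inter-arrival times.
        n, t = 0, rng.expovariate(params["rate"])
        while t < 1.0:
            n += 1
            t += rng.expovariate(params["rate"])
        for _ in range(n):
            total += rng.lognormvariate(params["mu"], params["sigma"])
    return total

def synthesize(n_years: int = 20000, seed: int = 0) -> dict:
    """Stage 3: summary statistics for decision-makers."""
    rng = random.Random(seed)
    losses = sorted(simulate_annual_loss(rng) for _ in range(n_years))
    return {
        "p_any_event": sum(l > 0 for l in losses) / n_years,
        "mean_loss": statistics.fmean(losses),
        "var_99": losses[int(0.99 * n_years)],  # 99% value-at-risk
    }
```

Keeping `SCENARIOS` (speculation) and the parameter values (underwriting) in separate artifacts mirrors the independence the paper's analysis recommends.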
VLM as Strategist: Adaptive Generation of Safety-critical Testing Scenarios via Guided Diffusion
Wu, Xinzheng, Chen, Junyi, Zhong, Naiting, Shen, Yong
Autonomous driving technology is spearheading a transformation in the global automotive industry, and its safe and reliable implementation is the core prerequisite for large-scale adoption (Ren et al., 2025). Comprehensive testing and evaluation of autonomous driving systems (ADSs) are essential to ensuring their safety, in which the identification and generation of safety-critical scenarios represent a core challenge (Yang et al., 2025). "Safety-critical scenarios" specifically refer to rare driving situations with potentially high risks (Ding et al., 2023). Conducting tests under such scenarios enables effective evaluation of the ADSs' safety performance, as well as the clarification and iterative refinement of their Operational Design Domain (ODD). However, due to the rarity of safety-critical scenarios in naturalistic driving environments (Feng et al., 2023), real-world road testing is inefficient and cost-prohibitive, making it unsuitable for large-scale testing of high-level ADSs. As a more efficient and practical solution, simulation-based testing has garnered significant industrial and scholarly attention (Sun et al., 2022). In recent years, engineers in industry have generally extracted safety-critical testing scenarios by directly replaying vehicle-collected data in simulation environments (Liu et al., 2024), while some researchers achieve accelerated sampling of safety-critical scenarios through optimization-based search within a predefined scenario parameter space (Wu et al., 2024, 2026). However, the background vehicles (BVs) in the safety-critical testing scenarios generated by these methods exhibit fixed behaviors and cannot dynamically respond to the actions of the vehicle under test (VUT).
As a remedy, some other studies have introduced reinforcement learning to train adversarial BV driver models, thereby constructing naturalistic adversarial driving environments (NADE) (Feng et al., 2021) or evolving scenarios (Ma et al., 2024; Wu et al., 2025).
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- Asia > Singapore (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Research Report (1.00)
- Workflow (0.94)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
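The optimization-based search mentioned in the text (sampling safety-critical scenarios from a predefined parameter space) can be illustrated with a minimal random-search sketch. The two-parameter cut-in scenario and its time-to-collision (TTC) criticality surrogate are invented for illustration; they are not the cited papers' formulations.

```python
import random

def time_to_collision(gap_m: float, closing_speed_mps: float) -> float:
    """Surrogate criticality metric: smaller TTC = more safety-critical."""
    if closing_speed_mps <= 0:
        return float("inf")
    return gap_m / closing_speed_mps

def search_critical_scenarios(n_iter=5000, ttc_threshold=1.5, seed=0):
    """Random search over a hypothetical 2-D scenario parameter space
    (initial gap, closing speed), keeping parameterizations whose TTC
    falls below the safety-critical threshold."""
    rng = random.Random(seed)
    critical = []
    for _ in range(n_iter):
        gap = rng.uniform(2.0, 60.0)    # initial gap, metres
        speed = rng.uniform(0.5, 15.0)  # closing speed, m/s
        if time_to_collision(gap, speed) < ttc_threshold:
            critical.append((gap, speed))
    return critical
```

Note the limitation the text points out: each retained tuple fixes the BV's behavior up front, so the generated scenario cannot react to the VUT.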
Learning from Risk: LLM-Guided Generation of Safety-Critical Scenarios with Prior Knowledge
Wang, Yuhang, Huang, Heye, Xu, Zhenhua, Sun, Kailai, Guo, Baoshen, Zhao, Jinhua
Autonomous driving faces critical challenges in rare long-tail events and complex multi-agent interactions, which are scarce in real-world data yet essential for robust safety validation. This paper presents a high-fidelity scenario generation framework that integrates a conditional variational autoencoder (CVAE) with a large language model (LLM). The CVAE encodes historical trajectories and map information from large-scale naturalistic datasets to learn latent traffic structures, enabling the generation of physically consistent base scenarios. This knowledge-driven optimization balances realism with controllability, ensuring that generated scenarios remain both plausible and risk-sensitive. Extensive experiments in CARLA and SMARTS demonstrate that our framework substantially increases the coverage of high-risk and long-tail events, improves consistency between simulated and real-world traffic distributions, and exposes autonomous driving systems to interactions that are significantly more challenging than those produced by existing rule- or data-driven methods. These results establish a new pathway for safety validation, enabling principled stress-testing of autonomous systems under rare but consequential events.
Introduction
The safety and reliability of autonomous driving depend on rigorous validation under diverse test conditions, especially in high-risk, highly interactive, and safety-critical scenarios (Wang et al., 2021; Hossain, 2025). Yet such events are extremely scarce in real-world datasets, creating a persistent gap between development testing and deployment needs. Simulation-based methods provide an effective alternative by generating large numbers of rare and adversarial environments, thereby alleviating data scarcity and enabling controlled safety evaluation (Huang et al., 2020).
To address these challenges, this paper proposes a risk knowledge-guided traffic scene generation framework that integrates a Conditional Variational Autoencoder (CVAE) with a Large Language Model (LLM). Unlike prior works that merely sample or replay specific risky cases, the proposed framework establishes a general and controllable pipeline for synthesizing diverse safety-critical scenarios under varying risk conditions. The CVAE learns latent spatiotemporal representations from real-world trajectories and maps to generate physically coherent base scenes, while the LLM acts as a knowledge-driven controller that interprets scene semantics, analyzes multi-agent risk interactions, and dynamically adjusts optimization objectives to guide the generation toward desired levels of behavioral complexity and risk exposure.
- Asia > Singapore (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > China (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
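The generator/controller split described in the entry above can be sketched in a few lines. Everything here is a stand-in: `base_scene` replaces the CVAE decoder with random sampling, `risk_score` is a toy proximity-based risk measure, and `knowledge_controller`-style adjustment replaces the LLM's dynamic objective tuning. None of these are the paper's actual components.

```python
import random

def base_scene(rng, aggressiveness=1.0):
    """Stand-in for the CVAE decoder: a scene is a list of agents as
    (distance-to-ego m, relative speed m/s; negative = closing) pairs."""
    return [(rng.uniform(5, 50), rng.uniform(-10 * aggressiveness, 10))
            for _ in range(4)]

def risk_score(scene):
    """Toy multi-agent risk: close, fast-approaching agents score high."""
    return sum(max(0.0, -v) / max(d, 1.0) for d, v in scene)

def generate(target_risk=1.0, steps=100, seed=0):
    """Controller loop: widen the adversarial speed range when scenes
    are too tame, shrink it when they overshoot the target risk, and
    return the scene closest to the desired risk exposure."""
    rng = random.Random(seed)
    aggressiveness, history = 1.0, []
    for _ in range(steps):
        scene = base_scene(rng, aggressiveness)
        r = risk_score(scene)
        history.append((r, scene))
        aggressiveness *= 1.1 if r < target_risk else 0.9
        aggressiveness = min(max(aggressiveness, 0.1), 3.0)
    return min(history, key=lambda rs: abs(rs[0] - target_risk))
```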
Towards AI-Assisted Generation of Military Training Scenarios
Hans, Soham, Ustun, Volkan, Nye, Benjamin, Sterrett, James, Green, Matthew
Achieving expert-level performance in simulation-based training relies on the creation of complex, adaptable scenarios, a traditionally laborious and resource-intensive process. Although prior research explored scenario generation for military training, pre-LLM AI tools struggled to generate sufficiently complex or adaptable scenarios. This paper introduces a multi-agent, multi-modal reasoning framework that leverages Large Language Models (LLMs) to generate critical training artifacts, such as Operations Orders (OPORDs). We structure our framework by decomposing scenario generation into a hierarchy of subproblems, and for each one, defining the role of the AI tool: (1) generating options for a human author to select from, (2) producing a candidate product for human approval or modification, or (3) generating textual artifacts fully automatically. Our framework employs specialized LLM-based agents to address distinct subproblems. Each agent receives input from preceding subproblem agents, integrating both text-based scenario details and visual information (e.g., map features, unit positions) and applies specialized reasoning to produce appropriate outputs. Subsequent agents process these outputs sequentially, preserving logical consistency and ensuring accurate document generation. This multi-agent strategy overcomes the limitations of basic prompting or single-agent approaches when tackling such highly complex tasks. We validate our framework through a proof-of-concept that generates the scheme of maneuver and movement section of an OPORD while estimating map positions and movements, as a precursor demonstrating its feasibility and accuracy. Our results demonstrate the potential of LLM-driven multi-agent systems to generate coherent, nuanced documents and adapt dynamically to changing conditions, advancing automation in scenario generation for military training.
- North America > United States > Kansas > Leavenworth County > Leavenworth (0.04)
- North America > United States > California > San Diego County > Vista (0.04)
- North America > United States > Florida > Orange County > Orlando (0.04)
- Workflow (0.93)
- Research Report > New Finding (0.54)
- Government > Military > Army (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
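The sequential agent pipeline and the three human-in-the-loop roles described above can be sketched as follows. The subproblem names and the lambda "agents" are hypothetical placeholders for real LLM calls; the point of the sketch is the control flow: each agent reads all preceding outputs, and the `approve` hook models human selection or modification for the "options" and "candidate" modes.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    mode: str                       # "options" | "candidate" | "automatic"
    produce: Callable[[dict], str]  # stub standing in for an LLM call

def run_pipeline(agents, context=None, approve=lambda name, out: out):
    """Run agents in order. Each agent sees the outputs of all earlier
    agents via the shared context, which keeps later documents logically
    consistent with earlier ones. For 'options'/'candidate' modes the
    approve() callback may select or edit the artifact before it is
    committed to the context."""
    context = dict(context or {})
    for agent in agents:
        artifact = agent.produce(context)
        if agent.mode in ("options", "candidate"):
            artifact = approve(agent.name, artifact)
        context[agent.name] = artifact
    return context

# Illustrative three-stage decomposition (names are hypothetical):
pipeline = [
    Agent("situation", "options",
          lambda ctx: "enemy battalion holds the river crossing"),
    Agent("scheme_of_maneuver", "candidate",
          lambda ctx: f"flank east, given: {ctx['situation']}"),
    Agent("opord_text", "automatic",
          lambda ctx: f"OPORD: {ctx['scheme_of_maneuver']}"),
]
```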
AgentSUMO: An Agentic Framework for Interactive Simulation Scenario Generation in SUMO via Large Language Models
Jeong, Minwoo, Chang, Jeeyun, Yoon, Yoonjin
The growing complexity of urban mobility systems has made traffic simulation indispensable for evidence-based transportation planning and policy evaluation. However, despite the analytical capabilities of platforms such as the Simulation of Urban MObility (SUMO), their application remains largely confined to domain experts. Developing realistic simulation scenarios requires expertise in network construction, origin-destination modeling, and parameter configuration for policy experimentation, creating substantial barriers for non-expert users such as policymakers, urban planners, and city officials. Moreover, the requests expressed by these users are often incomplete and abstract, typically articulated as high-level objectives, which are not well aligned with the imperative, sequential workflows employed in existing language-model-based simulation frameworks. To address these challenges, this study proposes AgentSUMO, an agentic framework for interactive simulation scenario generation via large language models. AgentSUMO departs from imperative, command-driven execution by introducing an adaptive reasoning layer that interprets user intents, assesses task complexity, infers missing parameters, and formulates executable simulation plans. The framework is structured around two complementary components, the Interactive Planning Protocol, which governs reasoning and user interaction, and the Model Context Protocol, which manages standardized communication and orchestration among simulation tools. Through this design, AgentSUMO converts abstract policy objectives into executable simulation scenarios. Experiments on urban networks in Seoul and Manhattan demonstrate that the agentic workflow achieves substantial improvements in traffic flow metrics while maintaining accessibility for non-expert users, successfully bridging the gap between policy goals and executable simulation workflows.
- Asia > South Korea > Seoul > Seoul (0.25)
- North America > United States > New Jersey (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Workflow (1.00)
- Research Report > New Finding (0.93)
- Transportation > Ground > Road (1.00)
- Government (1.00)
- Transportation > Infrastructure & Services (0.93)
- Information Technology > Security & Privacy (0.93)
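The adaptive reasoning layer described above (interpret intent, assess complexity, infer missing parameters, emit an executable plan) can be sketched with rule-based stand-ins. The keyword rules, default values, and plan fields below are invented for illustration; in AgentSUMO this reasoning is performed by an LLM, not a lookup table.

```python
# Hypothetical defaults filled in when the user's request omits them.
DEFAULTS = {"duration_s": 3600, "demand_scale": 1.0, "network": "seoul_cbd"}

def interpret_intent(request: str) -> dict:
    """Stand-in for the LLM reasoning layer: map an abstract policy
    objective to concrete simulation parameters via keyword rules."""
    params = {}
    text = request.lower()
    if "congestion" in text or "rush hour" in text:
        params["demand_scale"] = 1.5
    if "bus lane" in text:
        params["policy"] = "dedicated_bus_lane"
    return params

def formulate_plan(request: str) -> dict:
    """Merge inferred parameters with defaults and emit an executable
    simulation plan, recording which values were assumed."""
    inferred = interpret_intent(request)
    plan = {**DEFAULTS, **inferred}
    plan["assumed_defaults"] = [k for k in DEFAULTS if k not in inferred]
    # Crude complexity proxy: policy experiments need extra setup steps.
    plan["complexity"] = "high" if "policy" in inferred else "low"
    plan["steps"] = ["build_network", "generate_demand", "run_simulation"]
    return plan
```

Surfacing `assumed_defaults` back to the user is one way an interactive protocol can confirm inferred parameters before execution.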
Adv-BMT: Bidirectional Motion Transformer for Safety-Critical Traffic Scenario Generation
Liu, Yuxin, Peng, Zhenghao, Cui, Xuanhao, Zhou, Bolei
Scenario-based testing is essential for validating the performance of autonomous driving (AD) systems. However, such testing is limited by the scarcity of long-tailed, safety-critical scenarios in existing datasets collected in the real world. To tackle the data issue, we propose the Adv-BMT framework, which augments real-world scenarios with diverse and realistic adversarial traffic interactions. The core component of Adv-BMT is a bidirectional motion transformer (BMT) model to perform inverse traffic motion predictions, which takes agent information in the last time step of the scenario as input, and reconstructs the traffic in the inverse of chronological order until the initial time step. The Adv-BMT framework is a two-staged pipeline: it first conducts adversarial initializations and then inverse motion predictions. Different from previous work, we do not need any collision data for pretraining, and are able to generate realistic and diverse collision interactions. Our experimental results validate the quality of generated collision scenarios by Adv-BMT: training on our augmented dataset reduces episode collision rates by 20%. Demo and code are available at: https://metadriverse.github.io/adv-bmt/.
- Research Report (0.82)
- Workflow (0.68)
- Information Technology (0.67)
- Transportation > Ground > Road (0.35)
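The two-stage idea above (adversarially initialize a final collision state, then predict motion backwards to the initial time step) can be sketched with a 1-D constant-acceleration kinematic stand-in for the BMT model. The dynamics, the rear-end configuration, and all parameter values are illustrative assumptions.

```python
def reverse_rollout(final_pos, final_vel, accel, steps, dt=0.1):
    """Stage 2 stand-in: reconstruct a trajectory in inverse
    chronological order from the last-time-step state (the collision
    point), inverting a semi-implicit Euler step:
    x_{t-1} = x_t - v_t*dt, then v_{t-1} = v_t - a*dt."""
    traj = [(final_pos, final_vel)]
    x, v = final_pos, final_vel
    for _ in range(steps):
        x -= v * dt
        v -= accel * dt
        traj.append((x, v))
    traj.reverse()  # return in forward chronological order
    return traj

def adversarial_initialization(ego_pos):
    """Stage 1 stand-in: place the adversary in contact with the ego at
    the final time step (a rear-end configuration, chosen arbitrarily).
    Returns (position m, speed m/s)."""
    return ego_pos - 4.5, 8.0
```

Rolling out backwards from a guaranteed-collision end state is what removes the need for collision data up front: every generated trajectory ends in the specified contact configuration by construction.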
Driving scenario generation and evaluation using a structured layer representation and foundational models
Hubert, Arthur, Elghazaly, Gamal, Frank, Raphaël
Rare and challenging driving scenarios are critical for autonomous vehicle development. Since they are difficult to encounter, simulating or generating them using generative models is a popular approach. Following previous efforts to structure driving scenario representations in a layer model, we propose a structured five-layer model to improve the evaluation and generation of rare scenarios. We use this model alongside large foundational models to generate new driving scenarios using a data augmentation strategy. Unlike previous representations, our structure introduces subclasses and characteristics for every agent of the scenario, allowing us to compare them using an embedding specific to our layer-model. We study and adapt two metrics to evaluate the relevance of a synthetic dataset in the context of a structured representation: the diversity score estimates how different the scenarios of a dataset are from one another, while the originality score calculates how similar a synthetic dataset is to a real reference set. This paper showcases both metrics in different generation setups, as well as a qualitative evaluation of synthetic videos generated from structured scenario descriptions. The code and extended results can be found at https://github.com/Valgiz/5LMSG.
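The two metrics lend themselves to a compact sketch over scenario embeddings: diversity as mean pairwise distance within the synthetic set, and originality as mean nearest-neighbour distance from each synthetic embedding to the real reference set (reading "higher = less similar to the reference" — an assumption, since the abstract leaves the direction open). The paper's layer-model-specific embedding is abstracted to plain Euclidean vectors here.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def diversity_score(synthetic):
    """Mean pairwise distance within the synthetic set:
    higher = scenarios differ more from one another."""
    pairs = list(combinations(synthetic, 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

def originality_score(synthetic, reference):
    """Mean nearest-neighbour distance from each synthetic scenario
    embedding to the real reference set (assumed direction:
    higher = less similar to anything already observed)."""
    return sum(min(euclidean(s, r) for r in reference)
               for s in synthetic) / len(synthetic)
```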
Multi-Agent Scenario Generation in Roundabouts with a Transformer-enhanced Conditional Variational Autoencoder
Li, Li, Brinkmann, Tobias, Temmen, Till, Eisenbarth, Markus, Andert, Jakob
With the increasing integration of intelligent driving functions into serial-produced vehicles, ensuring their functionality and robustness poses greater challenges. Compared to traditional road testing, scenario-based virtual testing offers significant advantages in terms of time and cost efficiency, reproducibility, and exploration of edge cases. We propose a Transformer-enhanced Conditional Variational Autoencoder (CVAE-T) model for generating multi-agent traffic scenarios in roundabouts, which are characterized by high vehicle dynamics and complex layouts, yet remain relatively underexplored in current research. The results show that the proposed model can accurately reconstruct original scenarios and generate realistic, diverse synthetic scenarios. In addition, two key performance indicators (KPIs) are employed to evaluate the interactive behavior in the generated scenarios. Analysis of the latent space reveals partial disentanglement, with several latent dimensions exhibiting distinct and interpretable effects on scenario attributes such as vehicle entry timing, exit timing, and velocity profiles. The results demonstrate the model's capability to generate scenarios for the validation of intelligent driving functions involving multi-agent interactions, as well as to augment data for their development and iterative improvement.
- North America > United States > Michigan (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
- Europe > France (0.04)
SynAD: Enhancing Real-World End-to-End Autonomous Driving Models through Synthetic Data Integration
Kim, Jongsuk, Lee, Jaeyoung, Han, Gyojin, Lee, Dongjae, Jeong, Minki, Kim, Junmo
Recent advancements in deep learning and the availability of high-quality real-world driving datasets have propelled end-to-end autonomous driving (E2E AD). Despite this progress, relying solely on real-world data limits the variety of driving scenarios for training. Synthetic scenario generation has emerged as a promising solution to enrich the diversity of training data; however, its application within E2E AD models remains largely unexplored. This is primarily due to the absence of a designated ego vehicle and the associated sensor inputs, such as camera or LiDAR, typically provided in real-world scenarios. To address this gap, we introduce SynAD, the first framework designed to enhance real-world E2E AD models using synthetic data. Our method designates the agent with the most comprehensive driving information as the ego vehicle in a multi-agent synthetic scenario. We further project path-level scenarios onto maps and employ a newly developed Map-to-BEV Network to derive bird's-eye-view features without relying on sensor inputs. Finally, we devise a training strategy that effectively integrates these map-based synthetic data with real driving data. Experimental results demonstrate that SynAD effectively integrates all components and notably enhances safety performance. By bridging synthetic scenario generation and E2E AD, SynAD paves the way for more comprehensive and robust autonomous driving models.
- Transportation > Ground > Road (0.91)
- Information Technology > Robotics & Automation (0.82)
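The ego-designation step above (pick the agent with the most comprehensive driving information) can be sketched as an argmax over a per-agent information score. The score's ingredients here, trajectory coverage plus a count of nearby interacting agents, are illustrative assumptions, not SynAD's actual criterion.

```python
def information_score(agent, others, interaction_radius=20.0):
    """Toy score: longer observed trajectories and more agents passing
    within the interaction radius mean richer driving information."""
    traj = agent["trajectory"]
    coverage = len(traj)
    interactions = 0
    for other in others:
        # Count an interaction if the two agents are ever close at the
        # same time step (trajectories compared index-by-index).
        for (x1, y1), (x2, y2) in zip(traj, other["trajectory"]):
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= interaction_radius ** 2:
                interactions += 1
                break
    return coverage + 10 * interactions

def designate_ego(agents):
    """Pick the ego vehicle for a multi-agent synthetic scenario."""
    return max(agents, key=lambda a: information_score(
        a, [o for o in agents if o is not a]))
```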
Controllable Collision Scenario Generation via Collision Pattern Prediction
Chen, Pin-Lun, Kung, Chi-Hsi, Chang, Che-Han, Chiu, Wei-Chen, Chen, Yi-Ting
Evaluating the safety of autonomous vehicles (AVs) requires diverse, safety-critical scenarios, with collisions being especially important yet rare and unsafe to collect in the real world. Therefore, the community has been focusing on generating safety-critical scenarios in simulation. However, controlling attributes such as collision type and time-to-accident (TTA) remains challenging. We introduce a new task called controllable collision scenario generation, where the goal is to produce trajectories that realize a user-specified collision type and TTA, to investigate the feasibility of automatically generating desired collision scenarios. To support this task, we present COLLIDE, a large-scale collision scenario dataset constructed by transforming real-world driving logs into diverse collisions, balanced across five representative collision types and different TTA intervals. We propose a framework that predicts Collision Pattern, a compact and interpretable representation that captures the spatial configuration of the ego and the adversarial vehicles at impact, before rolling out full adversarial trajectories. Experiments show that our approach outperforms strong baselines in both collision rate and controllability. Furthermore, generated scenarios consistently induce higher planner failure rates, revealing limitations of existing planners. We demonstrate that these scenarios fine-tune planners for robustness improvements, contributing to safer AV deployment in different collision scenarios. Project page is available at https://submit-user.github.io/anon2025
- North America > United States > Indiana > Monroe County > Bloomington (0.04)
- Asia > Taiwan (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (0.68)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
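A compact, interpretable "collision pattern" of the kind this entry describes can be sketched as a discretization of the adversary's pose relative to the ego at impact. The five-way typology, the ego-frame convention, and the angle bins below are illustrative assumptions, not the paper's exact definition of COLLIDE's collision types.

```python
import math

def collision_pattern(rel_x, rel_y, rel_heading_deg):
    """Classify the impact configuration from the adversary's position
    in the ego frame (+x forward, +y left) and its heading relative to
    the ego at the moment of contact."""
    bearing = math.degrees(math.atan2(rel_y, rel_x)) % 360
    heading = rel_heading_deg % 360
    if bearing < 45 or bearing > 315:  # adversary ahead of the ego
        return "head-on" if 135 < heading < 225 else "rear-end (ego strikes)"
    if 135 < bearing < 225:            # adversary behind the ego
        return "rear-end (ego struck)"
    return "side-swipe (left)" if bearing <= 180 else "side-swipe (right)"
```

Conditioning a trajectory generator on such a discrete label (plus a TTA bucket) is what makes the generation task controllable: the rollout must terminate in the specified spatial configuration.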