Generative AI
Generative artificial intelligence improves projections of climate extremes
Tie, Ruian, Zhong, Xiaohui, Shi, Zhengyu, Li, Hao, Chen, Bin, Liu, Jun, Wu, Libo
Climate change is amplifying extreme weather and climate events worldwide [1]. Anthropogenic greenhouse gas emissions have disrupted the Earth's climate system, driving more frequent and severe heatwaves [2], cold spells [3], heavy precipitation [4], agricultural droughts [5], and tropical cyclones (TCs) [6]. Between 2016 and 2024, daily land temperature records show that extreme heat events occurred over four times more often than expected, while cold records declined by half [7]. These unprecedented shifts threaten human health [8, 9], infrastructure [10, 11], food security [12], biodiversity [13], and global economies [14, 15]. Therefore, reliable climate projections are essential for effective mitigation and adaptation strategies [16-18]. The Coupled Model Intercomparison Project (CMIP) [19] provides a foundation for global climate projections. Since its launch in 1995, CMIP has coordinated systematic evaluation of coupled general circulation models (GCMs). CMIP5 introduced Representative Concentration Pathways (RCPs), while CMIP6 extended this framework by incorporating Shared Socioeconomic Pathways (SSPs) through ScenarioMIP, enabling consistent simulations of emissions and socioeconomic trajectories to 2100 and facilitating integrated assessment of climate risks [20]. These advances have greatly enhanced the scientific and policy relevance of climate projections.
mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules
Edwards, Carl, Han, Chi, Lee, Gawon, Nguyen, Thao, Szymkuć, Sara, Prasad, Chetan Kumar, Jin, Bowen, Han, Jiawei, Diao, Ying, Liu, Ge, Peng, Hao, Grzybowski, Bartosz A., Burke, Martin D., Ji, Heng
Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs propose can often be challenging to make, and are almost never compatible with automated synthesis approaches. To better enable the discovery of functional small molecules, LLMs need to learn a new molecular language that is more effective in predicting properties and inherently synced with automated synthesis technology. Current molecule LLMs are limited by representing molecules based on atoms. In this paper, we argue that just like tokenizing texts into meaning-bearing (sub-)word tokens instead of characters, molecules should be tokenized at the level of functional building blocks, i.e., parts of molecules that bring unique functions and serve as effective building blocks for real-world automated laboratory synthesis. This motivates us to propose mCLM, a modular Chemical-Language Model that comprises a bilingual language model that understands both natural language descriptions of functions and molecular blocks. mCLM front-loads synthesizability considerations while improving the predicted functions of molecules in a principled manner. mCLM, with only 3B parameters, achieves improvements in synthetic accessibility relative to 7 other leading generative AI methods including GPT-5. When tested on 122 out-of-distribution medicines using only building blocks/tokens that are compatible with automated modular synthesis, mCLM outperforms all baselines in property scores and synthetic accessibility. mCLM can also reason on multiple functions and iteratively self-improve to rescue drug candidates that failed late in clinical trials ("fallen angels").
EvoCAD: Evolutionary CAD Code Generation with Vision Language Models
Preintner, Tobias, Yuan, Weixuan, König, Adrian, Bäck, Thomas, Raponi, Elena, van Stein, Niki
Combining large language models (LLMs) with evolutionary computation algorithms represents a promising research direction, pairing the generative and in-context learning capabilities of LLMs with the strengths of evolutionary algorithms. Our method samples multiple CAD objects, which are then optimized using an evolutionary approach with vision language and reasoning language models. We assess our method using GPT-4V and GPT-4o, evaluating it on the CAD-Prompt benchmark dataset and comparing it to prior methods. Additionally, we introduce two new metrics based on topological properties defined by the Euler characteristic, which capture a form of semantic similarity between 3D objects. Our results demonstrate that EvoCAD outperforms previous approaches on multiple metrics, particularly in generating topologically correct objects, which can be efficiently evaluated using our two novel metrics that complement existing spatial metrics. The use of generative AI tools powered by LLMs has transformed the way humans work, create, and develop. However, while significant attention is directed towards textual knowledge tasks, comparatively little focus is devoted to working with symbolic representations, such as those utilized in computer-aided design (CAD). These code-like textual representations, referred to below as CAD code, enable visual assets to be processed by LLMs [21].
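To make the topological quantity concrete, the following minimal sketch computes the Euler characteristic V - E + F of a polygon mesh. It is only an illustration of the underlying invariant: the abstract does not define the two EvoCAD metrics, and the function name `euler_characteristic`, the helper `torus_grid`, and the face-list mesh format are assumptions made for this example.

```python
def euler_characteristic(faces):
    """Return V - E + F for a mesh given as a list of faces.

    Each face is a sequence of vertex indices in boundary order.
    (Hypothetical helper; not taken from the EvoCAD code base.)
    """
    vertices = set()
    edges = set()
    for face in faces:
        vertices.update(face)
        n = len(face)
        for i in range(n):
            a, b = face[i], face[(i + 1) % n]
            # Store each edge once, regardless of direction.
            edges.add((min(a, b), max(a, b)))
    return len(vertices) - len(edges) + len(faces)


def torus_grid(n):
    """Quad mesh of a torus built from an n x n grid with wrap-around (n >= 3)."""
    def vid(i, j):
        return (i % n) * n + (j % n)
    return [
        (vid(i, j), vid(i + 1, j), vid(i + 1, j + 1), vid(i, j + 1))
        for i in range(n)
        for j in range(n)
    ]


# A tetrahedron is topologically a sphere: 4 - 6 + 4 = 2.
tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(euler_characteristic(tetrahedron))    # 2

# A 3 x 3 torus grid has one through-hole: 9 - 18 + 9 = 0.
print(euler_characteristic(torus_grid(3)))  # 0
```

Two objects can occupy nearly the same volume and still differ in Euler characteristic (for example, a solid plate versus a plate with a through-hole), which is the kind of mismatch a purely spatial metric can miss.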
Characterizing Web Search in The Age of Generative AI
Kirsten, Elisabeth, Perdekamp, Jost Grosse, Upadhyay, Mihir, Gummadi, Krishna P., Zafar, Muhammad Bilal
The advent of LLMs has given rise to a new type of web search: Generative search, where LLMs retrieve web pages related to a query and generate a single, coherent text as a response. This output modality stands in stark contrast to traditional web search, where results are returned as a ranked list of independent web pages. In this paper, we ask: Along what dimensions do generative search outputs differ from traditional web search? We compare Google, a traditional web search engine, with four generative search engines from two providers (Google and OpenAI) across queries from four domains. Our analysis reveals intriguing differences. Most generative search engines cover a wider range of sources compared to web search. Generative search engines vary in the degree to which they rely on internal knowledge contained within the model parameters vs. external knowledge retrieved from the web. Generative search engines surface varying sets of concepts, creating new opportunities for enhancing search diversity and serendipity. Our results also highlight the need for revisiting evaluation criteria for web search in the age of Generative AI.
Who are you, ChatGPT? Personality and Demographic Style in LLM-Generated Content
Porat, Dana Sotto, Rabinovich, Ella
Generative large language models (LLMs) have become central to everyday life, producing human-like text across diverse domains. A growing body of research investigates whether these models also exhibit personality- and demographic-like characteristics in their language. In this work, we introduce a novel, data-driven methodology for assessing LLM personality without relying on self-report questionnaires, applying instead automatic personality and gender classifiers to model replies on open-ended questions collected from Reddit. Comparing six widely used models to human-authored responses, we find that LLMs systematically express higher Agreeableness and lower Neuroticism, reflecting cooperative and stable conversational tendencies. Gendered language patterns in model text broadly resemble those of human writers, though with reduced variation, echoing prior findings on automated agents. We contribute a new dataset of human and model responses, along with large-scale comparative analyses, shedding new light on the topic of personality and demographic patterns of generative AI.
Failure Prediction at Runtime for Generative Robot Policies
Römer, Ralf, Kobras, Adrian, Worbis, Luca, Schoellig, Angela P.
Imitation learning (IL) with generative models, such as diffusion and flow matching, has enabled robots to perform complex, long-horizon tasks. However, distribution shifts from unseen environments or compounding action errors can still cause unpredictable and unsafe behavior, leading to task failure. Early failure prediction during runtime is therefore essential for deploying robots in human-centered and safety-critical environments. We propose FIPER, a general framework for Failure Prediction at Runtime for generative IL policies that does not require failure data. FIPER identifies two key indicators of impending failure: (i) out-of-distribution (OOD) observations detected via random network distillation in the policy's embedding space, and (ii) high uncertainty in generated actions measured by a novel action-chunk entropy score. Both failure prediction scores are calibrated using a small set of successful rollouts via conformal prediction. A failure alarm is triggered when both indicators, aggregated over short time windows, exceed their thresholds. We evaluate FIPER across five simulation and real-world environments involving diverse failure modes. Our results demonstrate that FIPER better distinguishes actual failures from benign OOD situations and predicts failures more accurately and earlier than existing methods. We thus consider this work an important step towards more interpretable and safer generative robot policies. Code, data and videos are available at https://tum-lsy.github.io/fiper_website.
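The calibration and alarm logic described above can be sketched as follows. This is a generic split-conformal threshold, not FIPER's implementation: the function names, the use of a window mean as the aggregation, and the default `alpha` are assumptions made for illustration.

```python
import math


def conformal_threshold(calibration_scores, alpha=0.1):
    """Split-conformal threshold from scores observed on successful rollouts.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score,
    or infinity when the calibration set is too small for the requested alpha.
    """
    scores = sorted(calibration_scores)
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return scores[rank - 1]


def failure_alarm(ood_window, entropy_window, ood_threshold, entropy_threshold):
    """Raise an alarm only when BOTH windowed scores exceed their thresholds.

    The window mean is an assumed aggregation; the abstract only states that
    scores are aggregated over short time windows.
    """
    ood_score = sum(ood_window) / len(ood_window)
    entropy_score = sum(entropy_window) / len(entropy_window)
    return ood_score > ood_threshold and entropy_score > entropy_threshold


# Hypothetical calibration scores collected from successful rollouts.
ood_calibration = [0.12, 0.30, 0.18, 0.25, 0.22, 0.15, 0.28, 0.20, 0.17, 0.26]
entropy_calibration = [1.1, 0.9, 1.4, 1.0, 1.3, 1.2, 0.8, 1.5, 1.0, 1.1]

ood_thr = conformal_threshold(ood_calibration, alpha=0.1)
entropy_thr = conformal_threshold(entropy_calibration, alpha=0.1)

# A window with high OOD scores but ordinary entropy does not trigger an alarm.
print(failure_alarm([0.9, 0.8, 0.95], [1.0, 1.1, 0.9], ood_thr, entropy_thr))
```

Requiring both indicators to fire is what lets such a scheme tolerate benign OOD observations: an unfamiliar scene alone is not treated as a failure unless the generated actions are also unusually uncertain.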
Empirical Investigation of Latent Representational Dynamics in Large Language Models: A Manifold Evolution Perspective
This paper introduces the Dynamical Manifold Evolution Theory (DMET), a conceptual framework that models large language model (LLM) generation as a continuous trajectory evolving on a low-dimensional semantic manifold. The theory characterizes latent dynamics through three interpretable metrics (state continuity $C$, attractor compactness $Q$, and topological persistence $P$) that jointly capture the smoothness, stability, and structure of representation evolution. Empirical analyses across multiple Transformer architectures reveal consistent links between these latent dynamics and text quality: smoother trajectories correspond to greater fluency, and richer topological organization correlates with enhanced coherence. Different models exhibit distinct dynamical regimes, reflecting diverse strategies of semantic organization in latent space. Moreover, decoding parameters such as temperature and top-$p$ shape these trajectories in predictable ways, defining a balanced region that harmonizes fluency and creativity. As a phenomenological rather than first-principles framework, DMET provides a unified and testable perspective for interpreting, monitoring, and guiding LLM behavior, offering new insights into the interplay between internal representation dynamics and external text generation quality.
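For readers unfamiliar with the two decoding parameters named above, the sketch below shows the standard way temperature and top-$p$ (nucleus) filtering reshape a next-token distribution. It does not compute the DMET metrics $C$, $Q$, or $P$, whose definitions are not given in the abstract; the function name and the toy logits are assumptions for illustration.

```python
import math


def top_p_distribution(logits, temperature=1.0, top_p=0.9):
    """Return {token_index: probability} after temperature scaling and
    nucleus (top-p) filtering, renormalized over the kept tokens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most probable tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy logits whose softmax at temperature 1 is roughly (0.5, 0.3, 0.2).
logits = [math.log(0.5), math.log(0.3), math.log(0.2)]

# Moderate settings keep the two most likely tokens.
print(sorted(top_p_distribution(logits, temperature=1.0, top_p=0.7)))

# A low temperature sharpens the distribution so only one token survives.
print(sorted(top_p_distribution(logits, temperature=0.1, top_p=0.7)))
```

Lower temperature and smaller top-$p$ both concentrate probability on fewer tokens, which is the mechanism by which these settings would be expected to constrain the generation trajectory.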
Modeling AI-Driven Production and Competitiveness: A Multi-Agent Economic Simulation of China and the United States
Qian, Yuxinyue, Liu, Jun (Beijing University of Posts and Telecommunications; liujun@bupt.edu.cn)
With the rapid development of artificial intelligence (AI) technology, socio-economic systems are entering a new stage of "human-AI co-creation." Building upon a previously established multi-level intelligent agent economic model, this paper conducts simulation-based comparisons of macroeconomic output evolution in China and the United States under different mechanisms: AI collaboration, network effects, and AI autonomous production. The results show that: (1) when AI functions as an independent productive entity, the overall growth rate of social output far exceeds that of traditional human-labor-based models; (2) China demonstrates clear potential for acceleration in both the expansion of intelligent agent populations and the pace of technological catch-up, offering the possibility of achieving technological convergence or even partial surpassing. This study provides a systematic, model-based analytical framework for understanding AI-driven production system transformation and shifts in international competitiveness, as well as quantitative insights for relevant policy formulation. Since the beginning of the 21st century, the rapid evolution of generative artificial intelligence (AI) and autonomous intelligent agents (AI agents) has profoundly reshaped the operating mechanisms of socioeconomic systems. Overall, the United States maintains a significant lead in core model development and capital investment.
DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism
Jiang, Chenyu, Cai, Zhenkun, Tian, Ye, Jia, Zhen, Wang, Yida, Wu, Chuan
Context parallelism has emerged as a key technique to support long-context training, a growing trend in generative AI for modern large models. However, existing context parallel methods rely on static parallelization configurations that overlook the dynamic nature of training data, specifically, the variability in sequence lengths and token relationships (i.e., attention patterns) across samples. As a result, these methods often suffer from unnecessary communication overhead and imbalanced computation. In this paper, we present DCP, a dynamic context parallel training framework that introduces fine-grained blockwise partitioning of both data and computation. By enabling flexible mapping of data and computation blocks to devices, DCP can adapt to varying sequence characteristics, effectively reducing communication and improving memory and computation balance. Micro-benchmarks demonstrate that DCP accelerates attention by 1.19x~2.45x under causal masks and 2.15x~3.77x under sparse attention patterns. Additionally, we observe up to 0.94x~1.16x end-to-end training speed-up for causal masks, and 1.00x~1.46x for sparse masks.
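The computation imbalance mentioned above can be illustrated with a toy workload model. Under a causal mask, query block i attends to key blocks 0 through i, so its attention work grows with i. The sketch compares a static contiguous split with a simple greedy block-to-device assignment. This is not DCP's placement algorithm, which also has to account for communication and memory; the function names and the unit cost model are assumptions.

```python
def causal_block_work(num_blocks):
    """Work units per query block under a causal mask: block i sees i + 1 key blocks."""
    return [i + 1 for i in range(num_blocks)]


def contiguous_loads(work, num_devices):
    """Static split: each device gets an equal-sized contiguous run of blocks.

    Assumes len(work) is divisible by num_devices.
    """
    per_device = len(work) // num_devices
    return [
        sum(work[d * per_device:(d + 1) * per_device])
        for d in range(num_devices)
    ]


def greedy_loads(work, num_devices):
    """Assign blocks largest-first to the currently least-loaded device."""
    loads = [0] * num_devices
    for w in sorted(work, reverse=True):
        target = loads.index(min(loads))
        loads[target] += w
    return loads


work = causal_block_work(8)
print(contiguous_loads(work, 4))  # [3, 7, 11, 15]
print(greedy_loads(work, 4))      # [9, 9, 9, 9]
```

With the contiguous split, the last device does five times the work of the first, so every step waits on it. Blockwise assignment removes that imbalance in this toy case, which is the motivation for mapping fine-grained blocks to devices flexibly.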
Personalized Motion Guidance Framework for Athlete-Centric Coaching
Takamido, Ryota, Suzuki, Chiharu, Nakamoto, Hiroki
A critical challenge in contemporary sports science lies in filling the gap between group-level insights derived from controlled hypothesis-driven experiments and the real-world need for personalized coaching tailored to individual athletes' unique movement patterns. This study developed a Personalized Motion Guidance Framework (PMGF) to enhance athletic performance by generating individualized motion-refinement guides using generative artificial intelligence techniques. PMGF leverages a vertical autoencoder to encode motion sequences into athlete-specific latent representations, which can then be directly manipulated to generate meaningful guidance motions. Two manipulation strategies were explored: (1) smooth interpolation between the learner's motion and a target (e.g., expert) motion to facilitate observational learning, and (2) shifting the motion pattern in an optimal direction in the latent space using a local optimization technique. The results of the validation experiment with data from 51 baseball pitchers revealed that (1) PMGF successfully generated smooth transitions in motion patterns between individuals across all 1,275 pitcher pairs, and (2) the features significantly altered through PMGF manipulations reflected known performance-enhancing characteristics, such as increased stride length and knee extension associated with higher ball velocity, indicating that PMGF induces biomechanically plausible improvements. We propose a future extension called general-PMGF to enhance the applicability of this framework. This extension incorporates bodily, environmental, and task constraints into the generation process, aiming to provide more realistic and versatile guidance across diverse sports contexts.
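The first manipulation strategy, interpolation between a learner's and a target's latent representation, can be sketched as below. The sketch uses plain linear interpolation on toy vectors and omits the encoder and decoder entirely; the function name, the two-dimensional latents, and the choice of linear blending are assumptions, since the abstract only states that the interpolation is smooth. The pair count confirms that 51 pitchers yield the 1,275 unordered pairs reported.

```python
from itertools import combinations


def interpolate_latents(z_learner, z_target, num_steps):
    """Return num_steps latent vectors moving linearly from learner to target.

    num_steps must be at least 2; the first and last vectors are the endpoints.
    Each vector would be passed to a decoder to produce a guidance motion.
    """
    path = []
    for k in range(num_steps):
        alpha = k / (num_steps - 1)
        path.append([
            (1 - alpha) * a + alpha * b
            for a, b in zip(z_learner, z_target)
        ])
    return path


# Toy two-dimensional latents (hypothetical values).
learner = [0.0, 2.0]
expert = [1.0, 0.0]
print(interpolate_latents(learner, expert, 3))
# [[0.0, 2.0], [0.5, 1.0], [1.0, 0.0]]

# Number of unordered pitcher pairs among 51 pitchers.
num_pairs = sum(1 for _ in combinations(range(51), 2))
print(num_pairs)  # 1275
```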