Learning Graphical Models
Learning Tractable Distributions Of Language Model Continuations
Yidou-Weng, Gwen, Li, Ian, Liu, Anji, Broadrick, Oliver, Broeck, Guy Van den, Wang, Benjie
Controlled language generation conditions text on sequence-level constraints (for example, syntax, style, or safety). These constraints may depend on future tokens, which makes directly conditioning an autoregressive language model (LM) generally intractable. Prior work uses tractable surrogates such as hidden Markov models (HMMs) to approximate the distribution over continuations and adjust the model's next-token logits at decoding time. However, we find that these surrogates are often weakly context aware, which reduces query quality. We propose Learning to Look Ahead (LTLA), a hybrid approach that pairs the same base language model for rich prefix encoding with a fixed tractable surrogate model that computes exact continuation probabilities. Two efficiency pitfalls arise when adding neural context: (i) naively rescoring the prefix with every candidate next token requires a sweep over the entire vocabulary at each step, and (ii) predicting fresh surrogate parameters for each prefix, although tractable at a single step, forces recomputation of future probabilities for every new prefix and eliminates reuse. LTLA avoids both by using a single batched HMM update to account for all next-token candidates at once, and by conditioning only the surrogate's latent state prior on the LM's hidden representations while keeping the surrogate decoder fixed, so computations can be reused across prefixes. Empirically, LTLA attains higher conditional likelihood than an unconditional HMM, approximates continuation distributions for vision-language models where a standalone HMM cannot encode visual context, and improves constraint satisfaction at comparable fluency on controlled-generation tasks, with minimal inference overhead.
CARE: Turning LLMs Into Causal Reasoning Expert
Dong, Juncheng, Liu, Yiling, Aloui, Ahmed, Tarokh, Vahid, Carlson, David
Large language models (LLMs) have recently demonstrated impressive capabilities across a range of reasoning and generation tasks. However, research studies have shown that LLMs lack the ability to identify causal relationships, a fundamental cornerstone of human intelligence. We first conduct an exploratory investigation of LLMs' behavior when asked to perform a causal-discovery task and find that they mostly rely on the semantic meaning of variable names, ignoring the observation data. This is unsurprising, given that LLMs were never trained to process structural datasets. To first tackle this challenge, we prompt the LLMs with the outputs of established causal discovery algorithms designed for observational datasets. These algorithm outputs effectively serve as the sufficient statistics of the observation data. However, quite surprisingly, we find that prompting the LLMs with these sufficient statistics decreases the LLMs' performance in causal discovery. To address this current limitation, we propose CARE, a framework that enhances LLMs' causal-reasoning ability by teaching them to effectively utilize the outputs of established causal-discovery algorithms through supervised fine-tuning. Experimental results show that a finetuned Qwen2.5-1.5B model produced by CARE significantly outperforms both traditional causal-discovery algorithms and state-of-the-art LLMs with over a thousand times more parameters, demonstrating effective utilization of its own knowledge and the external algorithmic clues.
I've Changed My Mind: Robots Adapting to Changing Human Goals during Collaboration
Ghose, Debasmita, Gitelson, Oz, Jin, Ryan, Abawe, Grace, Vazquez, Marynel, Scassellati, Brian
I've Changed My Mind: Robots Adapting to Changing Human Goals during Collaboration Abstract --For effective human-robot collaboration, a robot must align its actions with human goals, even as they change mid-task. Prior approaches often assume fixed goals, reducing goal prediction to a one-time inference. However, in real-world scenarios, humans frequently shift goals, making it challenging for robots to adapt without explicit communication. We propose a method for detecting goal changes by tracking multiple candidate action sequences and verifying their plausibility against a policy bank. Upon detecting a change, the robot refines its belief in relevant past actions and constructs Receding Horizon Planning (RHP) trees to actively select actions that assist the human while encouraging Differentiating Actions to reveal their updated goal. We evaluate our approach in a collaborative cooking environment with up to 30 unique recipes and compare it to three comparable human goal prediction algorithms. Our method outperforms all baselines, quickly converging to the correct goal after a switch, reducing task completion time and improving collaboration efficiency. N real-world scenarios, humans often change their goals in response to evolving circumstances, new information, or spontaneous decisions. Previous work often addresses changing human goals by relying on explicit communication [1], [2], [3]. While effective, relying on communication assumes humans can and will communicate with the robot, which is often impractical due to physical, situational, or cognitive constraints [4], [5], [6], [7], [8].
MACIE: Multi-Agent Causal Intelligence Explainer for Collective Behavior Understanding
As Multi Agent Reinforcement Learning systems are used in safety critical applications. Understanding why agents make decisions and how they achieve collective behavior is crucial. Existing explainable AI methods struggle in multi agent settings. They fail to attribute collective outcomes to individuals, quantify emergent behaviors, or capture complex interactions. We present MACIE Multi Agent Causal Intelligence Explainer, a framework combining structural causal models, interventional counterfactuals, and Shapley values to provide comprehensive explanations. MACIE addresses three questions. First, each agent's causal contribution using interventional attribution scores. Second, system level emergent intelligence through synergy metrics separating collective effects from individual contributions. Third, actionable explanations using natural language narratives synthesizing causal insights. We evaluate MACIE across four MARL scenarios: cooperative, competitive, and mixed motive. Results show accurate outcome attribution, mean phi_i equals 5.07, standard deviation less than 0.05, detection of positive emergence in cooperative tasks, synergy index up to 0.461, and efficient computation, 0.79 seconds per dataset on CPU. MACIE uniquely combines causal rigor, emergence quantification, and multi agent support while remaining practical for real time use. This represents a step toward interpretable, trustworthy, and accountable multi agent AI.
How many patients could we save with LLM priors?
Arai, Shota, Selby, David, Vargo, Andrew, Vollmer, Sebastian
Imagine a world where clinical trials need far fewer patients to achieve the same statistical power, thanks to the knowledge encoded in large language models (LLMs). We present a novel framework for hierarchical Bayesian modeling of adverse events in multi-center clinical trials, leveraging LLM-informed prior distributions. Unlike data augmentation approaches that generate synthetic data points, our methodology directly obtains parametric priors from the model. Our approach systematically elicits informative priors for hyperparameters in hierarchical Bayesian models using a pre-trained LLM, enabling the incorporation of external clinical expertise directly into Bayesian safety modeling. Through comprehensive temperature sensitivity analysis and rigorous cross-validation on real-world clinical trial data, we demonstrate that LLM-derived priors consistently improve predictive performance compared to traditional meta-analytical approaches. This methodology paves the way for more efficient and expert-informed clinical trial design, enabling substantial reductions in the number of patients required to achieve robust safety assessment and with the potential to transform drug safety monitoring and regulatory decision making.
From Static to Adaptive Defense: Federated Multi-Agent Deep Reinforcement Learning-Driven Moving Target Defense Against DoS Attacks in UAV Swarm Networks
Zhou, Yuyang, Cheng, Guang, Du, Kang, Chen, Zihan, Qin, Tian, Zhao, Yuyu
Abstract--The proliferation of unmanned aerial vehicles (UA Vs) has enabled a wide range of mission-critical applications and is becoming a cornerstone of low-altitude networks, supporting smart cities, emergency response, and more. However, the open wireless environment, dynamic topology, and resource constraints of UA Vs expose low-altitude networks to severe Denial-of-Service (DoS) threats, undermining their reliability and security. Traditional defense approaches, which rely on fixed configurations or centralized decision-making, cannot effectively respond to the rapidly changing conditions in UA V swarm environments. T o address these challenges, we propose a novel federated multi-agent deep reinforcement learning (FMADRL)- driven moving target defense (MTD) framework for proactive DoS mitigation in low-altitude networks. Specifically, we design lightweight and coordinated MTD mechanisms, including leader switching, route mutation, and frequency hopping, to disrupt attacker efforts and enhance network resilience. The defense problem is formulated as a multi-agent partially observable Markov decision process (POMDP), capturing the uncertain nature of UA V swarms under attack. Each UA V is equipped with a policy agent that autonomously selects MTD actions based on partial observations and local experiences. By employing a policy gradient-based FMADRL algorithm, UA Vs collaboratively optimize their policies via reward-weighted aggregation, enabling distributed learning without sharing raw data and thus reducing communication overhead. Extensive simulations demonstrate that our approach significantly outperforms state-of-the-art baselines, achieving up to a 34.6% improvement in attack mitigation rate, a reduction in average recovery time of up to 94.6%, and decreases in energy consumption and defense cost by as much as 29.3% and 98.3%, respectively, under various DoS attack strategies. These results highlight the potential of intelligent, distributed defense mechanisms to protect low-altitude networks, paving the way for reliable and scalable low-altitude economy. HE rapid development of unmanned aerial vehicle (UA V) technology [1] has enabled a wide range of applications, including environmental monitoring, disaster response, precision agriculture, logistics, aerial photography, and intelligent surveillance [2]. Y uyang Zhou, Guang Cheng, Kang Du, Zihan Chen, Tian Qin, and Y uyu Zhao are with the School of Cyber Science and Engineering, Southeast University, Purple Mountain Laboratories, and Jiangsu Province Engineering Research Center of Security for Ubiquitous Network, Nanjing 211189, China. Guang Cheng is the corresponding author. It is expected to play an increasingly important role in smart cities, emergency management, and next-generation communication infrastructures, forming the backbone of low-altitude networks. Nevertheless, the widespread adoption of UA V swarms also brings new security challenges [7], [8] to low-altitude networks.
A survey of using EHR as real-world evidence for discovering and validating new drug indications
Talukdar, Nabasmita, Zhang, Xiaodan, Paithankar, Shreya, Wang, Hui, Chen, Bin
Electronic Health Records (EHRs) have been increasingly used as real-world evidence (RWE) to support the discovery and validation of new drug indications. This paper surveys current approaches to EHR-based drug repurposing, covering data sources, processing methodologies, and representation techniques. It discusses study designs and statistical frameworks for evaluating drug efficacy. Key challenges in validation are discussed, with emphasis on the role of large language models (LLMs) and target trial emulation. By synthesizing recent developments and methodological advances, this work provides a foundational resource for researchers aiming to translate real-world data into actionable drug-repurposing evidence.
PoE-World: Compositional World Modeling with Products of Programmatic Experts
Piriyakulkij, Wasu Top, Liang, Yichao, Tang, Hao, Weller, Adrian, Kryven, Marta, Ellis, Kevin
Learning how the world works is central to building AI agents that can adapt to complex environments. Traditional world models based on deep learning demand vast amounts of training data, and do not flexibly update their knowledge from sparse observations. Recent advances in program synthesis using Large Language Models (LLMs) give an alternate approach which learns world models represented as source code, supporting strong generalization from little data. To date, application of program-structured world models remains limited to natural language and grid-world domains. We introduce a novel program synthesis method for effectively modeling complex, non-gridworld domains by representing a world model as an exponentially-weighted product of programmatic experts (PoE-World) synthesized by LLMs. We show that this approach can learn complex, stochastic world models from just a few observations. We evaluate the learned world models by embedding them in a model-based planning agent, demonstrating efficient performance and generalization to unseen levels on Atari's Pong and Montezuma's Revenge. We release our code and display the learned world models and videos of the agent's gameplay at https://topwasu.github.io/poe-world.
Bayesian Semiparametric Causal Inference: Targeted Doubly Robust Estimation of Treatment Effects
Sert, Gözde, Chakrabortty, Abhishek, Bhattacharya, Anirban
We propose a semiparametric Bayesian methodology for estimating the average treatment effect (ATE) within the potential outcomes framework using observational data with high-dimensional nuisance parameters. Our method introduces a Bayesian debiasing procedure that corrects for bias arising from nuisance estimation and employs a targeted modeling strategy based on summary statistics rather than the full data. These summary statistics are identified in a debiased manner, enabling the estimation of nuisance bias via weighted observables and facilitating hierarchical learning of the ATE. By combining debiasing with sample splitting, our approach separates nuisance estimation from inference on the target parameter, reducing sensitivity to nuisance model specification. We establish that, under mild conditions, the marginal posterior for the ATE satisfies a Bernstein-von Mises theorem when both nuisance models are correctly specified and remains consistent and robust when only one is correct, achieving Bayesian double robustness. This ensures asymptotic efficiency and frequentist validity. Extensive simulations confirm the theoretical results, demonstrating accurate point estimation and credible intervals with nominal coverage, even in high-dimensional settings. The proposed framework can also be extended to other causal estimands, and its key principles offer a general foundation for advancing Bayesian semiparametric inference more broadly.
Beyond Tsybakov: Model Margin Noise and $\mathcal{H}$-Consistency Bounds
We introduce a new low-noise condition for classification, the Model Margin Noise (MM noise) assumption, and derive enhanced $\mathcal{H}$-consistency bounds under this condition. MM noise is weaker than Tsybakov noise condition: it is implied by Tsybakov noise condition but can hold even when Tsybakov fails, because it depends on the discrepancy between a given hypothesis and the Bayes-classifier rather than on the intrinsic distributional minimal margin (see Figure 1 for an illustration of an explicit example). This hypothesis-dependent assumption yields enhanced $\mathcal{H}$-consistency bounds for both binary and multi-class classification. Our results extend the enhanced $\mathcal{H}$-consistency bounds of Mao, Mohri, and Zhong (2025a) with the same favorable exponents but under a weaker assumption than the Tsybakov noise condition; they interpolate smoothly between linear and square-root regimes for intermediate noise levels. We also instantiate these bounds for common surrogate loss families and provide illustrative tables.