interference
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
However, existing methods face two critical challenges: parameter interference among tasks, which leads to catastrophic forgetting, and limited adaptability to evolving test distributions. To address these issues, we introduce the task of Test-Time Continual Model Merging (TTCMM), which leverages a small set of unlabeled test samples during inference to alleviate parameter conflicts and handle distribution shifts. We propose MINGLE, a novel framework for TTCMM. MINGLE employs a mixture-of-experts architecture with parameter-efficient, low-rank experts, which enhances adaptability to evolving test distributions while dynamically merging models to mitigate conflicts. To further reduce forgetting, we propose Null-Space Constrained Gating, which restricts gating updates to subspaces orthogonal to prior task representations, thereby suppressing activations on old tasks and preserving past knowledge. We further introduce an Adaptive Relaxation Strategy that adjusts constraint strength dynamically based on interference signals observed during test-time adaptation, striking a balance between stability and adaptability. Extensive experiments on standard continual merging benchmarks demonstrate that MINGLE achieves robust generalization, significantly reduces forgetting, and consistently surpasses previous state-of-the-art methods by 7-9% on average across diverse task orders.
Design-Based Bandits Under Network Interference: Trade-Off Between Regret and Statistical Inference
In multi-armed bandits with network interference (MABNI), the action taken by one node can influence the rewards of others, creating complex interdependence. While existing research on MABNI largely concentrates on minimizing regret, it often overlooks the crucial concern that an excessive emphasis on the optimal arm can undermine the inference accuracy for sub-optimal arms. Although initial efforts have been made to address this trade-off in single-unit scenarios, these challenges have become more pronounced in the context of MABNI. In this paper, we establish, for the first time, a theoretical Pareto frontier characterizing the trade-off between regret minimization and inference accuracy in adversarial (design-based) MABNI. We further introduce an anytime-valid asymptotic confidence sequence along with a corresponding algorithm, EXP3-N-CS, specifically designed to balance the trade-off between regret minimization and inference accuracy in this setting.
Large Language Models as Model Organisms for Human Associative Learning
Testing hypotheses on how representational changes occur in biological systems is challenging, but large language models (LLMs) offer a scalable alternative. Building on LLMs' in-context learning, we adapt a cognitive neuroscience associative learning paradigm and investigate how representations evolve across six models. Our initial findings reveal a non-monotonic pattern consistent with the Non-Monotonic Plasticity Hypothesis, with moderately similar items differentiating after learning. Leveraging the controllability of LLMs, we further show that this differentiation is modulated by the overlap of associated items with the broader vocabulary-a factor we term vocabulary interference, capturing how new associations compete with prior knowledge. We find that higher vocabulary interference amplifies differentiation, suggesting that representational change is influenced by both item similarity and global competition.
ABayesian Fast-Slow Framework to Mitigate Interference in Non-Stationary Reinforcement Learning
Given the ever-changing nature of the world and its inhabitants, agents must possess the ability to adapt and evolve over time. Recent research in Given the ever-changing nature of the world and its inhabitants, agents must possess the ability to adapt and evolve over time. Recent research in non-stationary MDPs has focused on addressing this challenge, providing algorithms inspired by task inference techniques. However, these methods ignore the detrimental effects of interference, which particularly harm performance in contradictory tasks, leading to low efficiency in some environments. To address this issue, we propose a Bayesian Fast-Slow Framework (BFSF) that tackles both cross-task generalization and resistance to cross-task interference.
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
Fine-tuning pre-trained models with custom data leads to numerous expert models on specific tasks. Merging models into one universal model to empower multi-task ability refraining from data leakage has gained popularity. With the expansion in data and model size, parameter-efficient tuning becomes the common practice for obtaining task-specific models efficiently. However, few methods are dedicated to efficient merging, and existing methods designed for full fine-tuning merging fail under efficient merging. To address the issue, we analyze from low-rank decomposition and reveal that direction robustness during merging is crucial for merging efficient modules. We furthermore uncover that compensating for the gap between stark singular values contributes to direction robustness. Therefore, we propose RobustMerge, a training-free parameter-efficient merging method with complementary parameter adaptation to maintain direction robustness. Specifically, we (1) prune parameters and scale coefficients from inter-parameter relations for singular values to maintain direction stability away from task interference, and (2) perform cross-task normalization to enhance unseen task generalization. We establish a benchmark consisting of diverse multimodal tasks, on which we conduct experiments to certify the outstanding performance and generalizability of our method.
GST-UNet: ANeural Framework for Spatiotemporal Causal Inference with Time-Varying Confounding
Estimating causal effects from spatiotemporal observational data is essential in public health, environmental science, and policy evaluation, where randomized experiments are often infeasible. Existing approaches, however, either rely on strong structural assumptions or fail to handle key challenges such as interference, spatial confounding, temporal carryover, and time-varying confounding--where covariates are influenced by past treatments and, in turn, affect future ones. We introduce the GST-UNet (G-computation Spatio-Temporal UNet), a theoretically grounded neural framework that combines a U-Net-based spatiotemporal encoder with regression-based iterative G-computation to estimate location-specific potential outcomes under complex intervention sequences. GST-UNet explicitly adjusts for time-varying confounders and captures non-linear spatial and temporal dependencies, enabling valid causal inference from a single observed trajectory in data-scarce settings.
HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
Adaptive cooperation in multi-agent reinforcement learning (MARL) requires policies to express homogeneous, specialised, or mixed behaviours, yet achieving this adaptivity remains a critical challenge. While parameter sharing (PS) is standard for efficient learning, it notoriously suppresses the behavioural diversity required for specialisation. This failure is largely due to cross-agent gradient interference, a problem we find is surprisingly exacerbated by the common practice of . Existing remedies typically add complexity through altered objectives, manual preset diversity levels, or sequential updates -- raising a fundamental question: We propose a solution built on a key insight: an agent-conditioned hypernetwork can generate agent-specific parameters and observation-and agent-conditioned gradients, directly countering the interference from coupling agent IDs with observations.
A Bayesian Fast-Slow Framework to Mitigate Interference in Non-Stationary Reinforcement Learning
Given the ever-changing nature of the world and its inhabitants, agents must possess the ability to adapt and evolve over time. Recent research in Given the ever-changing nature of the world and its inhabitants, agents must possess the ability to adapt and evolve over time. Recent research in non-stationary MDPs has focused on addressing this challenge, providing algorithms inspired by task inference techniques. However, these methods ignore the detrimental effects of interference, which particularly harm performance in contradictory tasks, leading to low efficiency in some environments. To address this issue, we propose a Bayesian Fast-Slow Framework (BFSF) that tackles both cross-task generalization and resistance to cross-task interference. Our framework consists of two components: a'fast' policy, learned from recent data, and a'slow' policy, learned through meta-reinforcement learning (meta-RL) using data from all previous tasks. A Bayesian estimation mechanism determines the current choice of'fast' or'slow' policy, balancing exploration and exploitation. Additionally, in the'fast' policy, we introduce a dual-reset mechanism and a data relabeling technique to further accelerate convergence when encountering new tasks. Experiments demonstrate that our algorithm effectively mitigates interference and outperforms baseline approaches.
Learning to target with network interference
Wang, Xiaomeng, Bastani, Hamsa, Bastani, Osbert, Ren, Zhimei
This paper studies adaptive targeting under network interference in a bandit setting, where treatments applied to one individual may affect others through spillover effects. We consider a linear model in a sparse regime, where each individual's outcome can be affected by at most a few others. We first establish a regret lower bound showing that ignoring the network structure and reducing the problem to a standard linear bandit inevitably leads to inefficient learning, particularly in large populations. To understand how structural information can be leveraged, we analyze regimes with varying levels of knowledge of the interference structure: (1) full support knowledge, (2) knowledge of the column support sizes, and (3) no prior knowledge. For each regime, we establish regret lower bounds characterizing the fundamental limits of learning, and develop algorithms that achieve near-optimal regret. Together, our results provide a unified view of how knowledge of the interference structure governs the efficiency of online learning under interference, and offer practical adaptive targeting algorithms in each setting. Numerical experiments on synthetic and real-world data demonstrate the practical benefits of our algorithms.
Detectability in Diversity: Improved Canary Crafting for Privacy Auditing in One Run
Dagréou, Mathieu, Bellet, Aurélien
Privacy auditing aims to empirically assess privacy leakage in machine learning models using membership inference attacks (MIAs), and to derive lower bounds on differential privacy (DP) parameters. Recent one-run auditing methods address the high cost of standard approaches by relying on a single training run with multiple "canary" points whose inclusion or exclusion must be detected by the auditor. In this work, we study the problem of efficiently crafting canaries for one-run privacy auditing. Motivated by recent theoretical insights suggesting that interference between canaries contributes to weaker leakage estimates compared to multi-run methods, we propose to optimize canaries to be both highly detectable and minimally interfering. Our approach combines a greedy initialization based on influence functions with a bilevel optimization procedure that maximizes distinguishability while promoting diversity in embedding space, enabling the use of computationally efficient bilevel algorithms. Experiments show that our method achieves stronger privacy leakage estimates at a lower computational cost than existing canary crafting approaches.