KAIROS: Scalable Model-Agnostic Data Valuation
Data valuation techniques quantify each training example's contribution to model performance, providing a principled basis for data cleaning, acquisition, and selection. Existing valuation methods remain inadequate: \emph{model-based} techniques depend on a single fitted model and inherit its biases, while \emph{algorithm-based} approaches like Data Shapley scale poorly due to their need to train multiple models. Recent work has proposed model-agnostic alternatives based on Wasserstein distance between the training set and a clean reference set, but exact computation is expensive and approximations often misrank examples. We introduce KAIROS, a model-agnostic framework that values examples by their contribution to the Maximum Mean Discrepancy (MMD) between the training set and a clean reference distribution. Unlike Wasserstein methods, MMD admits a closed-form solution that requires no approximations and is scalable to large datasets. Additionally, KAIROS enables efficient online valuation: adding a new batch of $m$ examples requires only $O(mN)$ computation to update all scores, compared to $O(N^2)$ in prior work, where $N$ is the training set size. Empirical evaluations on noise, mislabeling, and poisoning benchmarks show that KAIROS consistently outperforms state-of-the-art baselines in both accuracy and runtime. On ImageNet, KAIROS achieves up to $15\times$ speedup over the fastest baseline while maintaining superior data valuation quality. Our results demonstrate that model-agnostic methods can match or exceed model-based approaches in performance while scaling to large datasets.
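As a rough illustration of the quantity involved, and not the KAIROS algorithm itself, the squared MMD between a training set and a reference set has a closed-form plug-in estimate under a kernel. The sketch below assumes a Gaussian kernel with a hypothetical bandwidth `sigma`, and scores each training point by a naive leave-one-out change in MMD. That scoring rule and all function names are illustrative assumptions rather than the paper's method.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(train, ref, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    n, m = len(train), len(ref)
    k_tt = sum(gaussian_kernel(a, b, sigma) for a in train for b in train) / (n * n)
    k_rr = sum(gaussian_kernel(a, b, sigma) for a in ref for b in ref) / (m * m)
    k_tr = sum(gaussian_kernel(a, b, sigma) for a in train for b in ref) / (n * m)
    return k_tt + k_rr - 2.0 * k_tr


def leave_one_out_scores(train, ref, sigma=1.0):
    """Illustrative score: MMD^2 without point i minus MMD^2 with all points.

    A negative score means removing the point brings the training set
    closer to the reference set, so the point looks harmful.
    Requires at least two training points.
    """
    full = mmd_squared(train, ref, sigma)
    scores = []
    for i in range(len(train)):
        reduced = train[:i] + train[i + 1:]
        scores.append(mmd_squared(reduced, ref, sigma) - full)
    return scores
```

For example, with `train = [[0.0], [0.1], [5.0]]` and `ref = [[0.0], [0.05], [0.1]]`, the outlier `[5.0]` receives the lowest score. This naive version recomputes the full kernel sums for every point, so it is only meant to make the definition concrete.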
T-norm Selection for Object Detection in Autonomous Driving with Logical Constraints
Integrating logical constraints into object detection models for autonomous driving (AD) is a promising way to enhance their compliance with rules and thereby increase the safety of the system. T-norms have been utilized to calculate the constrained loss, i.e., the violations of logical constraints as losses. While prior works have statically selected a few t-norms, we conduct an extensive experimental study to identify the most effective choices, as suboptimal t-norms can lead to undesired model behavior. To this end, we present MOD-ECL, a neurosymbolic framework that implements a wide range of t-norms and applies them in an adaptive manner. It includes an algorithm that selects well-performing t-norms during training and a scheduler that regulates the impact of the constrained loss. We evaluate its effectiveness on the ROAD-R and ROAD-Waymo-R datasets for object detection in AD, using attached common-sense constraints. Our results show that careful selection of parameters is crucial for effective constrained loss behavior. Moreover, our framework not only reduces constraint violations but also, in some cases, improves detection performance. Additionally, our methods offer fine-grained control over the trade-off between accuracy and constraint violation.
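For readers unfamiliar with t-norms, the sketch below shows three standard ones (Gödel, product, Łukasiewicz) and one way a constraint violation could be turned into a loss. The example constraint, the function names, and the probabilities are hypothetical illustrations, not the MOD-ECL implementation.

```python
def godel(a, b):
    """Goedel (minimum) t-norm."""
    return min(a, b)


def product(a, b):
    """Product t-norm."""
    return a * b


def lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)


# Hypothetical registry so a training loop could switch t-norms by name.
T_NORMS = {"godel": godel, "product": product, "lukasiewicz": lukasiewicz}


def violation_loss(p_a, p_b, t_norm):
    """Loss for an assumed constraint 'not (A and B)'.

    p_a and p_b are predicted probabilities of two labels that should not
    hold together; the t-norm gives the fuzzy truth value of 'A and B',
    which is used directly as the degree of violation.
    """
    return t_norm(p_a, p_b)
```

With `p_a = 0.7` and `p_b = 0.6`, the three t-norms give different penalties for the same prediction, ordered Łukasiewicz ≤ product ≤ Gödel. This difference is why the choice of t-norm can change how strongly the constrained loss pushes the model.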
Bridging Brains and Concepts: Interpretable Visual Decoding from fMRI with Semantic Bottlenecks
Decoding of visual stimuli from noninvasive neuroimaging techniques such as functional magnetic resonance imaging (fMRI) has advanced rapidly in recent years; yet, most high-performing brain decoding models rely on complicated, non-interpretable latent spaces. In this study we present an interpretable brain decoding framework that inserts a semantic bottleneck into BrainDiffuser, a well-established, simple, and linear decoding pipeline. We first produce a $214$-dimensional binary interpretable space $\mathcal{L}$ for images, in which each dimension answers a specific question about the image (e.g., Is there a person?, Is it outdoors?).
MixAT: Combining Continuous and Discrete Adversarial Training for LLMs
Despite recent efforts in Large Language Model (LLM) safety and alignment, current adversarial attacks on frontier LLMs can still consistently force harmful generations. Although adversarial training has been widely studied and shown to significantly improve the robustness of traditional machine learning models, its strengths and weaknesses in the context of LLMs are less understood. Specifically, while existing discrete adversarial attacks are effective at producing harmful content, training LLMs with concrete adversarial prompts is often computationally expensive, leading to reliance on continuous relaxations. At the same time, despite their effectiveness and generalization capabilities, training with continuous perturbations does not always capture the full spectrum of vulnerabilities exploited by discrete attacks. In this work, we aim to bridge this gap by introducing MIXAT, a novel method that combines stronger discrete and faster continuous attacks during training. We rigorously evaluate MIXAT across a wide spectrum of state-of-the-art attacks, proposing the *At Least One Attack Success Rate* (ALO-ASR) metric to capture the worst-case vulnerability of models. We show MIXAT achieves substantially better robustness (ALO-ASR $< 20\%$) compared to prior defenses (ALO-ASR $> 50\%$), while maintaining a runtime comparable to methods based on continuous relaxations. We further analyze MIXAT in realistic deployment settings, exploring how chat templates, quantization, low-rank adapters, and temperature affect both adversarial training and evaluation, revealing additional blind spots in current methodologies. Our results demonstrate that MIXAT's discrete-continuous defense offers a principled and superior robustness-accuracy tradeoff with minimal computational overhead, highlighting its promise for building safer LLMs.
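The ALO-ASR metric can be read as follows: a prompt counts as broken if at least one attack in the suite succeeds on it. The sketch below is a minimal version under that reading. The input layout (one list of per-prompt booleans per attack, aligned by prompt index) and the function name are assumptions for illustration.

```python
def alo_asr(results):
    """At-least-one attack success rate.

    results: dict mapping an attack name to a list of booleans, where
    entry j says whether that attack succeeded on prompt j. All lists
    must cover the same prompts in the same order.
    """
    per_attack = list(results.values())
    if not per_attack or not per_attack[0]:
        raise ValueError("need at least one attack and one prompt")
    num_prompts = len(per_attack[0])
    if any(len(outcomes) != num_prompts for outcomes in per_attack):
        raise ValueError("all attacks must be evaluated on the same prompts")
    # A prompt is broken if any attack succeeded on it.
    broken = sum(1 for outcomes in zip(*per_attack) if any(outcomes))
    return broken / num_prompts
```

Because each prompt is counted once no matter how many attacks break it, ALO-ASR is never lower than the success rate of any single attack in the suite, which is what makes it a worst-case measure.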
Stochastic Gradients under Nuisances
Stochastic gradient optimization is the dominant learning paradigm for a variety of scenarios, from classical supervised learning to modern self-supervised learning. We consider stochastic gradient algorithms for learning problems whose objectives rely on unknown nuisance parameters, and establish non-asymptotic convergence guarantees. Our results show that, while the presence of a nuisance can alter the optimum and upset the optimization trajectory, the classical stochastic gradient algorithm may still converge under appropriate conditions, such as Neyman orthogonality. Moreover, even when Neyman orthogonality is not satisfied, we show that an algorithm variant with approximately orthogonalized updates (with an approximately orthogonalized gradient oracle) may achieve similar convergence rates. Examples from orthogonal statistical learning/double machine learning and causal inference are discussed.
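For reference, Neyman orthogonality is commonly stated as follows in the orthogonal statistical learning literature. The notation is assumed here for illustration and may differ from the paper's: $L$ is the population risk, $\theta^\star$ the target parameter, $g_0$ the true nuisance, and $\mathcal{G}$ the nuisance class.

```latex
D_g \nabla_\theta L(\theta^\star, g_0)[g - g_0] = 0
\quad \text{for all } g \in \mathcal{G},
```

where $D_g$ denotes the directional derivative with respect to the nuisance. Informally, a small error in the estimated nuisance then has no first-order effect on the gradient at the target.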
Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values
The growing interest in employing large language models (LLMs) for decision-making in social and economic contexts has raised questions about their potential to function as agents in these domains. A significant number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, plays a critical role in the desirability of outcomes. In this paper, we examine whether LLM responses adhere to fundamental fairness concepts such as equitability, envy-freeness, and Rawlsian maximin, and investigate their alignment with human preferences. We evaluate the performance of several LLMs, providing a comparative benchmark of their ability to reflect these measures. Our results demonstrate a lack of alignment between current LLM responses and human distributional preferences. Moreover, LLMs are unable to utilize money as a transferable resource to mitigate inequality. Nonetheless, we demonstrate a stark contrast when (some) LLMs are tasked with selecting from a predefined menu of options rather than generating one. In addition, we analyze the robustness of LLM responses to variations in semantic factors (e.g.
FIGRDock: Fast Interaction-Guided Regression for Flexible Docking
Flexible docking, which predicts the binding conformations of both proteins and small molecules by modeling their structural flexibility, plays a vital role in structure-based drug design. Although recent generative approaches, particularly diffusion-based models, have shown promising results, they require iterative sampling to generate candidate structures and depend on separate scoring functions for pose selection. This leads to an inefficient pipeline that is difficult to scale in real-world drug discovery workflows. To overcome these challenges, we introduce FIGRDock, a fast and accurate flexible docking framework that understands complicated interactions between molecules and proteins with a regression-based approach. FIGRDock leverages initial docking poses from conventional tools to distill interaction-aware distance patterns, which serve as explicit structural conditions to directly guide the prediction of the final protein-ligand complex via a regression model. This one-shot inference paradigm enables rapid and precise pose prediction without reliance on multi-step sampling or external scoring stages. Experimental results show that FIGRDock achieves up to $100\times$ faster inference than diffusion-based docking methods, while consistently surpassing them in accuracy across standard benchmarks. These results suggest that FIGRDock has the potential to offer a scalable and efficient solution for flexible docking, advancing the pace of structure-based drug discovery.
Generalizable Reasoning through Compositional Energy Minimization
Generalization is a key challenge in machine learning, specifically in reasoning tasks, where models are expected to solve problems more complex than those encountered during training. Existing approaches typically train reasoning models in an end-to-end fashion, directly mapping input instances to solutions. While this allows models to learn useful heuristics from data, it often results in limited generalization beyond the training distribution. In this work, we propose a novel approach to reasoning generalization by learning energy landscapes over the solution spaces of smaller, more tractable subproblems. At test time, we construct a global energy landscape for a given problem by combining the energy functions of multiple subproblems. This compositional approach enables the incorporation of additional constraints during inference, allowing the construction of energy landscapes for problems of increasing difficulty. To improve the sample quality from this newly constructed energy landscape, we introduce Parallel Energy Minimization (PEM). We evaluate our approach on a wide set of reasoning problems. Our method outperforms existing state-of-the-art methods, demonstrating its ability to generalize to larger and more complex problems.
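The compositional idea can be sketched on a toy graph-coloring problem, where each edge is a subproblem with its own energy and the global energy is their sum. Everything here is an assumption for illustration: the energies are hand-written rather than learned, summation is one simple way to combine them, and brute-force search stands in for PEM, which is not reproduced.

```python
from itertools import product


def edge_energy(colors, edge):
    """Subproblem energy: 1.0 if the edge's endpoints share a color, else 0.0."""
    u, v = edge
    return 1.0 if colors[u] == colors[v] else 0.0


def global_energy(colors, edges):
    """Compose the global landscape by summing the per-edge energies."""
    return sum(edge_energy(colors, edge) for edge in edges)


def minimize(num_nodes, num_colors, edges):
    """Exhaustive minimization over all colorings (toy sizes only)."""
    candidates = product(range(num_colors), repeat=num_nodes)
    best = min(candidates, key=lambda colors: global_energy(colors, edges))
    return best, global_energy(best, edges)
```

Adding an edge to `edges` adds one more term to the global energy, which mirrors how extra constraints can be incorporated at inference time without retraining the per-subproblem energies. For a triangle, `minimize(3, 3, [(0, 1), (1, 2), (0, 2)])` reaches zero energy, while two colors cannot.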
Anatomically inspired digital twins capture hierarchical object representations in visual cortex
Invariant object recognition, the ability to identify objects despite changes in appearance, is a hallmark of visual processing in the brain, yet its understanding remains a central challenge in systems neuroscience. Artificial neural networks trained to predict neural responses to visual stimuli ("digital twins") could provide a powerful framework for studying such complex computations in silico. However, while current models accurately capture single-neuron responses within individual visual areas, their ability to reproduce how populations of neurons represent object identity, and how these representations transform across the cortical hierarchy, remains largely unexplored. Here we examine key functional signatures observed experimentally and find that current models account for hierarchical changes in basic single-neuron properties, such as receptive field size, but fail to capture more complex population-level phenomena, particularly invariant object representations. To address this gap, we introduce a biologically inspired hierarchical readout scheme that mirrors cortical anatomy, modeling each visual area as a projection from a distinct depth within a shared core network. This approach significantly improves the prediction of population-level representational transformations, outperforming standard models that use only the final layer, as well as alternatives with modified architecture, regularization, and loss function. Our results suggest that incorporating anatomical information provides a strong inductive bias in digital twin models, enabling them to better capture general principles of brain function.
Role Bias in Diffusion Models: Diagnosing and Mitigating through Intermediate Decomposition
In this work, we introduce RoleBench, a benchmark focused on evaluating compositional generalization in action-based relations (e.g., mouse chasing cat). We show that state-of-the-art T2I models and compositional generation methods consistently default to frequent reversed relations (i.e., cat chasing mouse), a phenomenon we call role collapse. Related works attribute this to the model's architectural limitations or to underrepresentation in the data. Our key insight is that while models fail on rare compositions when their inversions are common, they can successfully generate similar intermediate compositions (e.g., mouse chasing boy), suggesting that this limitation is also due to the presence of frequent counterparts rather than just the absence of rare compositions. Motivated by this, we hypothesize that directional decomposition can gradually mitigate role collapse. We test this via ReBind, a lightweight framework that teaches role bindings using carefully selected active/passive intermediate compositions. Experiments suggest that simple fine-tuning on intermediate compositions can significantly reduce role collapse, with human evaluators preferring ReBind over state-of-the-art methods in more than 78% of cases. Our findings highlight the role of distributional asymmetries in compositional failures and offer a simple, effective path for improving generalization.