Goto

Collaborating Authors

 tracer


Affine Tracing: A New Paradigm for Probabilistic Linear Solvers

arXiv.org Machine Learning

Probabilistic linear solvers (PLSs) return probability distributions that quantify uncertainty due to limited computation in the solution of linear systems. The literature has traditionally distinguished between Bayesian PLSs, which condition a prior on information obtained from projections of the linear system, and probabilistic iterative methods (PIMs), which lift classical iterative solvers to probability space. In this work we show this dichotomy to be false: Bayesian PLSs are a special case of non-stationary affine PIMs. In addition, we prove that any realistic affine PIM is calibrated. These results motivate a focus on (non-stationary) affine PIMs, but their practical adoption has been limited by the significant manual effort required to implement them. To address this, we introduce affine tracing, an algorithmic framework that automatically constructs a PIM from a standard implementation of an affine iterative method by passing symbolic tracers through the computation to build an affine computational graph. We show how this graph can be transformed to compute posterior covariances, and how equality saturation can be used to perform algebraic simplifications required for computation under specific prior choices. We demonstrate the framework by automatically generating a probabilistic multigrid solver and evaluate its performance in the context of Gaussian process approximation.


Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions

Neural Information Processing Systems

Real-world offline datasets are often subject to data corruptions (such as noise or adversarial attacks) due to sensor failures or malicious attacks. Despite advances in robust offline reinforcement learning (RL), existing methods struggle to learn robust agents under high uncertainty caused by the diverse corrupted data (i.e., corrupted states, actions, rewards, and dynamics), leading to performance degradation in clean environments. To tackle this problem, we propose a novel robust variational Bayesian inference for offline RL (TRACER). It introduces Bayesian inference for the first time to capture the uncertainty via offline data for robustness against all types of data corruptions.




The Missing Parts: Augmenting Fact Verification with Half-Truth Detection

arXiv.org Artificial Intelligence

Fact verification systems typically assess whether a claim is supported by retrieved evidence, assuming that truthfulness depends solely on what is stated. However, many real-world claims are half-truths, factually correct yet misleading due to the omission of critical context. Existing models struggle with such cases, as they are not designed to reason about omitted information. We introduce the task of half-truth detection, and propose PolitiFact-Hidden, a new benchmark with 15k political claims annotated with sentence-level evidence alignment and inferred claim intent. To address this challenge, we present TRACER, a modular re-assessment framework that identifies omission-based misinformation by aligning evidence, inferring implied intent, and estimating the causal impact of hidden content. TRACER can be integrated into existing fact-checking pipelines and consistently improves performance across multiple strong baselines. Notably, it boosts Half-True classification F1 by up to 16 points, highlighting the importance of modeling omissions for trustworthy fact verification. The benchmark and code are available via https://github.com/tangyixuan/TRACER.


Prediction of steady states in a marine ecosystem model by a machine learning technique

arXiv.org Artificial Intelligence

We used precomputed steady states obtained by a spin-up for a global marine ecosystem model as training data to build a mapping from the small number of biogeochemical model parameters onto the three-dimensional converged steady annual cycle. The mapping was performed by a conditional variational autoencoder (CVAE) with mass correction. Applied for test data, we show that the prediction obtained by the CVAE already gives a reasonable good approximation of the steady states obtained by a regular spin-up. However, the predictions do not reach the same level of annual periodicity as those obtained in the original spin-up data. Thus, we took the predictions as initial values for a spin-up. We could show that the number of necessary iterations, corresponding to model years, to reach a prescribed stopping criterion in the spin-up could be significantly reduced compared to the use of the originally uniform, constant initial value. The amount of reduction depends on the applied stopping criterion, measuring the periodicity of the solution. The savings in needed iterations and, thus, computing time for the spin-up ranges from 50 to 95\%, depending on the stopping criterion for the spin-up. We compared these results with the use of the mean of the training data as an initial value. We found that this also accelerates the spin-up, but only by a much lower factor.


Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions

Neural Information Processing Systems

Real-world offline datasets are often subject to data corruptions (such as noise or adversarial attacks) due to sensor failures or malicious attacks. Despite advances in robust offline reinforcement learning (RL), existing methods struggle to learn robust agents under high uncertainty caused by the diverse corrupted data (i.e., corrupted states, actions, rewards, and dynamics), leading to performance degradation in clean environments. To tackle this problem, we propose a novel robust variational Bayesian inference for offline RL (TRACER). It introduces Bayesian inference for the first time to capture the uncertainty via offline data for robustness against all types of data corruptions. Then, to capture such uncertainty, it uses all offline data as the observations to approximate the posterior distribution of the action-value function under a Bayesian inference framework. An appealing feature of TRACER is that it can distinguish corrupted data from clean data using an entropy-based uncertainty measure, since corrupted data often induces higher uncertainty and entropy.


On the Dichotomy Between Privacy and Traceability in $\ell_p$ Stochastic Convex Optimization

arXiv.org Artificial Intelligence

In this paper, we investigate the necessity of memorization in stochastic convex optimization (SCO) under $\ell_p$ geometries. Informally, we say a learning algorithm memorizes $m$ samples (or is $m$-traceable) if, by analyzing its output, it is possible to identify at least $m$ of its training samples. Our main results uncover a fundamental tradeoff between traceability and excess risk in SCO. For every $p\in [1,\infty)$, we establish the existence of a risk threshold below which any sample-efficient learner must memorize a \em{constant fraction} of its sample. For $p\in [1,2]$, this threshold coincides with best risk of differentially private (DP) algorithms, i.e., above this threshold, there are algorithms that do not memorize even a single sample. This establishes a sharp dichotomy between privacy and traceability for $p \in [1,2]$. For $p \in (2,\infty)$, this threshold instead gives novel lower bounds for DP learning, partially closing an open problem in this setup. En route of proving these results, we introduce a complexity notion we term \em{trace value} of a problem, which unifies privacy lower bounds and traceability results, and prove a sparse variant of the fingerprinting lemma.


Remote Manipulation of Multiple Objects with Airflow Field Using Model-Based Learning Control

arXiv.org Artificial Intelligence

Non-contact manipulation is an emerging and highly promising methodology in robotics, offering a wide range of scientific and industrial applications. Among the proposed approaches, airflow stands out for its ability to project across considerable distances and its flexibility in actuating objects of varying materials, sizes, and shapes. However, predicting airflow fields at a distance, as well as the motion of objects within them, remains notoriously challenging due to their nonlinear and stochastic nature. Here, we propose a model-based learning approach using a jet-induced airflow field for remote multi-object manipulation on a surface. Our approach incorporates an analytical model of the field, learned object dynamics, and a model-based controller. The model predicts an air velocity field over an infinite surface for a specified jet orientation, while the object dynamics are learned through a robust system identification algorithm. Using the model-based controller, we can automatically and remotely, at meter-scale distances, control the motion of single and multiple objects for different tasks, such as path-following, aggregating, and sorting.


Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions

arXiv.org Artificial Intelligence

Real-world offline datasets are often subject to data corruptions (such as noise or adversarial attacks) due to sensor failures or malicious attacks. Despite advances in robust offline reinforcement learning (RL), existing methods struggle to learn robust agents under high uncertainty caused by the diverse corrupted data (i.e., corrupted states, actions, rewards, and dynamics), leading to performance degradation in clean environments. To tackle this problem, we propose a novel robust variational Bayesian inference for offline RL (TRACER). It introduces Bayesian inference for the first time to capture the uncertainty via offline data for robustness against all types of data corruptions. Specifically, TRACER first models all corruptions as the uncertainty in the action-value function. Then, to capture such uncertainty, it uses all offline data as the observations to approximate the posterior distribution of the action-value function under a Bayesian inference framework. An appealing feature of TRACER is that it can distinguish corrupted data from clean data using an entropy-based uncertainty measure, since corrupted data often induces higher uncertainty and entropy. Based on the aforementioned measure, TRACER can regulate the loss associated with corrupted data to reduce its influence, thereby enhancing robustness and performance in clean environments. Experiments demonstrate that TRACER significantly outperforms several state-of-the-art approaches across both individual and simultaneous data corruptions.