Goto

Collaborating Authors

 Uncertainty


Semi-Supervised Regression with Heteroscedastic Pseudo-Labels

arXiv.org Artificial Intelligence

Pseudo-labeling is a commonly used paradigm in semi-supervised learning, yet its application to semi-supervised regression (SSR) remains relatively under-explored. Unlike classification, where pseudo-labels are discrete and confidence-based filtering is effective, SSR involves continuous outputs with heteroscedastic noise, making it challenging to assess pseudo-label reliability. As a result, naive pseudo-labeling can lead to error accumulation and overfitting to incorrect labels. To address this, we propose an uncertainty-aware pseudo-labeling framework that dynamically adjusts pseudo-label influence from a bi-level optimization perspective. By jointly minimizing empirical risk over all data and optimizing uncertainty estimates to enhance generalization on labeled data, our method effectively mitigates the impact of unreliable pseudo-labels. We provide theoretical insights and extensive experiments to validate our approach across various benchmark SSR datasets, and the results demonstrate superior robustness and performance compared to existing methods. Our code is available at https://github.com/sxq/Heteroscedastic-Pseudo-Labels.


Recursive Inference for Heterogeneous Multi-Output GP State-Space Models with Arbitrary Moment Matching

arXiv.org Machine Learning

Accurate learning of system dynamics is becoming increasingly crucial for advanced control and decision-making in engineering. However, real-world systems often exhibit multiple channels and highly nonlinear transition dynamics, challenging traditional modeling methods. To enable online learning for these systems, this paper formulates the system as Gaussian process state-space models (GPSSMs) and develops a recursive learning method. The main contributions are threefold. First, a heterogeneous multi-output kernel is designed, allowing each output dimension to adopt distinct kernel types, hyperparameters, and input variables, improving expressiveness in multi-dimensional dynamics learning. Second, an inducing-point management algorithm enhances computational efficiency through independent selection and pruning for each output dimension. Third, a unified recursive inference framework for GPSSMs is derived, supporting general moment matching approaches, including the extended Kalman filter (EKF), unscented Kalman filter (UKF), and assumed density filtering (ADF), enabling accurate learning under strong nonlinearity and significant noise. Experiments on synthetic and real-world datasets show that the proposed method matches the accuracy of SOTA offline GPSSMs with only 1/100 of the runtime, and surpasses SOTA online GPSSMs by around 70% in accuracy under heavy noise while using only 1/20 of the runtime.


Particle Dynamics for Latent-Variable Energy-Based Models

arXiv.org Machine Learning

Latent-variable energy-based models (LV-EBMs) assign a single normalized energy to joint pairs of observed data and latent variables, offering expressive generative modeling while capturing hidden structure. We recast maximum-likelihood training as a saddle problem over distributions on the latent and joint manifolds and view the inner updates as coupled Wasserstein gradient flows. The resulting algorithm alternates overdamped Langevin updates for a joint negative pool and for conditional latent particles with stochastic parameter ascent, requiring no discriminator or auxiliary networks. We prove existence and convergence under standard smoothness and dissi-pativity assumptions, with decay rates in KL divergence and Wasserstein-2 distance. The saddle-point view further yields an ELBO strictly tighter than bounds obtained with restricted amortized posteriors. Our method is evaluated on numerical approximations of physical systems and performs competitively against comparable approaches.


Clarifying the Ti-V Phase Diagram Using First-Principles Calculations and Bayesian Learning

arXiv.org Artificial Intelligence

Conflicting experiments disagree on whether the titanium-vanadium (Ti-V) binary alloy exhibits a body-centred cubic (BCC) miscibility gap or remains completely soluble. A leading hypothesis attributes the miscibility gap to oxygen contamination during alloy preparation. To resolve this disagreement, we use an ab initio + machine-learning workflow that couples an actively-trained Moment Tensor Potential with Bayesian inference of free energy surface. This workflow enables construction of the Ti-V phase diagram across the full composition range with systematically reduced statistical and finite-size errors. The resulting diagram reproduces all experimental features, demonstrating the robustness of our approach, and clearly favors the variant with a BCC miscibility gap terminating at T = 980 K and c = 0.67. Because our simulations model a perfectly oxygen-free Ti-V system, the observed gap cannot originate from impurity effects, in contrast to recent CALPHAD reassessments.


Bayesian Inference for PDE-based Inverse Problems using the Optimization of a Discrete Loss

arXiv.org Artificial Intelligence

Inverse problems are ubiquitous in science, engineering, and medicine, in particular for problems where observations provide only indirect or incomplete information about a system [1]. Inverse problems are central in a wide range of applications such as flow field reconstruction [2, 3, 4], data assimilation [5], medical imaging [6, 7], and parameters estimation of material properties [8, 9, 10]. A particularly challenging class of inverse problems arises when the forward model is governed by ordinary differential equations (ODEs) or partial differential equations (PDEs) [11]. Incorporating physical knowledge through this approach reduces the space of possible solutions, avoiding the need for arbitrary regularization as is often the case in inverse problems [12, 13, 14]. However, this approach can suffer from the high dimensionality of the problem, stiffness, noisy measurements, and sensitivity to parameters. In particular, quantifying the uncertainties of solutions is challenging with standard techniques for inverse PDE problems such as Bayesian inference [15, 14], variational methods [16], ensemble Kalman methods [17], and adjoint-based optimization [18], which can be limited with issues of scalability, robustness, and computational cost. In parallel, operator learning approaches based on DeepONets [19], Fourier neural operators [20], and graph neural networks [21, 22] have been extended to inverse problems and uncertainty quantification [23, 24, 25]. Similar Bayesian techniques rely on training data to build prior knowledge [26]. However, the application of these operator learning techniques to large-scale problems is limited by the cost of their training and the difficulty of generating sufficient high-fidelity data.


ASBI: Leveraging Informative Real-World Data for Active Black-Box Simulator Tuning

arXiv.org Artificial Intelligence

Black-box simulators are widely used in robotics, but optimizing their parameters remains challenging due to inaccessible likelihoods. Simulation-Based Inference (SBI) tackles this issue using simulation-driven approaches, estimating the posterior from offline real observations and forward simulations. However, in black-box scenarios, preparing observations that contain sufficient information for parameter estimation is difficult due to the unknown relationship between parameters and observations. In this work, we present Active Simulation-Based Inference (ASBI), a parameter estimation framework that uses robots to actively collect real-world online data to achieve accurate black-box simulator tuning. Our framework optimizes robot actions to collect informative observations by maximizing information gain, which is defined as the expected reduction in Shannon entropy between the posterior and the prior. While calculating information gain requires the likelihood, which is inaccessible in black-box simulators, our method solves this problem by leveraging Neural Posterior Estimation (NPE), which leverages a neural network to learn the posterior estimator. Three simulation experiments quantitatively verify that our method achieves accurate parameter estimation, with posteriors sharply concentrated around the true parameters. Moreover, we show a practical application using a real robot to estimate the simulation parameters of cubic particles corresponding to two real objects, beads and gravel, with a bucket pouring action.


A simple mean field model of feature learning

arXiv.org Artificial Intelligence

Feature learning (FL), where neural networks adapt their internal representations during training, remains poorly understood. Using methods from statistical physics, we derive a tractable, self-consistent mean-field (MF) theory for the Bayesian posterior of two-layer non-linear networks trained with stochastic gradient Langevin dynamics (SGLD). At infinite width, this theory reduces to kernel ridge regression, but at finite width it predicts a symmetry breaking phase transition where networks abruptly align with target functions. While the basic MF theory provides theoretical insight into the emergence of FL in the finite-width regime, semi-quantitatively predicting the onset of FL with noise or sample size, it substantially underestimates the improvements in generalisation after the transition. We trace this discrepancy to a key mechanism absent from the plain MF description: \textit{self-reinforcing input feature selection}. Incorporating this mechanism into the MF theory allows us to quantitatively match the learning curves of SGLD-trained networks and provides mechanistic insight into FL.


Towards Error Centric Intelligence I, Beyond Observational Learning

arXiv.org Artificial Intelligence

We argue that progress toward AGI is theory limited rather than data or scale limited. Building on the critical rationalism of Popper and Deutsch, we challenge the Platonic Representation Hypothesis. Observationally equivalent worlds can diverge under interventions, so observational adequacy alone cannot guarantee interventional competence. We begin by laying foundations, definitions of knowledge, learning, intelligence, counterfactual competence and AGI, and then analyze the limits of observational learning that motivate an error centric shift. We recast the problem as three questions about how explicit and implicit errors evolve under an agent's actions, which errors are unreachable within a fixed hypothesis space, and how conjecture and criticism expand that space. From these questions we propose Causal Mechanics, a mechanisms first program in which hypothesis space change is a first class operation and probabilistic structure is used when useful rather than presumed. We advance structural principles that make error discovery and correction tractable, including a differential Locality and Autonomy Principle for modular interventions, a gauge invariant form of Independent Causal Mechanisms for separability, and the Compositional Autonomy Principle for analogy preservation, together with actionable diagnostics. The aim is a scaffold for systems that can convert unreachable errors into reachable ones and correct them.


OpenEstimate: Evaluating LLMs on Reasoning Under Uncertainty with Real-World Data

arXiv.org Artificial Intelligence

Real-world settings where language models (LMs) are deployed -- in domains spanning healthcare, finance, and other forms of knowledge work -- require models to grapple with incomplete information and reason under uncertainty. Yet most LM evaluations focus on problems with well-defined answers and success criteria. This gap exists in part because natural problems involving uncertainty are difficult to construct: given that LMs have access to most of the same knowledge as humans, it is non-trivial to design questions for which LMs will struggle to produce correct answers, but which humans can answer reliably. As a result, LM performance on reasoning under uncertainty remains poorly characterized. To address this gap, we introduce OpenEstimate, an extensible, multi-domain benchmark for evaluating LMs on numerical estimation tasks that require models to synthesize significant amounts of background information and express predictions as probabilistic priors. We assess these priors for accuracy and calibration, quantifying their usefulness relative to samples from the true distribution of interest. Across six frontier LMs, we find that LM-elicited priors are often inaccurate and overconfident. Performance improves modestly depending on how uncertainty is elicited from the model, but is largely unaffected by changes in sampling strategy, reasoning effort, or prompt design. The OpenEstimate benchmark thus offers a challenging evaluation for frontier LMs and a platform for developing models that are better at probabilistic estimation and reasoning under uncertainty.


Learning to Answer from Correct Demonstrations

arXiv.org Machine Learning

We study the problem of learning to generate an answer (or completion) to a question (or prompt), where there could be multiple correct answers, any one of which is acceptable at test time. Learning is based on demonstrations of some correct answer to each training question, as in Supervised Fine Tuning (SFT). We formalize the problem as offline imitation learning in contextual bandits, with demonstrations from some optimal policy, without explicitly observed rewards. Prior work assumes that the demonstrator belongs to a low-complexity policy class, which motivates maximum likelihood estimation (i.e., log-loss minimization). In contrast, we propose relying only on the reward model (specifying which answers are correct) being in a low-cardinality class, which we argue is a weaker assumption. We show that likelihood maximization methods can fail in this case, and instead devise an alternative novel approach that learns with sample complexity logarithmic in the cardinality of the reward class. Our work motivates looking beyond likelihood maximization when learning from correct demonstrations.