Goto

Collaborating Authors

 Uncertainty


Latent Adaptive Planner for Dynamic Manipulation

arXiv.org Artificial Intelligence

We present the Latent Adaptive Planner (LAP), a trajectory-level latent-variable policy for dynamic nonprehensile manipulation (e.g., box catching) that formulates planning as inference in a low-dimensional latent space and is learned effectively from human demonstration videos. During execution, LAP achieves real-time adaptation by maintaining a posterior over the latent plan and performing variational replanning as new observations arrive. To bridge the embodiment gap between humans and robots, we introduce a model-based proportional mapping that regenerates accurate kinematic-dynamic joint states and object positions from human demonstrations. Through challenging box catching experiments with varying object properties, LAP demonstrates superior success rates, trajectory smoothness, and energy efficiency by learning human-like compliant motions and adaptive behaviors. Overall, LAP enables dynamic manipulation with real-time adaptation and successfully transfer across heterogeneous robot platforms using the same human demonstration videos.


Learning Diffusion Policies for Robotic Manipulation of Timber Joinery under Fabrication Uncertainty

arXiv.org Artificial Intelligence

Construction uncertainties such as fabrication inaccuracies and material imperfections pose a significant challenge to contact-rich robotic manipulation by hindering precise and robust assembly. In this paper, we explore the performance and robustness of diffusion policy learning as a promising solution for contact-sensitive robotic assembly at construction scale, using timber mortise and tenon joints as a case study. A two-phase study is conducted: first, to evaluate policy performance and applicability; second, to assess robustness in handling fabrication uncertainties simulated as randomized perturbations to the mortise position. The best-performing policy achieved a total average success rate of 75% with perturbations up to 10 mm, including 100% success in unperturbed cases. The results demonstrate the potential of sensory-motor diffusion policies to generalize to a wide range of complex, contact-rich assembly tasks across construction and manufacturing, advancing robotic construction under uncertainty and contributing to safer, more efficient building practices.


DAPS++: Rethinking Diffusion Inverse Problems with Decoupled Posterior Annealing

arXiv.org Machine Learning

From a Bayesian perspective, score-based diffusion solves inverse problems through joint inference, embedding the likelihood with the prior to guide the sampling process. However, this formulation fails to explain its practical behavior: the prior offers limited guidance, while reconstruction is largely driven by the measurement-consistency term, leading to an inference process that is effectively decoupled from the diffusion dynamics. To clarify this structure, we reinterpret the role of diffusion in inverse problem solving as an initialization stage within an expectation--maximization (EM)--style framework, where the diffusion stage and the data-driven refinement are fully decoupled. We introduce \textbf{DAPS++}, which allows the likelihood term to guide inference more directly while maintaining numerical stability and providing insight into why unified diffusion trajectories remain effective in practice. By requiring fewer function evaluations (NFEs) and measurement-optimization steps, \textbf{DAPS++} achieves high computational efficiency and robust reconstruction performance across diverse image restoration tasks.


Feasibility of Embodied Dynamics Based Bayesian Learning for Continuous Pursuit Motion Control of Assistive Mobile Robots in the Built Environment

arXiv.org Artificial Intelligence

Non-invasive electroencephalography (EEG)-based brain-computer interfaces (BCIs) offer an intuitive means for individuals with severe motor impairments to independently operate assistive robotic wheelchairs and navigate built environments. Despite considerable progress in BCI research, most current motion control systems are limited to discrete commands, rather than supporting continuous pursuit, where users can freely adjust speed and direction in real time. Such natural mobility control is, however, essential for wheelchair users to navigate complex public spaces, such as transit stations, airports, hospitals, and indoor corridors, to interact socially with the dynamic populations with agility, and to move flexibly and comfortably as autonomous driving is refined to allow movement at will. In this study, we address the gap of continuous pursuit motion control in BCIs by proposing and validating a brain-inspired Bayesian inference framework, where embodied dynamics in acceleration-based motor representations are decoded. This approach contrasts with conventional kinematics-level decoding and deep learning-based methods. Using a public dataset with sixteen hours of EEG from four subjects performing motor imagery-based target-following, we demonstrate that our method, utilizing Automatic Relevance Determination for feature selection and continual online learning, reduces the normalized mean squared error between predicted and true velocities by 72% compared to autoregressive and EEGNet-based methods in a session-accumulative transfer learning setting. Theoretically, these findings empirically support embodied cognition theory and reveal the brain's intrinsic motor control dynamics in an embodied and predictive nature. Practically, grounding EEG decoding in the same dynamical principles that govern biological motion offers a promising path toward more stable and intuitive BCI control.


ReBaPL: Repulsive Bayesian Prompt Learning

arXiv.org Artificial Intelligence

Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt tuning methods are prone to overfitting and can struggle with out-of-distribution generalization. To address these limitations, Bayesian prompt learning has been proposed, which frames prompt optimization as a Bayesian inference problem to enhance robustness. This paper introduces Repulsive Bayesian Prompt Learning (ReBaPL), a novel method for Bayesian prompt learning, designed to efficiently explore the complex and often multimodal posterior landscape of prompts. Our method integrates a cyclical step-size schedule with a stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm, enabling alternating phases of exploration to discover new modes, and exploitation to refine existing modes. Furthermore, we introduce a repulsive force derived from a potential function over probability metrics (including Maximum Mean Discrepancy and Wasserstein distance) computed on the distributions of representations produced by different prompts. This representation-space repulsion diversifies exploration and prevents premature collapse to a single mode. Our approach allows for a more comprehensive characterization of the prompt posterior distribution, leading to improved generalization. In contrast to prior Bayesian prompt learning methods, our method provides a modular plug-and-play Bayesian extension of any existing prompt learning method based on maximum likelihood estimation. We demonstrate the efficacy of ReBaPL on several benchmark datasets, showing superior performance over state-of-the-art methods for prompt learning.


An Efficient Computational Framework for Discrete Fuzzy Numbers Based on Total Orders

arXiv.org Artificial Intelligence

Discrete fuzzy numbers, and in particular those defined over a finite chain $L_n = \{0, \ldots, n\}$, have been effectively employed to represent linguistic information within the framework of fuzzy systems. Research on total (admissible) orderings of such types of fuzzy subsets, and specifically those belonging to the set $\mathcal{D}_1^{L_n\rightarrow Y_m}$ consisting of discrete fuzzy numbers $A$ whose support is a closed subinterval of the finite chain $L_n = \{0, 1, \ldots, n\}$ and whose membership values $A(x)$, for $x \in L_n$, belong to the set $Y_m = \{ 0 = y_1 < y_2 < \cdots < y_{m-1} < y_m = 1 \}$, has facilitated the development of new methods for constructing logical connectives, based on a bijective function, called $\textit{pos function}$, that determines the position of each $A \in \mathcal{D}_1^{L_n\rightarrow Y_m}$. For this reason, in this work we revisit the problem by introducing algorithms that exploit the combinatorial structure of total (admissible) orders to compute the $\textit{pos}$ function and its inverse with exactness. The proposed approach achieves a complexity of $\mathcal{O}(n^{2} m \log n)$, which is quadratic in the size of the underlying chain ($n$) and linear in the number of membership levels ($m$). The key point is that the dominant factor is $m$, ensuring scalability with respect to the granularity of membership values. The results demonstrate that this formulation substantially reduces computational cost and enables the efficient implementation of algebraic operations -- such as aggregation and implication -- on the set of discrete fuzzy numbers.


Patient-level Information Extraction by Consistent Integration of Textual and Tabular Evidence with Bayesian Networks

arXiv.org Artificial Intelligence

Electronic health records (EHRs) form an invaluable resource for training clinical decision support systems. To leverage the potential of such systems in high-risk applications, we need large, structured tabular datasets on which we can build transparent feature-based models. While part of the EHR already contains structured information (e.g. diagnosis codes, medications, and lab results), much of the information is contained within unstructured text (e.g. discharge summaries and nursing notes). In this work, we propose a method for multi-modal patient-level information extraction that leverages both the tabular features available in the patient's EHR (using an expert-informed Bayesian network) as well as clinical notes describing the patient's symptoms (using neural text classifiers). We propose the use of virtual evidence augmented with a consistency node to provide an interpretable, probabilistic fusion of the models' predictions. The consistency node improves the calibration of the final predictions compared to virtual evidence alone, allowing the Bayesian network to better adjust the neural classifier's output to handle missing information and resolve contradictions between the tabular and text data. We show the potential of our method on the SimSUM dataset, a simulated benchmark linking tabular EHRs with clinical notes through expert knowledge.


Comparing verbal, visual and combined explanations for Bayesian Network inferences

arXiv.org Artificial Intelligence

Bayesian Networks (BNs) are an important tool for assisting probabilistic reasoning, but despite being considered transparent models, people have trouble understanding them. Further, current User Interfaces (UIs) still do not clarify the reasoning of BNs. To address this problem, we have designed verbal and visual extensions to the standard BN UI, which can guide users through common inference patterns. We conducted a user study to compare our verbal, visual and combined UI extensions, and a baseline UI. Our main findings are: (1) users did better with all three types of extensions than with the baseline UI for questions about the impact of an observation, the paths that enable this impact, and the way in which an observation influences the impact of other observations; and (2) using verbal and visual modalities together is better than using either modality alone for some of these question types.


Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding

arXiv.org Artificial Intelligence

How do Large Language Models understand moral dimensions compared to humans? This first large-scale Bayesian evaluation of market-leading language models provides the answer. In contrast to prior work using deterministic ground truth (majority or inclusion rules), we model annotator disagreements to capture both aleatoric uncertainty (inherent human disagreement) and epistemic uncertainty (model domain sensitivity). We evaluated the best language models (Claude Sonnet 4, DeepSeek-V3, Llama 4 Maverick) across 250K+ annotations from nearly 700 annotators in 100K+ texts spanning social networks, news and forums. Our GPU-optimized Bayesian framework processed 1M+ model queries, revealing that AI models typically rank among the top 25\% of human annotators, performing much better than average balanced accuracy. Importantly, we find that AI produces far fewer false negatives than humans, highlighting their more sensitive moral detection capabilities.


Estimating Bidirectional Causal Effects with Large Scale Online Kernel Learning

arXiv.org Machine Learning

In this study, a scalable online kernel learning framework is proposed for estimating bidirectional causal effects in systems characterized by mutual dependence and heteroskedasticity. Traditional causal inference often focuses on unidirectional effects, overlooking the common bidirectional relationships in real-world phenomena. Building on heteroskedasticity-based identification, the proposed method integrates a quasi-maximum likelihood estimator for simultaneous equation models with large scale online kernel learning. It employs random Fourier feature approximations to flexibly model nonlinear conditional means and variances, while an adaptive online gradient descent algorithm ensures computational efficiency for streaming and high-dimensional data. Results from extensive simulations demonstrate that the proposed method achieves superior accuracy and stability than single equation and polynomial approximation baselines, exhibiting lower bias and root mean squared error across various data-generating processes. These results confirm that the proposed approach effectively captures complex bidirectional causal effects with near-linear computational scaling. By combining econometric identification with modern machine learning techniques, the proposed framework offers a practical, scalable, and theoretically grounded solution for large scale causal inference in natural/social science, policy making, business, and industrial applications.