 Bayesian Learning


Sequential Manipulation Against Rank Aggregation: Theory and Algorithm

arXiv.org Artificial Intelligence

Rank aggregation with pairwise comparisons is widely encountered in sociology, politics, economics, psychology, sports, etc. Given the enormous social impact and the consequent incentives, a potential adversary has a strong motivation to manipulate the ranking list. However, existing methods assume an ideal attack opportunity and excessive adversarial capability, which makes them impractical. To fully explore the potential risks, we leverage an online attack on the vulnerable data collection process. Since it is independent of rank aggregation and lacks effective protection mechanisms, we disrupt the data collection process by fabricating pairwise comparisons without knowledge of the future data or the true distribution. From the game-theoretic perspective, the confrontation scenario between the online manipulator and the ranker who takes control of the original data source is formulated as a distributionally robust game that deals with the uncertainty of knowledge. Then we demonstrate that the equilibrium in the above game is potentially favorable to the adversary by analyzing the vulnerability of sampling algorithms such as the Bernoulli and reservoir methods. According to the above theoretical analysis, different sequential manipulation policies are proposed under a Bayesian decision framework and a large class of parametric pairwise comparison models. For attackers with complete knowledge, we establish the asymptotic optimality of the proposed policies. To increase the success rate of the sequential manipulation with incomplete knowledge, a distributionally robust estimator, which replaces the maximum likelihood estimation in a saddle point problem, provides a conservative data generation solution. Finally, the corroborating empirical evidence shows that the proposed method manipulates the results of rank aggregation methods in a sequential manner.
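For readers unfamiliar with the two sampling schemes named in the abstract, the following is a minimal sketch of Bernoulli sampling and reservoir sampling (the classic Algorithm R) over a stream of pairwise comparisons. It is not taken from the paper: the function names, the encoding of a comparison as a `(winner, loser)` tuple, and the parameter values are all assumptions made for illustration.

```python
import random


def bernoulli_sample(stream, p, rng):
    """Keep each streamed item independently with probability p."""
    return [item for item in stream if rng.random() < p]


def reservoir_sample(stream, k, rng):
    """Algorithm R: uniform sample of k items from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: every ordered pair (winner, loser) over 5 candidates.
rng = random.Random(0)
comparisons = [(a, b) for a in range(5) for b in range(5) if a != b]
kept_bernoulli = bernoulli_sample(comparisons, 0.3, rng)
kept_reservoir = reservoir_sample(comparisons, 5, rng)
```

Both schemes decide whether to keep an item without looking at its content, which is why fabricated comparisons injected into the stream have the same chance of being retained as genuine ones.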


Predicting Trust Dynamics with Dynamic SEM in Human-AI Cooperation

arXiv.org Artificial Intelligence

Humans' trust in AI constitutes a pivotal element in fostering a synergistic relationship between humans and AI. This is particularly significant in the context of systems that leverage AI technology, such as autonomous driving systems and human-robot interaction. Trust facilitates appropriate utilization of these systems, thereby optimizing their potential benefits. If humans over-trust or under-trust an AI, serious problems such as misuse and accidents can occur. To prevent over/under-trust, it is necessary to predict trust dynamics. However, trust is an internal state of humans and hard to directly observe. Therefore, we propose a prediction model for trust dynamics using dynamic structural equation modeling, which extends SEM to handle time-series data. A path diagram, which shows causalities between variables, is developed in an exploratory way and the resultant path diagram is optimized for effective path structures. Over/under-trust was predicted with 90% accuracy in a drone simulator task, and with 99% accuracy in an autonomous driving task. These results show that our proposed method outperformed conventional methods, including the autoregression family.


Adaptive RKHS Fourier Features for Compositional Gaussian Process Models

arXiv.org Machine Learning

Gaussian Processes (GPs) provide a principled Bayesian framework for function approximation, making them particularly useful in many applications requiring uncertainty calibration [Rasmussen and Williams, 2006], such as Bayesian optimisation [Snoek et al., 2012] and time-series analysis [Roberts et al., 2013]. Despite offering reasonable uncertainty estimation, shallow GPs often struggle to model complex, non-stationary processes present in practical applications. To overcome this limitation, Deep Gaussian Processes (DGPs) employ a compositional architecture by stacking multiple GP layers, thereby enhancing representational power while preserving the model's intrinsic capability to quantify uncertainty [Damianou and Lawrence, 2013]. However, the conventional variational formulation of DGPs heavily depends on local inducing point approximations across intermediate GP layers [Titsias, 2009, Salimbeni and Deisenroth, 2017], which hinder the model from capturing the global structures commonly found in real-world scenarios. Incorporating Fourier features into GP models has shown promise in addressing this challenge in GP inference due to the periodic nature of these features. A line of research uses Random Fourier Features (RFFs) [Rahimi and Recht, 2007] of stationary kernels to convert the original (deep) GPs into Bayesian networks in weight space [Lázaro-Gredilla et al., 2010, Gal and Turner, 2015, Cutajar et al., 2017]. Building on this concept within a sparse variational GP framework, recent advancements in inter-domain GPs [Lázaro-Gredilla and Figueiras-Vidal, 2009a, Van der Wilk et al., 2020] directly approximate the posterior of the original GPs by introducing fixed Variational Fourier Features (VFFs) through process projection onto a Reproducing Kernel Hilbert Space (RKHS) [Hensman et al., 2018, Rudner et al., 2020]. VFFs are derived by projecting GPs onto a different domain.
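As background for the Random Fourier Features mentioned above, here is a minimal sketch of the standard construction for a squared-exponential (RBF) kernel: frequencies are drawn from a Gaussian with standard deviation 1/lengthscale, phases are uniform on [0, 2π], and the inner product of the resulting cosine features approximates the kernel. This illustrates plain RFFs only, not the adaptive RKHS features the paper proposes; the helper names, the test points, and the choice of 5000 features are assumptions.

```python
import math
import random


def make_rff(dim, num_features, lengthscale, rng):
    """Draw random frequencies and phases for an RBF kernel feature map."""
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        # phi_j(x) = sqrt(2 / D) * cos(w_j . x + b_j)
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return features


def rbf_kernel(x, y, lengthscale):
    """Exact kernel k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


rng = random.Random(0)
phi = make_rff(dim=2, num_features=5000, lengthscale=1.0, rng=rng)
x, y = [0.3, -0.2], [0.1, 0.5]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))  # phi(x) . phi(y)
exact = rbf_kernel(x, y, 1.0)
print(f"RFF approximation: {approx:.3f}, exact kernel: {exact:.3f}")
```

The approximation error shrinks roughly as one over the square root of the number of features, so the feature count trades accuracy against cost.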


Using Voice and Biofeedback to Predict User Engagement during Product Feedback Interviews

arXiv.org Artificial Intelligence

Capturing users' engagement is crucial for gathering feedback about the features of a software product. In a market-driven context, current approaches to collect and analyze users' feedback are based on techniques leveraging information extracted from product reviews and social media. These approaches are hardly applicable in bespoke software development, or in contexts in which one needs to gather information from specific users. In such cases, companies need to resort to face-to-face interviews to get feedback on their products. In this paper, we propose to utilize biometric data, in terms of physiological and voice features, to complement interviews with information about the engagement of the user on the discussed product-relevant topics. We evaluate our approach by interviewing users while gathering their physiological data (i.e., biofeedback) using an Empatica E4 wristband, and capturing their voice through the default audio-recorder of a common laptop. Our results show that we can predict users' engagement by training supervised machine learning algorithms on biometric data (F1=0.72), and that voice features alone are sufficiently effective (F1=0.71). Our work contributes one of the first studies in requirements engineering in which biometrics are used to identify emotions. This is also the first study in software engineering that considers voice analysis. The usage of voice features could be particularly helpful for emotion-aware requirements elicitation in remote communication, either performed by human analysts or voice-based chatbots, and can also be exploited to support the analysis of meetings in software engineering research.


Neural Conditional Probability for Inference

arXiv.org Machine Learning

We introduce NCP (Neural Conditional Probability), a novel operator-theoretic approach for learning conditional distributions with a particular focus on inference tasks. NCP can be used to build conditional confidence regions and extract important statistics like conditional quantiles, mean, and covariance. It offers streamlined learning through a single unconditional training phase, facilitating efficient inference without the need for retraining even when conditioning changes. By tapping into the powerful approximation capabilities of neural networks, our method efficiently handles a wide variety of complex probability distributions, effectively dealing with nonlinear relationships between input and output variables. Theoretical guarantees ensure both optimization consistency and statistical accuracy of the NCP method. Our experiments show that our approach matches or beats leading methods using a simple Multi-Layer Perceptron (MLP) with two hidden layers and GELU activations. This demonstrates that a minimalistic architecture with a theoretically grounded loss function can achieve competitive results without sacrificing performance, even in the face of more complex architectures.


Energy-based Epistemic Uncertainty for Graph Neural Networks

arXiv.org Machine Learning

In domains with interdependent data, such as graphs, quantifying the epistemic uncertainty of a Graph Neural Network (GNN) is challenging as uncertainty can arise at different structural scales. Existing techniques neglect this issue or only distinguish between structure-aware and structure-agnostic uncertainty without combining them into a single measure. We propose GEBM, an energy-based model (EBM) that provides high-quality uncertainty estimates by aggregating energy at different structural levels that naturally arise from graph diffusion. In contrast to logit-based EBMs, we provably induce an integrable density in the data space by regularizing the energy function. We introduce an evidential interpretation of our EBM that significantly improves the predictive robustness of the GNN. Our framework is a simple and effective post hoc method applicable to any pre-trained GNN that is sensitive to various distribution shifts. It consistently achieves the best separation of in-distribution and out-of-distribution data on 6 out of 7 anomaly types while having the best average rank over shifts on all datasets.
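For context on the logit-based EBMs that the abstract contrasts with, a commonly used energy is the negative log-sum-exp of a classifier's logits, where higher energy is read as higher uncertainty. The sketch below shows only that baseline quantity; it does not reproduce GEBM's multi-scale aggregation or its regularization, and the function name, temperature parameter, and example logits are assumptions.

```python
import math


def logit_energy(logits, temperature=1.0):
    """E(x) = -T * log(sum_c exp(logit_c / T)), computed in a stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


# Hypothetical logits for one confidently classified node and one ambiguous node.
confident = logit_energy([8.0, 0.5, -1.0])
uncertain = logit_energy([0.2, 0.1, 0.0])
print(f"confident: {confident:.3f}, uncertain: {uncertain:.3f}")
```

A large dominant logit drives the energy strongly negative, while flat, small logits leave it close to zero.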


Statistical signatures of abstraction in deep neural networks

arXiv.org Machine Learning

We study how abstract representations emerge in a Deep Belief Network (DBN) trained on benchmark datasets. Our analysis targets the principles of learning in the early stages of information processing, starting from the "primordial soup" of the under-sampling regime. As the data is processed by deeper and deeper layers, features are detected and removed, transferring more and more "context-invariant" information to deeper layers. We show that the representation approaches a universal model -- the Hierarchical Feature Model (HFM) -- determined by the principle of maximal relevance. Relevance quantifies the uncertainty on the model of the data, thus suggesting that "meaning" -- i.e. syntactic information -- is that part of the data which is not yet captured by a model. Our analysis shows that shallow layers are well described by pairwise Ising models, which provide a representation of the data in terms of generic, low order features. We also show that plasticity increases with depth, in a similar way as it does in the brain. These findings suggest that DBNs are capable of extracting a hierarchy of features from the data which is consistent with the principle of maximal relevance.


Bayesian Entropy Neural Networks for Physics-Aware Prediction

arXiv.org Machine Learning

This paper addresses the need for deep learning models to integrate well-defined constraints into their outputs, driven by their application in surrogate models, learning with limited data and partial information, and scenarios requiring flexible model behavior to incorporate non-data sample information. We introduce Bayesian Entropy Neural Networks (BENN), a framework grounded in Maximum Entropy (MaxEnt) principles, designed to impose constraints on Bayesian Neural Network (BNN) predictions. BENN is capable of constraining not only the predicted values but also their derivatives and variances, ensuring a more robust and reliable model output. To achieve simultaneous uncertainty quantification and constraint satisfaction, we employ the method of multipliers approach. This allows for the concurrent estimation of neural network parameters and the Lagrangian multipliers associated with the constraints. Our experiments, spanning diverse applications such as beam deflection modeling and microstructure generation, demonstrate the effectiveness of BENN. The results highlight significant improvements over traditional BNNs and showcase competitive performance relative to contemporary constrained deep learning methods.
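To make the method of multipliers concrete, here is a minimal sketch on a toy problem: minimize x² + y² subject to x + y = 1. Each outer step minimizes the augmented Lagrangian over the primal variables by gradient descent and then updates the multiplier from the remaining constraint violation. The toy objective, the penalty weight, the step size, and the iteration counts are all assumptions for illustration; in BENN the primal variables would be network parameters rather than two scalars.

```python
def method_of_multipliers(rho=10.0, outer_steps=20, inner_steps=200, lr=0.05):
    """Solve min x^2 + y^2 s.t. x + y = 1 with an augmented Lagrangian.

    L(x, y, lam) = x^2 + y^2 + lam * c + (rho / 2) * c^2, with c = x + y - 1.
    """
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(outer_steps):
        # Inner loop: gradient descent on L over (x, y) with lam held fixed.
        for _ in range(inner_steps):
            c = x + y - 1.0
            grad_x = 2.0 * x + lam + rho * c
            grad_y = 2.0 * y + lam + rho * c
            x -= lr * grad_x
            y -= lr * grad_y
        # Outer step: move the multiplier along the constraint violation.
        lam += rho * (x + y - 1.0)
    return x, y, lam


x_opt, y_opt, lam_opt = method_of_multipliers()
print(f"x={x_opt:.4f}, y={y_opt:.4f}, multiplier={lam_opt:.4f}")
```

The analytic solution is x = y = 0.5 with multiplier -1, which follows from the stationarity condition 2x + λ = 0 at the constrained optimum.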


Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization

arXiv.org Artificial Intelligence

Black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle. This process demands sample-efficient optimization due to the high computational cost of function evaluations. While prior studies focus on forward approaches to learn surrogates for the unknown objective function, they struggle with high-dimensional inputs where valid inputs form a small subspace (e.g., valid protein sequences), which is common in real-world tasks. Recently, diffusion models have demonstrated impressive capability in learning the high-dimensional data manifold. They have shown promising performance in black-box optimization tasks but only in offline settings. In this work, we propose diffusion-based inverse modeling for black-box optimization (Diff-BBO), the first inverse approach leveraging diffusion models for online BBO problems. Diff-BBO distinguishes itself from forward approaches through the design of its acquisition function. Instead of proposing candidates in the design space, Diff-BBO employs a novel acquisition function, Uncertainty-aware Exploration (UaE), to propose objective function values, which leverages the uncertainty of a conditional diffusion model to generate samples in the design space. Theoretically, we prove that using UaE leads to optimal optimization outcomes. Empirically, we redesign experiments on the Design-Bench benchmark for online settings and show that Diff-BBO achieves state-of-the-art performance.


DADEE: Well-calibrated uncertainty quantification in neural networks for barriers-based robot safety

arXiv.org Artificial Intelligence

Uncertainty-aware controllers that guarantee safety are essential for safety-critical applications. Among such controllers, Control Barrier Function (CBF)-based approaches are popular because they are fast, yet safe. However, most such works depend on Gaussian Processes (GPs) or MC-Dropout for learning and uncertainty estimation, and both approaches come with drawbacks: GPs are non-parametric methods that are slow, while MC-Dropout does not capture aleatoric uncertainty. On the other hand, modern Bayesian learning algorithms have shown promise in uncertainty quantification. The application of modern Bayesian learning methods to CBF-based controllers has not yet been studied. We aim to fill this gap by surveying uncertainty quantification algorithms and evaluating them on CBF-based safe controllers. We find that model variance-based algorithms (for example, Deep Ensembles, MC-Dropout, etc.) and direct estimation-based algorithms (such as DEUP) have complementary strengths. Algorithms in the former category can only estimate uncertainty accurately out-of-domain, while those in the latter category can only do so in-domain. We combine the two approaches to obtain more accurate uncertainty estimates both in- and out-of-domain. As measured by the failure rate of a simulated robot, this results in a safer CBF-based robot controller.
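As a rough illustration of what a model variance-based estimate looks like, the sketch below treats the spread of predictions across ensemble members as the uncertainty signal. It is not the DADEE method and does not include the direct-estimation component; the three linear "members", the query points, and the function name are hypothetical stand-ins for trained networks.

```python
from statistics import fmean, pvariance


def ensemble_predict(models, x):
    """Return the ensemble mean and the variance across member predictions."""
    preds = [model(x) for model in models]
    return fmean(preds), pvariance(preds)


# Hypothetical ensemble: members nearly agree close to x = 0 and diverge far away.
models = [lambda x, slope=slope: slope * x for slope in (0.9, 1.0, 1.1)]

mean_in, var_in = ensemble_predict(models, 0.1)     # near the assumed training region
mean_out, var_out = ensemble_predict(models, 10.0)  # far from it
print(f"in-domain variance: {var_in:.5f}, out-of-domain variance: {var_out:.5f}")
```

In this toy setup the variance grows with distance from the region where the members agree, which matches the abstract's point that variance-based estimates are informative mainly out-of-domain.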