Bayesian Inference
Efficient Membership Inference Attacks by Bayesian Neural Network
Liu, Zhenlong, Jiang, Wenyu, Zhou, Feng, Wei, Hongxin
Membership Inference Attacks (MIAs) aim to estimate whether a specific data point was used in the training of a given model. Previous attacks often utilize multiple reference models to approximate the conditional score distribution, leading to significant computational overhead. While recent work leverages quantile regression to estimate conditional thresholds, it fails to capture epistemic uncertainty, resulting in bias in low-density regions. In this work, we propose a novel approach - Bayesian Membership Inference Attack (BMIA), which performs conditional attack through Bayesian inference. In particular, we transform a trained reference model into Bayesian neural networks by Laplace approximation, enabling the direct estimation of the conditional score distribution by probabilistic model parameters. Our method addresses both epistemic and aleatoric uncertainty with only a reference model, enabling efficient and powerful MIA. Extensive experiments on five datasets demonstrate the effectiveness and efficiency of BMIA.
Learning and planning for optimal synergistic human-robot coordination in manufacturing contexts
Sandrini, Samuele, Faroni, Marco, Pedrocchi, Nicola
Collaborative robotics cells leverage heterogeneous agents to provide agile production solutions. Effective coordination is essential to prevent inefficiencies and risks for human operators working alongside robots. This paper proposes a human-aware task allocation and scheduling model based on Mixed Integer Nonlinear Programming to optimize efficiency and safety starting from task planning stages. The approach exploits synergies that encode the coupling effects between pairs of tasks executed in parallel by the agents, arising from the safety constraints imposed on robot agents. These terms are learned from previous executions using a Bayesian estimation; the inference of the posterior probability distribution of the synergy coefficients is performed using the Markov Chain Monte Carlo method. The synergy enhances task planning by adapting the nominal duration of the plan according to the effect of the operator's presence. Simulations and experimental results demonstrate that the proposed method produces improved human-aware task plans, reducing unuseful interference between agents, increasing human-robot distance, and achieving up to an 18\% reduction in process execution time.
How Well Can Differential Privacy Be Audited in One Run?
Keinan, Amit, Shenfeld, Moshe, Ligett, Katrina
Recent methods for auditing the privacy of machine learning algorithms have improved computational efficiency by simultaneously intervening on multiple training examples in a single training run. Steinke et al. (2024) prove that one-run auditing indeed lower bounds the true privacy parameter of the audited algorithm, and give impressive empirical results. Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that one-run auditing can only perfectly uncover the true privacy parameters of algorithms whose structure allows the effects of individual data elements to be isolated. Our characterization helps reveal how and when one-run auditing is still a promising technique for auditing real machine learning algorithms, despite these fundamental gaps.
Uncertainty quantification and posterior sampling for network reconstruction
Network reconstruction is the task of inferring the unseen interactions between elements of a system, based only on their behavior or dynamics. This inverse problem is in general ill-posed, and admits many solutions for the same observation. Nevertheless, the vast majority of statistical methods proposed for this task -- formulated as the inference of a graphical generative model -- can only produce a ``point estimate,'' i.e. a single network considered the most likely. In general, this can give only a limited characterization of the reconstruction, since uncertainties and competing answers cannot be conveyed, even if their probabilities are comparable, while being structurally different. In this work we present an efficient MCMC algorithm for sampling from posterior distributions of reconstructed networks, which is able to reveal the full population of answers for a given reconstruction problem, weighted according to their plausibilities. Our algorithm is general, since it does not rely on specific properties of particular generative models, and is specially suited for the inference of large and sparse networks, since in this case an iteration can be performed in time $O(N\log^2 N)$ for a network of $N$ nodes, instead of $O(N^2)$, as would be the case for a more naive approach. We demonstrate the suitability of our method in providing uncertainties and consensus of solutions (which provably increases the reconstruction accuracy) in a variety of synthetic and empirical cases.
Sequential Function-Space Variational Inference via Gaussian Mixture Approximation
Zhu, Menghao Waiyan William, Hao, Pengcheng, Kuruoฤlu, Ercan Engin
Continual learning is learning from a sequence of tasks with the aim of learning new tasks without forgetting old tasks. Sequential function-space variational inference (SFSVI) is a continual learning method based on variational inference which uses a Gaussian variational distribution to approximate the distribution of the outputs of a finite number of selected inducing points. Since the posterior distribution of a neural network is multi-modal, a Gaussian distribution could only match one mode of the posterior distribution, and a Gaussian mixture distribution could be used to better approximate the posterior distribution. We propose an SFSVI method which uses a Gaussian mixture variational distribution. We also compare different types of variational inference methods with and without a fixed pre-trained feature extractor. We find that in terms of final average accuracy, Gaussian mixture methods perform better than Gaussian methods and likelihood-focused methods perform better than prior-focused methods.
Learning Energy-Based Models by Self-normalising the Likelihood
Senetaire, Hugo, Jeha, Paul, Mattei, Pierre-Alexandre, Frellsen, Jes
Training an energy-based model (EBM) with maximum likelihood is challenging due to the intractable normalisation constant. Traditional methods rely on expensive Markov chain Monte Carlo (MCMC) sampling to estimate the gradient of logartihm of the normalisation constant. We propose a novel objective called self-normalised log-likelihood (SNL) that introduces a single additional learnable parameter representing the normalisation constant compared to the regular log-likelihood. SNL is a lower bound of the log-likelihood, and its optimum corresponds to both the maximum likelihood estimate of the model parameters and the normalisation constant. We show that the SNL objective is concave in the model parameters for exponential family distributions. Unlike the regular log-likelihood, the SNL can be directly optimised using stochastic gradient techniques by sampling from a crude proposal distribution. We validate the effectiveness of our proposed method on various density estimation tasks as well as EBMs for regression. Our results show that the proposed method, while simpler to implement and tune, outperforms existing techniques.
Personalized Class Incremental Context-Aware Food Classification for Food Intake Monitoring Systems
Tehrani, Hassan Kazemi, Cai, Jun, Yekanlou, Abbas, Santosa, Sylvia
Accurate food intake monitoring is crucial for maintaining a healthy diet and preventing nutrition-related diseases. With the diverse range of foods consumed across various cultures, classic food classification models have limitations due to their reliance on fixed-sized food datasets. Studies show that people consume only a small range of foods across the existing ones, each consuming a unique set of foods. Existing class-incremental models have low accuracy for the new classes and lack personalization. This paper introduces a personalized, class-incremental food classification model designed to overcome these challenges and improve the performance of food intake monitoring systems. Our approach adapts itself to the new array of food classes, maintaining applicability and accuracy, both for new and existing classes by using personalization. Our model's primary focus is personalization, which improves classification accuracy by prioritizing a subset of foods based on an individual's eating habits, including meal frequency, times, and locations. A modified version of DSN is utilized to expand on the appearance of new food classes. Additionally, we propose a comprehensive framework that integrates this model into a food intake monitoring system. This system analyzes meal images provided by users, makes use of a smart scale to estimate food weight, utilizes a nutrient content database to calculate the amount of each macro-nutrient, and creates a dietary user profile through a mobile application. Finally, experimental evaluations on two new benchmark datasets FOOD101-Personal and VFN-Personal, personalized versions of well-known datasets for food classification, are conducted to demonstrate the effectiveness of our model in improving the classification accuracy of both new and existing classes, addressing the limitations of both conventional and class-incremental food classification models.
BTFL: A Bayesian-based Test-Time Generalization Method for Internal and External Data Distributions in Federated learning
Federated Learning (FL) enables multiple clients to collaboratively develop a global model while maintaining data privacy. However, online FL deployment faces challenges due to distribution shifts and evolving test samples. Personalized Federated Learning (PFL) tailors the global model to individual client distributions, but struggles with Out-Of-Distribution (OOD) samples during testing, leading to performance degradation. In real-world scenarios, balancing personalization and generalization during online testing is crucial and existing methods primarily focus on training-phase generalization. To address the test-time trade-off, we introduce a new scenario: Test-time Generalization for Internal and External Distributions in Federated Learning (TGFL), which evaluates adaptability under Internal Distribution (IND) and External Distribution (EXD). We propose BTFL, a Bayesian-based test-time generalization method for TGFL, which balances generalization and personalization at the sample level during testing. BTFL employs a two-head architecture to store local and global knowledge, interpolating predictions via a dual-Bayesian framework that considers both historical test data and current sample characteristics with theoretical guarantee and faster speed. Our experiments demonstrate that BTFL achieves improved performance across various datasets and models with less time cost. The source codes are made publicly available at https://github.com/ZhouYuCS/BTFL .
Bayesian Optimization for Robust Identification of Ornstein-Uhlenbeck Model
Xu, Jinwen, Lu, Qin, Bar-Shalom, Yaakov
This paper deals with the identification of the stochastic Ornstein-Uhlenbeck (OU) process error model, which is characterized by an inverse time constant, and the unknown variances of the process and observation noises. Although the availability of the explicit expression of the log-likelihood function allows one to obtain the maximum likelihood estimator (MLE), this entails evaluating the nontrivial gradient and also often struggles with local optima. To address these limitations, we put forth a sample-efficient global optimization approach based on the Bayesian optimization (BO) framework, which relies on a Gaussian process (GP) surrogate model for the objective function that effectively balances exploration and exploitation to select the query points. Specifically, each evaluation of the objective is implemented efficiently through the Kalman filter (KF) recursion. Comprehensive experiments on various parameter settings and sampling intervals corroborate that BO-based estimator consistently outperforms MLE implemented by the steady-state KF approximation and the expectation-maximization algorithm (whose derivation is a side contribution) in terms of root mean-square error (RMSE) and statistical consistency, confirming the effectiveness and robustness of the BO for identification of the stochastic OU process. Notably, the RMSE values produced by the BO-based estimator are smaller than the classical Cram\'{e}r-Rao lower bound, especially for the inverse time constant, estimating which has been a long-standing challenge. This seemingly counterintuitive result can be explained by the data-driven prior for the learning parameters indirectly injected by BO through the GP prior over the objective function.
Causality Enhanced Origin-Destination Flow Prediction in Data-Scarce Cities
Feng, Tao, Zhang, Yunke, Wang, Huandong, Li, Yong
Accurate origin-destination (OD) flow prediction is of great importance to developing cities, as it can contribute to optimize urban structures and layouts. However, with the common issues of missing regional features and lacking OD flow data, it is quite daunting to predict OD flow in developing cities. To address this challenge, we propose a novel Causality-Enhanced OD Flow Prediction (CE-OFP), a unified framework that aims to transfer urban knowledge between cities and achieve accuracy improvements in OD flow predictions across data-scarce cities. In specific, we propose a novel reinforcement learning model to discover universal causalities among urban features in data-rich cities and build corresponding causal graphs. Then, we further build Causality-Enhanced Variational Auto-Encoder (CE-VAE) to incorporate causal graphs for effective feature reconstruction in data-scarce cities. Finally, with the reconstructed features, we devise a knowledge distillation method with a graph attention network to migrate the OD prediction model from data-rich cities to data-scare cities. Extensive experiments on two pairs of real-world datasets validate that the proposed CE-OFP remarkably outperforms state-of-the-art baselines, which can reduce the RMSE of OD flow prediction for data-scarce cities by up to 11%.