Bayesian Learning
Collective Relational Inference for learning heterogeneous interactions
Han, Zhichao, Fink, Olga, Kammer, David S.
Interacting systems are ubiquitous in nature and engineering, ranging from particle dynamics in physics to functionally connected brain regions. These interacting systems can be modeled by graphs where edges correspond to the interactions between interactive entities. Revealing interaction laws is of fundamental importance but also particularly challenging due to underlying configurational complexities. The associated challenges become exacerbated for heterogeneous systems that are prevalent in reality, where multiple interaction types coexist simultaneously and relational inference is required. Here, we propose a novel probabilistic method for relational inference, which possesses two distinctive characteristics compared to existing methods. First, it infers the interaction types of different edges collectively by explicitly encoding the correlation among incoming interactions with a joint distribution, and second, it allows handling systems with variable topological structure over time. We evaluate the proposed methodology across several benchmark datasets and demonstrate that it outperforms existing methods in accurately inferring interaction types. We further show that when combined with known constraints, it allows us, for example, to discover physics-consistent interaction laws of particle systems. Overall the proposed model is data-efficient and generalizable to large systems when trained on smaller ones. The developed methodology constitutes a key element for understanding interacting systems and may find application in graph structure learning.
Score-Based Generative Models for PET Image Reconstruction
Singh, Imraj RD, Denker, Alexander, Barbano, Riccardo, Kereta, Željko, Jin, Bangti, Thielemans, Kris, Maass, Peter, Arridge, Simon
Score-based generative models have demonstrated highly promising results for medical image reconstruction tasks in magnetic resonance imaging or computed tomography. However, their application to Positron Emission Tomography (PET) is still largely unexplored. PET image reconstruction involves a variety of challenges, including Poisson noise with high variance and a wide dynamic range. To address these challenges, we propose several PET-specific adaptations of score-based generative models. The proposed framework is developed for both 2D and 3D PET. In addition, we provide an extension to guided reconstruction using magnetic resonance images. We validate the approach through extensive 2D and 3D $\textit{in-silico}$ experiments with a model trained on patient-realistic data without lesions, and evaluate on data without lesions as well as out-of-distribution data with lesions. This demonstrates the proposed method's robustness and significant potential for improved PET reconstruction.
Bayesian identification of nonseparable Hamiltonians with multiplicative noise using deep learning and reduced-order modeling
Galioto, Nicholas, Sharma, Harsh, Kramer, Boris, Gorodetsky, Alex Arkady
This paper presents a structure-preserving Bayesian approach for learning nonseparable Hamiltonian systems using stochastic dynamic models allowing for statistically-dependent, vector-valued additive and multiplicative measurement noise. The approach is comprised of three main facets. First, we derive a Gaussian filter for a statistically-dependent, vector-valued, additive and multiplicative noise model that is needed to evaluate the likelihood within the Bayesian posterior. Second, we develop a novel algorithm for cost-effective application of Bayesian system identification to high-dimensional systems. Third, we demonstrate how structure-preserving methods can be incorporated into the proposed framework, using nonseparable Hamiltonians as an illustrative system class. We compare the Bayesian method to a state-of-the-art machine learning method on a canonical nonseparable Hamiltonian model and a chaotic double pendulum model with small, noisy training datasets. The results show that using the Bayesian posterior as a training objective can yield upwards of 724 times improvement in Hamiltonian mean squared error using training data with up to 10% multiplicative noise compared to a standard training objective. Lastly, we demonstrate the utility of the novel algorithm for parameter estimation of a 64-dimensional model of the spatially-discretized nonlinear Schr\"odinger equation with data corrupted by up to 20% multiplicative noise.
Multi-agent deep reinforcement learning with centralized training and decentralized execution for transportation infrastructure management
Saifullah, M., Papakonstantinou, K. G., Andriotis, C. P., Stoffels, S. M.
Optimal management of cross-asset infrastructure is a complex problem that requires adept inspection and maintenance policies addressing stochastic degradation impacts. According to the 2021 ASCE infrastructure report card [1], the US infrastructure is in fair to poor condition, earning a cumulative grade of C-, with components nearing the end of their useful lives and at high risk of failure. Pavements and bridges are indicative examples of inadequate infrastructure. One in every five miles of pavements is in poor condition, and 7.5% of bridges are structurally deficient. Economic analyses indicate that the US Department of Transportation fell 50% short of the funds required to sustain the national transportation system [1], which is also reflected in the available resources at individual State transportation agencies. The Virginia Department of Transportation, for example, reported that 50% of the State's bridges have exceeded their useful lives, and the required funds to replace them are five times greater than the estimated available funds over the next fifty years [2]. Inspection and Maintenance (I&M) policies are therefore indispensable towards efficiently distributing available economic and environmental resources for transportation systems. Making optimal decisions in complex and uncertain environments presents a variety of difficulties, including heterogeneity of asset classes, a high number of components resulting in vast state and action spaces, unreliable observations, limited availability of resources, and several related risks. Optimal solutions that define inspection and maintenance policies should thus incorporate concepts such as (i) online and offline data learning, (ii) imperfect information support, (iii) stochastic action outcomes considerations, and (iv) optimization of long-term goals under multiple constraints (e.g., safety targets or resource constraints).
Towards Improved Variational Inference for Deep Bayesian Models
Deep learning has revolutionized the last decade, being at the forefront of extraordinary advances in a wide range of tasks including computer vision, natural language processing, and reinforcement learning, to name but a few. However, it is well-known that deep models trained via maximum likelihood estimation tend to be overconfident and give poorly-calibrated predictions. Bayesian deep learning attempts to address this by placing priors on the model parameters, which are then combined with a likelihood to perform posterior inference. Unfortunately, for deep models, the true posterior is intractable, forcing the user to resort to approximations. In this thesis, we explore the use of variational inference (VI) as an approximation, as it is unique in simultaneously approximating the posterior and providing a lower bound to the marginal likelihood. If tight enough, this lower bound can be used to optimize hyperparameters and to facilitate model selection. However, this capacity has rarely been used to its full extent for Bayesian neural networks, likely because the approximate posteriors typically used in practice can lack the flexibility to effectively bound the marginal likelihood. We therefore explore three aspects of Bayesian learning for deep models: 1) we ask whether it is necessary to perform inference over as many parameters as possible, or whether it is reasonable to treat many of them as optimizable hyperparameters; 2) we propose a variational posterior that provides a unified view of inference in Bayesian neural networks and deep Gaussian processes; 3) we demonstrate how VI can be improved in certain deep Gaussian process models by analytically removing symmetries from the posterior, and performing inference on Gram matrices instead of features. We hope that our contributions will provide a stepping stone to fully realize the promises of VI in the future.
Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection
Xiong, Feng, Markchom, Thanet, Zheng, Ziwei, Jung, Subin, Ojha, Varun, Liang, Huizhi
SemEval-2024 Task 8 introduces the challenge of identifying machine-generated texts from diverse Large Language Models (LLMs) in various languages and domains. The task comprises three subtasks: binary classification in monolingual and multilingual (Subtask A), multi-class classification (Subtask B), and mixed text detection (Subtask C). This paper focuses on Subtask A & B. Each subtask is supported by three datasets for training, development, and testing. To tackle this task, two methods: 1) using traditional machine learning (ML) with natural language preprocessing (NLP) for feature extraction, and 2) fine-tuning LLMs for text classification. The results show that transformer models, particularly LoRA-RoBERTa, exceed traditional ML methods in effectiveness, with majority voting being particularly effective in multilingual contexts for identifying machine-generated texts.
Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation
Li, Jiachen, Hua, Chuanbo, Ma, Hengbo, Park, Jinkyoo, Dax, Victoria, Kochenderfer, Mykel J.
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning. While modeling pairwise relations has been widely studied in multi-agent interacting systems, the ability to capture larger-scale group-wise activities is limited. In this paper, we propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures, and we demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation. In addition to the edges between pairs of nodes (i.e., agents), we propose to infer hyperedges that adaptively connect multiple nodes to enable group-wise reasoning in an unsupervised manner. Our approach infers dynamically evolving relation graphs and hypergraphs to capture the evolution of relations, which the trajectory predictor employs to generate future states. Meanwhile, we propose to regularize the sharpness and sparsity of the learned relations and the smoothness of the relation evolution, which proves to enhance training stability and model performance. The proposed approach is validated on synthetic crowd simulations and real-world benchmark datasets. Experiments demonstrate that the approach infers reasonable relations and achieves state-of-the-art prediction performance. In addition, we present a deep reinforcement learning (DRL) framework for social robot navigation, which incorporates relational reasoning and trajectory prediction systematically. In a group-based crowd simulation, our method outperforms the strongest baseline by a significant margin in terms of safety, efficiency, and social compliance in dense, interactive scenarios.
Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction
Talukder, Md. Alamin, Islam, Md. Manowarul, Uddin, Md Ashraf, Hasan, Khondokar Fida, Sharmin, Selina, Alyami, Salem A., Moni, Mohammad Ali
Cybersecurity has emerged as a critical global concern. Intrusion Detection Systems (IDS) play a critical role in protecting interconnected networks by detecting malicious actors and activities. Machine Learning (ML)-based behavior analysis within the IDS has considerable potential for detecting dynamic cyber threats, identifying abnormalities, and identifying malicious conduct within the network. However, as the number of data grows, dimension reduction becomes an increasingly difficult task when training ML models. Addressing this, our paper introduces a novel ML-based network intrusion detection model that uses Random Oversampling (RO) to address data imbalance and Stacking Feature Embedding based on clustering results, as well as Principal Component Analysis (PCA) for dimension reduction and is specifically designed for large and imbalanced datasets. This model's performance is carefully evaluated using three cutting-edge benchmark datasets: UNSW-NB15, CIC-IDS-2017, and CIC-IDS-2018. On the UNSW-NB15 dataset, our trials show that the RF and ET models achieve accuracy rates of 99.59% and 99.95%, respectively. Furthermore, using the CIC-IDS2017 dataset, DT, RF, and ET models reach 99.99% accuracy, while DT and RF models obtain 99.94% accuracy on CIC-IDS2018. These performance results continuously outperform the state-of-art, indicating significant progress in the field of network intrusion detection. This achievement demonstrates the efficacy of the suggested methodology, which can be used practically to accurately monitor and identify network traffic intrusions, thereby blocking possible threats.
Expert-Driven Monitoring of Operational ML Models
Leest, Joran, Raibulet, Claudia, Gerostathopoulos, Ilias, Lago, Patricia
We propose Expert Monitoring, an approach that leverages domain expertise to enhance the detection and mitigation of concept drift in machine learning (ML) models. Our approach supports practitioners by consolidating domain expertise related to concept drift-inducing events, making this expertise accessible to on-call personnel, and enabling automatic adaptability with expert oversight.
Spatial and Temporal Hierarchy for Autonomous Navigation using Active Inference in Minigrid Environment
de Tinguy, Daria, van de Maele, Toon, Verbelen, Tim, Dhoedt, Bart
Robust evidence suggests that humans explore their environment using a combination of topological landmarks and coarse-grained path integration. This approach relies on identifiable environmental features (topological landmarks) in tandem with estimations of distance and direction (coarse-grained path integration) to construct cognitive maps of the surroundings. This cognitive map is believed to exhibit a hierarchical structure, allowing efficient planning when solving complex navigation tasks. Inspired by human behaviour, this paper presents a scalable hierarchical active inference model for autonomous navigation, exploration, and goal-oriented behaviour. The model uses visual observation and motion perception to combine curiosity-driven exploration with goal-oriented behaviour. Motion is planned using different levels of reasoning, i.e., from context to place to motion. This allows for efficient navigation in new spaces and rapid progress toward a target. By incorporating these human navigational strategies and their hierarchical representation of the environment, this model proposes a new solution for autonomous navigation and exploration. The approach is validated through simulations in a mini-grid environment.