Bayesian Learning
Learning Time-Varying Coverage Functions
Coverage functions are an important class of discrete functions that capture laws of diminishing returns. In this paper, we propose a new problem of learning time-varying coverage functions which arise naturally from applications in social network analysis, machine learning, and algorithmic game theory. We develop a novel parametrization of the time-varying coverage function by illustrating the connections with counting processes. We present an efficient algorithm to learn the parameters by maximum likelihood estimation, and provide a rigorous theoretic analysis of its sample complexity. Empirical experiments from information diffusion in social network analysis demonstrate that with few assumptions about the underlying diffusion process, our method performs significantly better than existing approaches on both synthetic and real world data.
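The diminishing-returns property mentioned above can be illustrated with a static (not time-varying) coverage function. This is a minimal sketch: the `coverage` helper and the toy `reach` data are hypothetical names introduced here for illustration and are not taken from the paper.

```python
from typing import Dict, Hashable, Iterable, Set


def coverage(sets: Dict[Hashable, Set[int]], chosen: Iterable[Hashable]) -> int:
    """Number of ground elements covered by the union of the chosen sets."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


# Hypothetical toy data: each source "covers" the set of nodes it can reach.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}}

# Marginal gain of adding "b" to a small set versus a larger superset.
gain_small = coverage(reach, ["a", "b"]) - coverage(reach, ["a"])
gain_large = coverage(reach, ["a", "c", "b"]) - coverage(reach, ["a", "c"])
```

Adding "b" to {"a"} covers one new element, while adding it to {"a", "c"} covers none: the marginal gain shrinks as the base set grows.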
Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events, with applications to sequential recommendation, user behavior analysis and clinical treatment. In practice, next-event prediction models are trained on sequential data collected at one time and need to generalize to newly arrived sequences in the remote future, which requires models to handle temporal distribution shift from training to testing. In this paper, we first take a data-generating perspective to reveal a negative result: existing approaches based on maximum likelihood estimation fail under distribution shift due to the latent context confounder, i.e., the common cause of the historical events and the next event. We then devise a new learning objective based on backdoor adjustment and further harness variational inference to make it tractable for sequence learning problems. On top of that, we propose a framework with hierarchical branching structures for learning context-specific representations.
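For readers unfamiliar with the term, the standard backdoor adjustment can be written as below. The symbols are assumptions introduced here for illustration: $H$ denotes the historical events, $Y$ the next event, and $C$ the latent context; the paper's own notation may differ.

$$P(Y \mid do(H)) = \sum_{c} P(Y \mid H, c)\, P(c)$$

By comparison, the observational conditional is $P(Y \mid H) = \sum_{c} P(Y \mid H, c)\, P(c \mid H)$, which weights contexts by $P(c \mid H)$ and can therefore change when the context distribution shifts between training and testing.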
Tracking Functional Changes in Nonstationary Signals with Evolutionary Ensemble Bayesian Model for Robust Neural Decoding
Neural signals are typical nonstationary data where the functional mapping between neural activities and the intentions (such as the velocity of movements) can occasionally change. Existing studies mostly use a fixed neural decoder and thus suffer from unstable performance when the neural function changes. We propose a novel evolutionary ensemble framework (EvoEnsemble) to dynamically cope with changes in neural signals by evolving the decoder model accordingly. EvoEnsemble integrates evolutionary computation algorithms in a Bayesian framework where the fitness of models can be sequentially computed with their likelihoods according to the incoming data at each time slot, which enables online tracking of time-varying functions. Two strategies, evolve-at-changes and history-model-archive, are designed to further improve efficiency and stability.
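The idea of scoring models sequentially by their likelihoods can be sketched as a recursive Bayesian weight update over a fixed pool of candidate decoders. This is not the EvoEnsemble algorithm: there are no evolutionary operators here, the two scalar linear decoders and the Gaussian noise model are hypothetical, and the `forgetting` factor is an illustrative assumption added so that old evidence is discounted.

```python
import numpy as np


def update_log_weights(log_w, log_lik, forgetting=1.0):
    """One update: posterior is proportional to (discounted prior) x likelihood."""
    log_post = forgetting * log_w + log_lik
    # Subtract the log-sum-exp so the weights sum to one.
    m = log_post.max()
    return log_post - (m + np.log(np.exp(log_post - m).sum()))


def gaussian_log_lik(y, y_pred, sigma=1.0):
    """Log-density of y under N(y_pred, sigma^2), elementwise over y_pred."""
    return -0.5 * ((y - y_pred) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))


# Hypothetical pool of two scalar linear decoders: y = slope * x.
slopes = np.array([1.0, -1.0])
log_w = np.log([0.5, 0.5])

history = []
for t in range(50):
    x = 1.0                                   # noise-free toy input
    true_slope = 1.0 if t < 25 else -1.0      # functional change at t = 25
    y = true_slope * x
    log_w = update_log_weights(log_w, gaussian_log_lik(y, slopes * x), forgetting=0.9)
    history.append(np.exp(log_w))

weights_before_change = history[24]
weights_after_change = history[-1]
```

In this toy run the weight mass sits on the first decoder before the change and moves to the second afterwards; with `forgetting=1.0` the evidence from both regimes would cancel and the weights would return to roughly equal.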
Advances in Learning Bayesian Networks of Bounded Treewidth
This work presents novel algorithms for learning Bayesian networks of bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in sampling k-trees (maximal graphs of treewidth k), and subsequently selecting, exactly or approximately, the best structure whose moral graph is a subgraph of that k-tree. The approaches are empirically compared to each other and to state-of-the-art methods on a collection of public data sets with up to 100 variables.
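A k-tree can be built by starting from a (k+1)-clique and repeatedly attaching a new vertex to every vertex of an existing k-clique. The sketch below grows a k-tree this way by picking the k-clique at random; it does not sample uniformly over k-trees and is not claimed to be the sampling scheme used in the paper. The function name `random_k_tree` is hypothetical.

```python
import itertools
import random


def random_k_tree(n, k, seed=None):
    """Grow a k-tree on vertices 0..n-1; returns a set of edges (u, v) with u < v."""
    if k < 1 or n < k + 1:
        raise ValueError("need k >= 1 and n >= k + 1")
    rng = random.Random(seed)
    root = list(range(k + 1))
    # Start from a (k+1)-clique on the first k+1 vertices.
    edges = set(itertools.combinations(root, 2))
    # Every k-subset of the initial clique is itself a k-clique.
    k_cliques = list(itertools.combinations(root, k))
    for v in range(k + 1, n):
        base = rng.choice(k_cliques)
        # Connect the new vertex to all k vertices of the chosen clique.
        edges.update((u, v) for u in base)
        # New k-cliques: v together with each (k-1)-subset of the base clique.
        k_cliques.extend(sub + (v,) for sub in itertools.combinations(base, k - 1))
    return edges


example_edges = random_k_tree(6, 2, seed=0)
```

A k-tree on n vertices always has k(k+1)/2 + (n-k-1)k edges, so the example above (n = 6, k = 2) has 9 edges regardless of the random choices.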
Learning Fast-Inference Bayesian Networks
We propose new methods for learning Bayesian networks (BNs) that reliably support fast inference. We utilize maximum state space size as a more fine-grained measure for the BN's reasoning complexity than the standard treewidth measure, thereby accommodating the possibility that variables range over domains of different sizes. Our methods combine heuristic BN structure learning algorithms with the recently introduced MaxSAT-powered local improvement method (Peruvemba Ramaswamy and Szeider, AAAI'21). Our experiments show that our new learning methods produce BNs that support significantly faster exact probabilistic inference than BNs learned with treewidth bounds.
Moment Matching Denoising Gibbs Sampling
Training and sampling from energy-based models (EBMs) continue to pose significant challenges. The widely used Denoising Score Matching (DSM) method for scalable EBM training suffers from inconsistency issues, causing the energy model to learn a noisy data distribution. In this work, we propose an efficient sampling framework: (pseudo)-Gibbs sampling with moment matching, which enables effective sampling from the underlying clean model when given a noisy model that has been well-trained via DSM. We explore the benefits of our approach compared to related methods and demonstrate how to scale the method to high-dimensional datasets.
Deep Learning for Early Alzheimer Disease Detection with MRI Scans
Rafsan, Mohammad, Oraby, Tamer, Roy, Upal, Kumar, Sanjeev, Rodrigo, Hansapani
Alzheimer's disease (AD) is a neurodegenerative condition characterized by dementia and impairment in neurological function, affecting memory, behavior, and the cognitive processes of the brain. This study focuses primarily on individuals above age 40. Diagnosing AD requires a detailed assessment of patients' MRI scans and neuropsychological tests. This project compares existing deep learning models with the aim of improving the accuracy and efficiency of AD diagnosis, specifically the Convolutional Neural Network, the Bayesian Convolutional Neural Network, and the U-net model, using the Open Access Series of Imaging Studies brain MRI dataset. In addition, to ensure robustness and reliability in the model evaluations, we address the challenge of class imbalance in the data. We then perform a rigorous evaluation to determine the strengths and weaknesses of each model by considering sensitivity, specificity, and computational efficiency. This comparative analysis not only sheds light on the future role of AI in revolutionizing AD diagnostics but also paves the way for future innovation in medical imaging and the management of neurodegenerative diseases.
Mean and Variance Estimation Complexity in Arbitrary Distributions via Wasserstein Minimization
Iverson, Valentio, Vavasis, Stephen
Parameter estimation is a fundamental challenge in machine learning, crucial for tasks such as neural network weight fitting and Bayesian inference. This paper focuses on the complexity of estimating, from $n$ samples, the translation $\boldsymbol{\mu} \in \mathbb{R}^l$ and shrinkage $\sigma \in \mathbb{R}_{++}$ parameters of a distribution of the form $\frac{1}{\sigma^l} f_0 \left( \frac{\boldsymbol{x} - \boldsymbol{\mu}}{\sigma} \right)$, where $f_0$ is a known density on $\mathbb{R}^l$. We highlight that while the problem is NP-hard for Maximum Likelihood Estimation (MLE), it is possible to obtain $\varepsilon$-approximations for arbitrary $\varepsilon > 0$ within $\text{poly} \left( \frac{1}{\varepsilon} \right)$ time using the Wasserstein distance.
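To give a feel for Wasserstein-based estimation of $\mu$ and $\sigma$, here is a one-dimensional sketch ($l = 1$) that is not the paper's algorithm. In one dimension the squared 2-Wasserstein distance compares quantile functions, so after discretizing at midpoint quantile levels the fit reduces to least squares of the sorted samples on the reference quantiles. The function name `fit_location_scale_w2` and the choice of a standard normal $f_0$ are assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def fit_location_scale_w2(samples, ref_quantile):
    """Approximate W2 fit of x ~ mu + sigma * z in 1D, where z has quantile function ref_quantile."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Midpoint quantile levels (i + 0.5) / n lie strictly inside (0, 1).
    u = (np.arange(n) + 0.5) / n
    q = np.array([ref_quantile(ui) for ui in u])
    # Least squares of sorted samples on reference quantiles: x_(i) ~ mu + sigma * q_i.
    qc = q - q.mean()
    sigma = float(np.dot(qc, x - x.mean()) / np.dot(qc, qc))
    mu = float(x.mean() - sigma * q.mean())
    return mu, sigma


# Synthetic data with mu = 3 and sigma = 2, using a standard normal reference density.
rng = np.random.default_rng(0)
data = 3.0 + 2.0 * rng.standard_normal(2000)
mu_hat, sigma_hat = fit_location_scale_w2(data, NormalDist().inv_cdf)
```

Because both the sorted samples and the reference quantiles are non-decreasing, the fitted `sigma` is non-negative by construction; the closed form does not carry over to $l > 1$, where quantile functions are no longer available.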
Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: an empirical benchmark
Wang, Shenhao, Mo, Baichuan, Zheng, Yunhan, Hess, Stephane, Zhao, Jinhua
Numerous studies have compared machine learning (ML) and discrete choice models (DCMs) in predicting travel demand. However, these studies often lack generalizability as they compare models deterministically without considering contextual variations. To address this limitation, our study develops an empirical benchmark by designing a tournament model, thus efficiently summarizing a large number of experiments, quantifying the randomness in model comparisons, and using formal statistical tests to differentiate between the model and contextual effects. This benchmark study compares two large-scale data sources: a database compiled from literature review summarizing 136 experiments from 35 studies, and our own experiment data, encompassing a total of 6,970 experiments from 105 models and 12 model families. This benchmark study yields two key findings. Firstly, many ML models, particularly the ensemble methods and deep learning, statistically outperform the DCM family (i.e., multinomial, nested, and mixed logit models). However, this study also highlights the crucial role of the contextual factors (i.e., data sources, inputs and choice categories), which can explain models' predictive performance more effectively than the differences in model types alone. Model performance varies significantly with data sources, improving with larger sample sizes and lower dimensional alternative sets. After controlling all the model and contextual factors, significant randomness still remains, implying inherent uncertainty in such model comparisons. Overall, we suggest that future researchers shift more focus from context-specific model comparisons towards examining model transferability across contexts and characterizing the inherent uncertainty in ML, thus creating more robust and generalizable next-generation travel demand models.
A recursive Bayesian neural network for constitutive modeling of sands under monotonic loading
Noor, Toiba, Lone, Soban Nasir, Ramana, G. V., Nayek, Rajdip
In geotechnical engineering, constitutive models play a crucial role in describing soil behavior under varying loading conditions. Data-driven deep learning (DL) models offer a promising alternative for developing predictive constitutive models. When prediction is the primary focus, quantifying the predictive uncertainty of a trained DL model and communicating this uncertainty to end users is crucial for informed decision-making. This study proposes a recursive Bayesian neural network (rBNN) framework, which builds upon recursive feedforward neural networks (rFFNNs) by introducing generalized Bayesian inference for uncertainty quantification. A significant contribution of this work is the incorporation of a sliding window approach in rFFNNs, allowing the models to effectively capture temporal dependencies across load steps. The rBNN extends this framework by treating model parameters as random variables, with their posterior distributions inferred using generalized variational inference. The proposed framework is validated on two datasets: (i) a numerically simulated consolidated drained (CD) triaxial dataset employing a hardening soil model and (ii) an experimental dataset comprising 28 CD triaxial tests on Baskarp sand. Comparative analyses with LSTM, Bi-LSTM, and GRU models demonstrate that the deterministic rFFNN achieves superior predictive accuracy, attributed to its transparent structure and sliding window design. While the rBNN marginally trails in accuracy for the experimental case, it provides robust confidence intervals, addressing data sparsity and measurement noise in experimental conditions. The study underscores the trade-offs between deterministic and probabilistic approaches and the potential of rBNNs for uncertainty-aware constitutive modeling.