 Performance Analysis


Profile Graphical Models

Avalos-Pacheco, Alejandra, Lupparelli, Monia, Stingo, Francesco C.

arXiv.org Machine Learning

We introduce a novel class of graphical models, termed profile graphical models, that represent, within a single graph, how an external factor influences the dependence structure of a multivariate set of variables. This class is quite general and includes multiple graphs and chain graphs as special cases. Profile graphical models capture the conditional distributions of a multivariate random vector given different levels of a risk factor, and learn how the conditional independence structure among variables may vary across these risk profiles; we formally define this family of models and establish their corresponding Markov properties. We derive key structural and probabilistic properties that underpin a more powerful inferential framework than existing approaches, underscoring that our contribution extends beyond a novel graphical representation. Furthermore, we show that the resulting profile undirected graphical models are independence-compatible with two-block LWF chain graph models. We then develop a Bayesian approach for Gaussian undirected profile graphical models based on continuous spike-and-slab priors to learn shared sparsity structures across different levels of the risk factor. We also design a fast EM algorithm for efficient inference. Inferential properties are explored through simulation studies, including comparisons with competing methods. The practical utility of this class of models is demonstrated through the analysis of protein network data from various subtypes of acute myeloid leukemia. Our method yields a more parsimonious network and reveals greater patient heterogeneity than competing methods, highlighting its enhanced ability to capture subject-specific differences.


KANEL: Kolmogorov-Arnold Network Ensemble Learning Enables Early Hit Enrichment in High-Throughput Virtual Screening

Koptev, Pavel, Krainov, Nikita, Malkov, Konstantin, Tropsha, Alexander

arXiv.org Machine Learning

Machine learning models of chemical bioactivity are increasingly used for prioritizing a small number of compounds in virtual screening libraries for experimental follow-up. In these applications, assessing model accuracy by early hit enrichment, such as Positive Predictive Value (PPV) calculated for the top N hits (PPV@N), is more appropriate and actionable than traditional global metrics such as AUC. We present KANEL, an ensemble workflow that combines interpretable Kolmogorov-Arnold Networks (KANs) with XGBoost, random forest, and multilayer perceptron models trained on complementary molecular representations (LillyMol descriptors, RDKit-derived descriptors, and Morgan fingerprints). Across five public PubChem BioAssay datasets (AIDs 485314, 485341, 504466, 624202, and 651820), Optuna-optimized weighted ensembles consistently outperformed the best single model in PPV@128 by 0.06-0.12
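To make the PPV@N metric mentioned above concrete, here is a minimal sketch of how it could be computed from model scores and binary activity labels. The function name `ppv_at_n` and the toy data are illustrative assumptions, not taken from the KANEL paper.

```python
def ppv_at_n(scores, labels, n):
    """Fraction of true actives among the n highest-scored compounds.

    scores: predicted activity scores (higher means more likely active).
    labels: 1 for an experimentally confirmed active, 0 otherwise.
    Hypothetical helper for illustration; not the authors' code.
    """
    if n <= 0 or not scores:
        raise ValueError("need n > 0 and at least one scored compound")
    # Rank compound indices by predicted score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:n]
    # Precision restricted to the top of the ranked list.
    return sum(labels[i] for i in top) / len(top)


# Toy example (made-up numbers): six compounds, three true actives.
toy_scores = [0.91, 0.15, 0.78, 0.66, 0.05, 0.83]
toy_labels = [1, 0, 0, 1, 0, 1]
toy_ppv = ppv_at_n(toy_scores, toy_labels, 3)
```

Unlike AUC, this quantity depends only on the top of the ranking, which is the part of the library that would actually be sent for experimental follow-up.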


Identification of physiological shock in intensive care units via Bayesian regime switching models

Kendall, Emmett B., Williams, Jonathan P., Storlie, Curtis B., Radosevich, Misty A., Wittwer, Erica D., Warner, Matthew A.

arXiv.org Machine Learning

Detection of occult hemorrhage (i.e., internal bleeding) in patients in intensive care units (ICUs) can pose significant challenges for critical care workers. Because blood loss may not always be clinically apparent, clinicians rely on monitoring vital signs for specific trends indicative of a hemorrhage event. The inherent difficulties of diagnosing such an event can lead to late intervention by clinicians, which can have catastrophic consequences. Therefore, a methodology for early detection of hemorrhage has wide utility. We develop a Bayesian regime switching model (RSM) that analyzes trends in patients' vitals and labs to provide a probabilistic assessment of the underlying physiological state that a patient is in at any given time. This article is motivated by a comprehensive dataset we curated from Mayo Clinic of 33,924 real ICU patient encounters. Longitudinal response measurements are modeled as a vector autoregressive process conditional on all latent states up to the current time point, and the latent states follow a Markov process. We present a novel Bayesian sampling routine to learn the posterior probability distribution of the latent physiological states, and we develop an approach to account for pre-ICU-admission physiological changes. A simulation study and a real case study illustrate the effectiveness of our approach.


Improving Infinitely Deep Bayesian Neural Networks with Nesterov's Accelerated Gradient Method

Yu, Chenxu, Fang, Wenqi

arXiv.org Machine Learning

As a representative continuous-depth neural network approach, stochastic differential equation (SDE)-based Bayesian neural networks (BNNs) have attracted considerable attention due to their solid theoretical foundations and strong potential for real-world applications. However, their reliance on numerical SDE solvers inevitably incurs a large number of function evaluations (NFEs), resulting in high computational cost and occasional convergence instability. To address these challenges, we propose a Nesterov-accelerated gradient (NAG) enhanced SDE-BNN model. By integrating NAG into the SDE-BNN framework along with an NFE-dependent residual skip connection, our method accelerates convergence and substantially reduces NFEs during both training and testing. Extensive empirical results show that our model consistently outperforms conventional SDE-BNNs across various tasks, including image classification and sequence modeling, achieving lower NFEs and improved predictive accuracy.
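For readers unfamiliar with the optimizer named above, the following is a minimal sketch of the textbook Nesterov accelerated gradient update on a scalar objective. It is not the paper's SDE-BNN integration or its NFE-dependent skip connection; the function name, step size, and momentum value are assumptions chosen for illustration.

```python
def nesterov_minimize(grad, x0, lr=0.1, momentum=0.9, steps=200):
    """Textbook Nesterov accelerated gradient for a scalar parameter.

    grad: function returning the gradient of the objective at a point.
    All hyperparameter defaults are illustrative assumptions.
    """
    x, velocity = x0, 0.0
    for _ in range(steps):
        # Evaluate the gradient at the look-ahead point, not at x itself;
        # this is what distinguishes NAG from plain momentum.
        lookahead = x + momentum * velocity
        velocity = momentum * velocity - lr * grad(lookahead)
        x = x + velocity
    return x


# Toy objective f(x) = (x - 3)^2, with gradient 2 * (x - 3).
x_min = nesterov_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

On this quadratic the iterates settle at the minimizer x = 3; the look-ahead gradient damps the oscillation that plain momentum would show with the same settings.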


Elements of Conformal Prediction for Statisticians

Sesia, Matteo, Favaro, Stefano

arXiv.org Machine Learning

Predictive inference is a fundamental task in statistics, traditionally addressed using parametric assumptions about the data distribution and detailed analyses of how models learn from data. In recent years, conformal prediction has emerged as a rapidly growing alternative framework that is particularly well suited to modern applications involving high-dimensional data and complex machine learning models. Its appeal stems from being both distribution-free -- relying mainly on symmetry assumptions such as exchangeability -- and model-agnostic, treating the learning algorithm as a black box. Even under such limited assumptions, conformal prediction provides exact finite-sample guarantees, though these are typically of a marginal nature that requires careful interpretation. This paper explains the core ideas of conformal prediction and reviews selected methods. Rather than offering an exhaustive survey, it aims to provide a clear conceptual entry point and a pedagogical overview of the field.
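As one concrete instance of the ideas summarized above, here is a minimal sketch of split conformal prediction with absolute-residual scores. It treats the fitted model as a black box (`predict`) and assumes the calibration points and the new point are exchangeable; the function names and the toy numbers are illustrative and do not come from the paper.

```python
import math


def conformal_quantile(residuals, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration residual.

    Returns infinity when the calibration set is too small to reach the
    requested level, which yields a trivially valid unbounded interval.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Prediction interval for the response at x_new.

    predict must have been fitted on data disjoint from the calibration set.
    """
    # Nonconformity score: absolute residual on each held-out calibration point.
    residuals = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(residuals, alpha)
    center = predict(x_new)
    return center - q, center + q


# Toy usage with a hypothetical pre-fitted model y = 2x.
interval = split_conformal_interval(
    lambda x: 2 * x, calib_x=[0, 1, 2], calib_y=[1, 1, 6], x_new=5, alpha=0.5
)
```

The resulting coverage guarantee of at least 1 - alpha is marginal: it holds on average over the calibration data and the new point, not conditionally on `x_new`.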


From Cross-Validation to SURE: Asymptotic Risk of Tuned Regularized Estimators

Adusumilli, Karun, Kasy, Maximilian, Wilson, Ashia

arXiv.org Machine Learning

We derive the asymptotic risk function of regularized empirical risk minimization (ERM) estimators tuned by $n$-fold cross-validation (CV). The out-of-sample prediction loss of such estimators converges in distribution to the squared-error loss (risk function) of shrinkage estimators in the normal means model, tuned by Stein's unbiased risk estimate (SURE). This risk function provides a more fine-grained picture of predictive performance than uniform bounds on worst-case regret, which are common in learning theory: it quantifies how risk varies with the true parameter. As key intermediate steps, we show that (i) $n$-fold CV converges uniformly to SURE, and (ii) while SURE typically has multiple local minima, its global minimum is generically well separated. Well-separation ensures that uniform convergence of CV to SURE translates into convergence of the tuning parameter chosen by CV to that chosen by SURE.
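To illustrate what tuning by SURE means in the normal means model, here is a minimal sketch for the simplest case: linear shrinkage of observations X_i ~ N(theta_i, 1) toward zero, with estimator (1 - b) X_i. The restriction to linear shrinkage, unit variance, and a grid search are assumptions made for illustration; the paper covers a broader class of regularized estimators.

```python
def sure_linear_shrinkage(x, b):
    """Stein's unbiased risk estimate for the estimator (1 - b) * x.

    Assumes independent coordinates x_i ~ N(theta_i, 1). With g(x) = -b * x,
    SURE = n + ||g(x)||^2 + 2 * div g(x) = n + b^2 * ||x||^2 - 2 * b * n.
    """
    n = len(x)
    return n + b * b * sum(v * v for v in x) - 2.0 * b * n


def tune_by_sure(x, grid_size=1001):
    """Pick the shrinkage factor b in [0, 1] that minimizes SURE on a grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(grid, key=lambda b: sure_linear_shrinkage(x, b))


# Toy data (made up): n = 4 and ||x||^2 = 16, so the minimizer is n / ||x||^2.
toy_x = [2.0, -2.0, 2.0, -2.0]
best_b = tune_by_sure(toy_x)
```

In this linear case SURE is a convex quadratic in b with a single minimum at n / ||x||^2. The multiple local minima discussed in the abstract arise for more general estimators, which is why the well-separation of the global minimum matters there.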


From Causal Discovery to Dynamic Causal Inference in Neural Time Series

Kuskova, Valentina, Zaytsev, Dmitry, Coppedge, Michael

arXiv.org Machine Learning

Time-varying causal models provide a powerful framework for studying dynamic scientific systems, yet most existing approaches assume that the underlying causal network is known a priori - an assumption rarely satisfied in real-world domains where causal structure is uncertain, evolving, or only indirectly observable. This limits the applicability of dynamic causal inference in many scientific settings. We propose Dynamic Causal Network Autoregression (DCNAR), a two-stage neural causal modeling framework that integrates data-driven causal discovery with time-varying causal inference. In the first stage, a neural autoregressive causal discovery model learns a sparse directed causal network from multivariate time series. In the second stage, this learned structure is used as a structural prior for a time-varying neural network autoregression, enabling dynamic estimation of causal influence without requiring pre-specified network structure. We evaluate the scientific validity of DCNAR using behavioral diagnostics that assess causal necessity, temporal stability, and sensitivity to structural change, rather than predictive accuracy alone. Experiments on multi-country panel time-series data demonstrate that learned causal networks yield more stable and behaviorally meaningful dynamic causal inferences than coefficient-based or structure-free alternatives, even when forecasting performance is comparable. These results position DCNAR as a general framework for using AI as a scientific instrument for dynamic causal reasoning under structural uncertainty.


A Generalised Exponentiated Gradient Approach to Enhance Fairness in Binary and Multi-class Classification Tasks

Boubekraoui, Maryam, d'Aloisio, Giordano, Di Marco, Antinisca

arXiv.org Machine Learning

The widespread use of AI and ML models in sensitive areas raises significant concerns about fairness. While the research community has introduced various methods for bias mitigation in binary classification tasks, the issue remains under-explored in multi-class classification settings. To address this limitation, we first formulate the problem of fair learning in multi-class classification as a multi-objective problem between effectiveness (i.e., prediction correctness) and multiple linear fairness constraints. Next, we propose a Generalised Exponentiated Gradient (GEG) algorithm to solve this task. GEG is an in-processing algorithm that enhances fairness in binary and multi-class classification settings under multiple fairness definitions. We conduct an extensive empirical evaluation of GEG against six baselines across seven multi-class and three binary datasets, using four widely adopted effectiveness metrics and three fairness definitions. GEG outperforms existing baselines, with fairness improvements of up to 92% and a decrease in accuracy of up to 14%.



A Model Ensemble-Based Post-Processing Framework for Fairness-Aware Prediction

Zhao, Zhouting, Ng, Tin Lok James

arXiv.org Machine Learning

Striking an optimal balance between predictive performance and fairness continues to be a fundamental challenge in machine learning. In this work, we propose a post-processing framework that facilitates fairness-aware prediction by leveraging model ensembling. Designed to operate independently of any specific model internals, our approach is widely applicable across various learning tasks, model architectures, and fairness definitions. Through extensive experiments spanning classification, regression, and survival analysis, we demonstrate that the framework effectively enhances fairness while maintaining, or only minimally affecting, predictive accuracy.