Bayesian Learning
Consistency of Selection Strategies for Fraud Detection
Revelas, Christos, Boldea, Otilia, Werker, Bas J. M.
This paper studies how insurers can choose which claims to investigate for fraud. Given a prediction model, typically only claims with the highest predicted probability of being fraudulent are investigated. We argue that this can lead to inconsistent learning and propose a randomized alternative. More generally, we draw a parallel with the multi-armed bandit literature and argue that, in the presence of selection, the obtained observations are not iid. Hence, dependence on past observations should be accounted for when updating parameter estimates. We formalize selection in a binary regression framework and show that model updating and maximum-likelihood estimation can be implemented as if claims were investigated at random. Then, we define consistency of selection strategies and conjecture sufficient conditions for consistency. Our simulations suggest that the often-used selection strategy can be inconsistent while the proposed randomized alternative is consistent. Finally, we compare our randomized selection strategy with Thompson sampling, a standard multi-armed bandit heuristic. Our simulations suggest that the latter can be inefficient in learning low fraud probabilities.
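For readers unfamiliar with the Thompson sampling heuristic mentioned in this abstract, the following is a minimal sketch of its textbook Beta-Bernoulli form. It is not the paper's setup, which uses a binary regression model over claims. The function names `thompson_select` and `run_thompson`, the uniform Beta(1, 1) prior, and the independent-arm assumption are all illustrative choices made here.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one sample per arm from its Beta posterior and pick the largest.

    Assumes a uniform Beta(1, 1) prior on each arm's success probability
    (an illustrative choice, not taken from the paper).
    """
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_thompson(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on independent Bernoulli arms.

    Returns per-arm success and failure counts after n_rounds pulls.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        arm = thompson_select(successes, failures, rng)
        # Observe a Bernoulli outcome only for the selected arm.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_thompson([0.05, 0.6], n_rounds=500)
    pulls = [a + b for a, b in zip(s, f)]
    print("pulls per arm:", pulls)
```

In this simplified setting, an arm with a low success probability is selected rarely once the posterior concentrates, so its estimate rests on few observations.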
Statistical Insight into Meta-Learning via Predictor Subspace Characterization and Quantification of Task Diversity
Datta, Saptati, Hengartner, Nicolas W., Pimonova, Yulia, Klein, Natalie E., Lubbers, Nicholas
In recent years, there has been significant interest in designing machine learning algorithms that enable robust and sample-efficient knowledge transfer across tasks to facilitate rapid and accurate estimation and prediction. Traditional machine learning methods have largely followed a single-task or "isolated learning" framework, where each task is learned independently, ignoring knowledge from prior tasks (Upadhyay et al., 2024). Human learning, by contrast, relies on prior experiences to accelerate new learning. Inspired by this, recent prominent "knowledge-transfer" approaches include meta-learning (Finn et al., 2017; Bouchattaoui, 2024), transfer learning (Zhu et al., 2023; Zhuang et al., 2020), multi-task learning (Crawshaw, 2020; Zhang and Yang, 2022), and lifelong learning (Liu, 2017), all of which aim to leverage shared structure across tasks to improve generalization and thereby replicate this human-like knowledge transfer. Meta-learning focuses on learning a learning algorithm that can quickly adapt to new tasks using limited data. Transfer learning reuses knowledge from related source tasks to improve performance on a target task with few labeled examples.
Early Prediction of In-Hospital ICU Mortality Using Innovative First-Day Data: A Review
Huang, Baozhu, Chen, Cheng, Hou, Xuanhe, Huang, Junmin, Wei, Zihan, Luo, Hongying, Chen, Lu, Xu, Yongzhi, Luo, Hejiao, Qin, Changqi, Bi, Ziqian, Song, Junhao, Wang, Tianyang, Liang, ChiaXin, Yu, Zizhong, Wang, Han, Sun, Xiaotian, Hao, Junfeng, Tian, Chunjie
The intensive care unit (ICU) manages critically ill patients, many of whom face a high risk of mortality. Early and accurate prediction of in-hospital mortality within the first 24 hours of ICU admission is crucial for timely clinical interventions, resource optimization, and improved patient outcomes. Traditional scoring systems, while useful, often have limitations in predictive accuracy and adaptability. Objective: This review aims to systematically evaluate and benchmark innovative methodologies that leverage data available within the first day of ICU admission for predicting in-hospital mortality. We focus on advancements in machine learning, novel biomarker applications, and the integration of diverse data types.
Bayesian Calibration and Model Assessment of Cell Migration Dynamics with Surrogate Model Integration
Schenk, Christina, Jiménez, Jacobo Ayensa, Romero, Ignacio
Computational models provide crucial insights into complex biological processes such as cancer evolution, but their mechanistic nature often makes them nonlinear and parameter-rich, complicating calibration. We systematically evaluate parameter probability distributions in cell migration models using Bayesian calibration across four complementary strategies: parametric and surrogate models, each with and without explicit model discrepancy. This approach enables joint analysis of parameter uncertainty, predictive performance, and interpretability. Applied to a real data experiment of glioblastoma progression in microfluidic devices, surrogate models achieve higher computational efficiency and predictive accuracy, whereas parametric models yield more reliable parameter estimates due to their mechanistic grounding. Incorporating model discrepancy exposes structural limitations, clarifying where model refinement is necessary. Together, these comparisons offer practical guidance for calibrating and improving computational models of complex biological systems.
Towards Privacy-Aware Bayesian Networks: A Credal Approach
Rocchi, Niccolò, Stella, Fabio, de Campos, Cassio
Bayesian networks (BN) are probabilistic graphical models that enable efficient knowledge representation and inference. These have proven effective across diverse domains, including healthcare, bioinformatics and economics. The structure and parameters of a BN can be obtained by domain experts or directly learned from available data. However, as privacy concerns escalate, it becomes increasingly critical for publicly released models to safeguard sensitive information in training data. Typically, released models do not prioritize privacy by design. In particular, tracing attacks from adversaries can combine the released BN with auxiliary data to determine whether specific individuals belong to the data from which the BN was learned. State-of-the-art protection techniques involve introducing noise into the learned parameters. While this offers robust protection against tracing attacks, it significantly impacts the model's utility, in terms of both the significance and accuracy of the resulting inferences. Hence, high privacy may be attained at the cost of releasing a possibly ineffective model. This paper introduces credal networks (CN) as a novel solution for balancing the model's privacy and utility. After adapting the notion of tracing attacks, we demonstrate that a CN enables the masking of the learned BN, thereby reducing the probability of successful attacks. As CNs are obfuscated but not noisy versions of BNs, they can achieve meaningful inferences while safeguarding privacy. Moreover, we identify key learning information that must be concealed to prevent attackers from recovering the underlying BN. Finally, we conduct a set of numerical experiments to analyze how privacy gains can be modulated by tuning the CN hyperparameters. Our results confirm that CNs provide a principled, practical, and effective approach towards the development of privacy-aware probabilistic graphical models.
Probabilistic Machine Learning for Uncertainty-Aware Diagnosis of Industrial Systems
Mohammadi, Arman, Krysander, Mattias, Jung, Daniel, Frisk, Erik
Deep neural networks have been increasingly applied in fault diagnostics, where they use historical data to capture system behavior, bypassing the need for high-fidelity physical models. However, despite their competence in prediction tasks, these models often struggle to evaluate their own confidence. This matter is particularly important in consistency-based diagnosis, where the decision logic is highly sensitive to false alarms. To address this challenge, this work presents a diagnostic framework that uses ensemble probabilistic machine learning to improve the diagnostic characteristics of data-driven consistency-based diagnosis by quantifying and automating the prediction uncertainty. The proposed method is evaluated across several case studies using both ablation and comparative analyses, showing consistent improvements across a range of diagnostic metrics.
Enhanced Interpretable Knowledge Tracing for Students Performance Prediction with Human understandable Feature Space
Knowledge Tracing (KT) plays a central role in assessing students' skill mastery and predicting their future performance. While deep learning-based KT models achieve superior predictive accuracy compared to traditional methods, their complexity and opacity hinder their ability to provide psychologically meaningful explanations. This disconnect between model parameters and cognitive theory poses challenges for understanding and enhancing the learning process, limiting their trustworthiness in educational applications. To address these challenges, we enhance interpretable KT models by exploring human-understandable features derived from students' interaction data. By incorporating additional features, particularly those reflecting students' learning abilities, our enhanced approach improves predictive accuracy while maintaining alignment with cognitive theory. Our contributions aim to balance predictive power with interpretability, advancing the utility of adaptive learning systems.
ERFC: Happy Customers with Emotion Recognition and Forecasting in Conversation in Call Centers
Debsharma, Aditi, Jagyasi, Bhushan, Sen, Surajit, Pandey, Priyanka, Dovari, Devicharith, C, Yuvaraj V., Parida, Rosalin, Contractor, Gopali
Emotion Recognition in Conversation is widely applicable in call center analytics, opinion mining, finance, retail, healthcare, and other industries. In a call center scenario, the role of the call center agent is not confined to receiving calls but also includes providing a good customer experience by pacifying the frustration or anger of customers. This can be achieved when the agent maintains a neutral or positive emotion. As in any conversation, the emotion of one speaker usually depends on the emotion of the other speaker. Hence the positive emotion of an agent, accompanied by the right resolution, will help enhance the customer experience. This can turn an unhappy customer into a happy one. Imparting the right resolution at the right time becomes easier if the agent has insight into the emotion of future utterances. To predict the emotions of future utterances, we propose a novel architecture, Emotion Recognition and Forecasting in Conversation (ERFC). Our proposed ERFC architecture considers multiple modalities, different attributes of emotion, context, and the interdependencies of the speakers' utterances in the conversation. Our intensive experiments on the IEMOCAP dataset show the feasibility of the proposed ERFC. This approach can provide substantial business value for applications such as call centers, where customer happiness is of utmost importance.
Robust and continuous machine learning of usage habits to adapt digital interfaces to user needs
The paper presents a machine learning approach to design digital interfaces that can dynamically adapt to different users and usage strategies. The algorithm uses Bayesian statistics to model users' browsing behavior, focusing on their habits rather than group preferences. It is distinguished by its online incremental learning, allowing reliable predictions even with little data and in the case of a changing environment. This inference method generates a task model, providing a graphical representation of navigation with the usage statistics of the current user. The algorithm learns new tasks while preserving prior knowledge. The theoretical framework is described, and simulations show the effectiveness of the approach in stationary and non-stationary environments. In conclusion, this research paves the way for adaptive systems that improve the user experience by helping them to better navigate and act on their interface. The reasons given include that it would be too oriented toward machine learning to speak to a community of HCI researchers and not concrete enough, as well as other reasons that we largely dispute. In light of the comments from the two reviewers, it appears that our non-parametric Bayesian approach was not understood, nor the crucial issue of "sequential, continuous and robust learning" for the design of adaptive user interfaces. 1 INTRODUCTION Users are all different. Some have no particular constraints but have usage habits and preferences. Others, such as people with disabilities or seniors, may have, in addition to these habits, constraints when using a digital service. These constraints can be very diverse, of a perceptual nature (visual, auditory, tactile), of a motor nature (pointing, manipulation, speech) or cognitive (reasoning, memory, comprehension, reading...). Consequently, any service, any interface should be able to adjust to these constraints.
Functional effects models: Accounting for preference heterogeneity in panel data with machine learning
In this paper, we present a general specification for Functional Effects Models, which use Machine Learning (ML) methodologies to learn individual-specific preference parameters from socio-demographic characteristics, thereby accounting for inter-individual heterogeneity in panel choice data. We identify three specific advantages of the Functional Effects Model over traditional fixed and random/mixed effects models: (i) by mapping individual-specific effects as a function of socio-demographic variables, we can account for these effects when forecasting choices of previously unobserved individuals; (ii) the (approximate) maximum-likelihood estimation of functional effects avoids the incidental parameters problem of the fixed effects model, even when the number of observed choices per individual is small; and (iii) we do not rely on the strong distributional assumptions of the random effects model, which may not match reality. We learn functional intercepts and functional slopes with powerful non-linear machine learning regressors for tabular data, namely gradient boosting decision trees and deep neural networks. We validate our proposed methodology on a synthetic experiment and three real-world panel case studies, demonstrating that the Functional Effects Model: (i) can identify the true values of individual-specific effects when the data generation process is known; (ii) outperforms both state-of-the-art ML choice modelling techniques that omit individual heterogeneity, in terms of predictive performance, and traditional static panel choice models, in terms of learning inter-individual heterogeneity. The results indicate that the FI-RUMBoost model, which combines the individual-specific constants of the Functional Effects Model with the complex, non-linear utilities of RUMBoost, performs marginally best on large-scale revealed preference panel data.