Bayesian Learning
Sequential Inference of Hospitalization Electronic Health Records Using Probabilistic Models
Kaplan, Alan D., Ray, Priyadip, Greene, John D., Liu, Vincent X.
In the dynamic hospital setting, decision support can be a valuable tool for improving patient outcomes. Data-driven inference of future outcomes is challenging in this dynamic setting, where long sequences such as laboratory tests and medications are updated frequently. This is due in part to heterogeneity of data types and mixed-sequence types contained in variable-length sequences. In this work we design a probabilistic unsupervised model for multiple arbitrary-length sequences contained in hospitalization Electronic Health Record (EHR) data. The model uses a latent variable structure and captures complex relationships between medications, diagnoses, laboratory tests, and neurological assessments. It can be trained on original data, without requiring any lossy transformations or time binning. Inference algorithms are derived that use partial data to infer properties of the complete sequences, including their length and presence of specific values. We train this model on data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The results are evaluated against held-out data for predicting the length of sequences and presence of Intensive Care Unit (ICU) in hospitalization bed sequences. Our method outperforms a baseline approach, showing that in these experiments the trained model captures information in the sequences that is informative of their future values.
Deep Learning for Accelerated and Robust MRI Reconstruction: a Review
Heckel, Reinhard, Jacob, Mathews, Chaudhari, Akshay, Perlman, Or, Shimron, Efrat
Deep learning (DL) has recently emerged as a pivotal technology for enhancing magnetic resonance imaging (MRI), a critical tool in diagnostic radiology. This review paper provides a comprehensive overview of recent advances in DL for MRI reconstruction. It focuses on DL approaches and architectures designed to improve image quality, accelerate scans, and address data-related challenges. These include end-to-end neural networks, pre-trained networks, generative models, and self-supervised methods. The paper also discusses the role of DL in optimizing acquisition protocols, enhancing robustness against distribution shifts, and tackling subtle bias. Drawing on the extensive literature and practical insights, it outlines current successes, limitations, and future directions for leveraging DL in MRI reconstruction, while emphasizing the potential of DL to significantly impact clinical imaging practices.
Score matching for sub-Riemannian bridge sampling
Grong, Erlend, Habermann, Karen, Sommer, Stefan
Simulation of conditioned diffusion processes is an essential tool in inference for stochastic processes, data imputation, generative modelling, and geometric statistics. Whilst simulating diffusion bridge processes is already difficult on Euclidean spaces, when considering diffusion processes on Riemannian manifolds the geometry brings in further complications. In even higher generality, advancing from Riemannian to sub-Riemannian geometries introduces hypoellipticity, and the possibility of finding appropriate explicit approximations for the score of the diffusion process is removed. We handle these challenges and construct a method for bridge simulation on sub-Riemannian manifolds by demonstrating how recent progress in machine learning can be modified to allow for training of score approximators on sub-Riemannian manifolds. Since gradients depend on the horizontal distribution, we generalise the usual notion of denoising loss to work with non-holonomic frames using a stochastic Taylor expansion, and we demonstrate the resulting scheme both explicitly on the Heisenberg group and more generally using adapted coordinates. We perform numerical experiments exemplifying samples from the bridge process on the Heisenberg group and the concentration of this process for small time.
Machine Learning Applied to the Detection of Mycotoxin in Food: A Review
Inglis, Alan, Parnell, Andrew, Subramani, Natarajan, Doohan, Fiona
Mycotoxins are a group of naturally occurring, toxic chemical compounds produced by certain species of moulds (fungi) during growth on various crops and foodstuffs, including cereals, nuts, spices and dairy products (The World Health Organization (WHO), 2023). The ingestion of certain mycotoxins has been linked to a range of harmful health impacts on both humans and animals, from short-term poisoning to long-term consequences such as liver cancer, and in some cases, death (Mavrommatis et al., 2021; Marroquín-Cardona et al., 2014; Liu and Wu, 2010). Mycotoxins are secondary metabolites (that is, compounds produced by an organism that are not essential for its primary life processes) and are often produced during the pre-harvest, harvest, and storage phases under favourable conditions of humidity and temperature (Marroquín-Cardona et al., 2014; Van der Fels-Klerx et al., 2022). The most prevalent mycotoxins include aflatoxins, trichothecenes, fumonisins, zearalenones, ochratoxins and patulin, and are produced by certain plant-pathogenic species of Aspergillus, Fusarium, and Penicillium (Tola and Kebede, 2016). Mycotoxin contamination in crop products has been found to vary significantly across different geographical locations and is influenced by annual weather conditions (Logrieco et al., 2021; Leggieri et al., 2020).
A Comparison of Traditional and Deep Learning Methods for Parameter Estimation of the Ornstein-Uhlenbeck Process
We consider the Ornstein-Uhlenbeck (OU) process, a stochastic process widely used in finance, physics, and biology. Parameter estimation of the OU process is a challenging problem. Thus, we review traditional tracking methods and compare them with novel applications of deep learning to estimate the parameters of the OU process. We use a multi-layer perceptron to estimate the parameters of the OU process and compare its performance with traditional parameter estimation methods, such as the Kalman filter and maximum likelihood estimation. We find that the multi-layer perceptron can accurately estimate the parameters of the OU process given a large dataset of observed trajectories and, on average, outperforms traditional parameter estimation methods.
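As a rough illustration of the maximum likelihood baseline mentioned above, the following sketch fits an OU process dX = theta (mu - X) dt + sigma dW to one evenly sampled path. It is not the authors' implementation: the parameter names (theta, mu, sigma), the helper names simulate_ou and fit_ou_mle, and the simulated data are all assumptions made for this example. It relies on the fact that the exact OU transition over a step dt is a Gaussian AR(1) model, so the conditional MLE reduces to a least-squares regression of each observation on the previous one.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, dt, n_steps, rng):
    """Simulate an OU path using its exact Gaussian AR(1) transition."""
    a = np.exp(-theta * dt)                               # one-step autoregression coefficient
    sd = sigma * np.sqrt((1.0 - a**2) / (2.0 * theta))    # one-step noise standard deviation
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = mu + a * (x[k] - mu) + sd * rng.standard_normal()
    return x

def fit_ou_mle(x, dt):
    """Conditional MLE of (theta, mu, sigma) from an evenly sampled path.

    Assumes the fitted slope lies strictly between 0 and 1, which holds for
    a mean-reverting path sampled finely enough.
    """
    x_prev, x_next = x[:-1], x[1:]
    a, b = np.polyfit(x_prev, x_next, 1)                  # slope a, intercept b
    resid = x_next - (a * x_prev + b)
    theta = -np.log(a) / dt                               # invert a = exp(-theta * dt)
    mu = b / (1.0 - a)                                    # invert b = mu * (1 - a)
    sigma = np.sqrt(resid.var() * 2.0 * theta / (1.0 - a**2))
    return theta, mu, sigma

# Hypothetical ground-truth values chosen only for this demonstration.
rng = np.random.default_rng(0)
path = simulate_ou(theta=2.0, mu=1.0, sigma=0.5, x0=1.0, dt=0.01, n_steps=100_000, rng=rng)
theta_hat, mu_hat, sigma_hat = fit_ou_mle(path, dt=0.01)
print(theta_hat, mu_hat, sigma_hat)
```

With a long path the three estimates should land close to the simulated values; on short paths the estimate of theta in particular becomes noisy, which is one reason the estimation problem is regarded as difficult.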
Nested Inheritance Dynamics
The idea of the inheritance of biological processes, such as the developmental process or the life cycle of an organism, has been discussed in the biology literature, but formal mathematical descriptions and plausible data analysis frameworks are lacking. We introduce an extension of the nested Dirichlet Process (nDP) to a multiscale model to aid in understanding the mechanisms by which biological processes are inherited, remain stable, and are modified across generations. To address these issues, we introduce the Nested Inheritance Dynamics Algorithm (NIDA). At its primary level, NIDA encompasses all processes unfolding within an individual organism's lifespan. The secondary level delineates the dynamics through which these processes evolve or remain stable over time. This framework allows for the specification of a physical system model at either scale, thus promoting seamless integration with established models of development and heredity.
Manipulating Recommender Systems: A Survey of Poisoning Attacks and Countermeasures
Nguyen, Thanh Toan, Nguyen, Quoc Viet Hung, Nguyen, Thanh Tam, Huynh, Thanh Trung, Nguyen, Thanh Thi, Weidlich, Matthias, Yin, Hongzhi
Recommender systems have become an integral part of online services to help users locate specific information in a sea of data. However, existing studies show that some recommender systems are vulnerable to poisoning attacks, particularly those that involve learning schemes. In a poisoning attack, an adversary injects carefully crafted data into the process of training a model, with the goal of manipulating the system's final recommendations. With recent advancements in artificial intelligence, such attacks have gained importance. While numerous countermeasures to poisoning attacks have been developed, they have not yet been systematically linked to the properties of the attacks. Consequently, assessing the respective risks and potential success of mitigation strategies is difficult, if not impossible. This survey aims to fill this gap by primarily focusing on poisoning attacks and their countermeasures. This is in contrast to prior surveys that mainly focus on attacks and their detection methods. Through an exhaustive literature review, we provide a novel taxonomy for poisoning attacks, formalise its dimensions, and accordingly organise 30+ attacks described in the literature. Further, we review 40+ countermeasures to detect and/or prevent poisoning attacks, evaluating their effectiveness against specific types of attacks. This comprehensive survey should serve as a point of reference for protecting recommender systems against poisoning attacks. The article concludes with a discussion on open issues in the field and impactful directions for future research. A rich repository of resources associated with poisoning attacks is available at https://github.com/tamlhp/awesome-recsys-poisoning.
Dynamic pricing with Bayesian updates from online reviews
Correa, José, Mari, Mathieu, Xia, Andrew
As a key part of modern online platforms, online decision-making plays a crucial role in a variety of settings, particularly related to the Internet. Two landmark examples that have been widely studied are dynamic pricing and online reviews. Online review systems constitute powerful platforms for users to get informed about the product and for the firm to understand how a given market is receiving the product. The study of these systems has been vast for the last two decades [6, 10], and more recently, modeling simple like/dislike reviews as bandit problems has become standard [1, 2, 3, 13, 16, 18]. Dynamic pricing, on the other hand, is an active area of research in economics, computer science, and operations research [12, 14], and has become a common practice in several industries such as transportation and retail. There has been a growing interest in combining the two areas as a way to design more effective pricing mechanisms that gather information from current reviews to update prices and make the product more attractive [5, 11, 17]. In particular, [5] considers social learning with non-Bayesian agents in a market with like/dislike reviews, and the resulting pricing decision of a monopolist.
MDDD: Manifold-based Domain Adaptation with Dynamic Distribution for Non-Deep Transfer Learning in Cross-subject and Cross-session EEG-based Emotion Recognition
Luo, Ting, Zhang, Jing, Qiu, Yingwei, Zhang, Li, Hu, Yaohua, Yu, Zhuliang, Liang, Zhen
Emotion decoding using Electroencephalography (EEG)-based affective brain-computer interfaces (aBCIs) represents a significant area within the field of affective computing. In the present study, we propose a novel non-deep transfer learning method, termed Manifold-based Domain adaptation with Dynamic Distribution (MDDD). The proposed MDDD includes four main modules: manifold feature transformation, dynamic distribution alignment, classifier learning, and ensemble learning. The data are transformed onto an optimal Grassmann manifold space, enabling dynamic alignment of the source and target domains. This process prioritizes both marginal and conditional distributions according to their significance, ensuring enhanced adaptation efficiency across various types of data. In the classifier learning, the principle of structural risk minimization is integrated to develop robust classification models. This is complemented by dynamic distribution alignment, which refines the classifier iteratively. Additionally, the ensemble learning module aggregates the classifiers obtained at different stages of the optimization process, which leverages the diversity of the classifiers to enhance the overall prediction accuracy. The experimental results indicate that MDDD outperforms traditional non-deep learning methods, achieving an average improvement of 3.54%, and is comparable to deep learning methods. This suggests that MDDD could be a promising method for enhancing the utility and applicability of aBCIs in real-world scenarios.
Naïve Bayes and Random Forest for Crop Yield Prediction
Maazallahi, Abbas, Thota, Sreehari, Kondaboina, Naga Prasad, Muktineni, Vineetha, Annem, Deepthi, Rokkam, Abhi Stephen, Amini, Mohammad Hossein, Salari, Mohammad Amir, Norouzzadeh, Payam, Snir, Eli, Rahmani, Bahareh
This study analyzes crop yield prediction in India from 1997 to 2020, focusing on various crops and key environmental factors. It aims to predict agricultural yields by utilizing advanced machine learning techniques like Linear Regression, Decision Tree, KNN, Naïve Bayes, K-Means Clustering, and Random Forest. The models, particularly Naïve Bayes and Random Forest, demonstrate high effectiveness, as shown through data visualizations. The research concludes that integrating these analytical methods significantly enhances the accuracy and reliability of crop yield predictions, offering vital contributions to agricultural data science.