Goto

Collaborating Authors

 Bayesian Learning


Pearl's and Jeffrey's Update as Modes of Learning in Probabilistic Programming

arXiv.org Artificial Intelligence

The concept of updating a probability distribution in the light of new evidence lies at the heart of statistics and machine learning. Pearl's and Jeffrey's rule are two natural update mechanisms which lead to different outcomes, yet the similarities and differences remain mysterious. This paper clarifies their relationship in several ways: via separate descriptions of the two update mechanisms in terms of probabilistic programs and sampling semantics, and via different notions of likelihood (for Pearl and for Jeffrey). Moreover, it is shown that Jeffrey's update rule arises via variational inference. In terms of categorical probability theory, this amounts to an analysis of the situation in terms of the behaviour of the multiset functor, extended to the Kleisli category of the distribution monad.


Learning Multiscale Non-stationary Causal Structures

arXiv.org Machine Learning

This paper addresses a gap in the current state of the art by providing a solution for modeling causal relationships that evolve over time and occur at different time scales. Specifically, we introduce the multiscale non-stationary directed acyclic graph (MN-DAG), a framework for modeling multivariate time series data. Our contribution is twofold. Firstly, we expose a probabilistic generative model by leveraging results from spectral and causality theories. Our model allows sampling an MN-DAG according to user-specified priors on the time-dependence and multiscale properties of the causal graph. Secondly, we devise a Bayesian method named Multiscale Non-stationary Causal Structure Learner (MN-CASTLE) that uses stochastic variational inference to estimate MN-DAGs. The method also exploits information from the local partial correlation between time series over different time resolutions. The data generated from an MN-DAG reproduces well-known features of time series in different domains, such as volatility clustering and serial correlation. Additionally, we show the superior performance of MN-CASTLE on synthetic data with different multiscale and non-stationary properties compared to baseline models. Finally, we apply MN-CASTLE to identify the drivers of the natural gas prices in the US market. Causal relationships have strengthened during the COVID-19 outbreak and the Russian invasion of Ukraine, a fact that baseline methods fail to capture. MN-CASTLE identifies the causal impact of critical economic drivers on natural gas prices, such as seasonal factors, economic uncertainty, oil prices, and gas storage deviations.


Designing Reconfigurable Intelligent Systems with Markov Blankets

arXiv.org Artificial Intelligence

Compute Continuum (CC) systems comprise a vast number of devices distributed over computational tiers. Evaluating business requirements, i.e., Service Level Objectives (SLOs), requires collecting data from all those devices; if SLOs are violated, devices must be reconfigured to ensure correct operation. If done centrally, this dramatically increases the number of devices and variables that must be considered, while creating an enormous communication overhead. To address this, we (1) introduce a causality filter based on Markov blankets (MB) that limits the number of variables that each device must track, (2) evaluate SLOs decentralized on a device basis, and (3) infer optimal device configuration for fulfilling SLOs. We evaluated our methodology by analyzing video stream transformations and providing device configurations that ensure the Quality of Service (QoS). The devices thus perceived their environment and acted accordingly -- a form of decentralized intelligence.


Dates Fruit Disease Recognition using Machine Learning

arXiv.org Artificial Intelligence

Many countries such as Saudi Arabia, Morocco and Tunisia are among the top exporters and consumers of palm date fruits. Date fruit production plays a major role in the economies of the date fruit exporting countries. Date fruits are susceptible to disease just like any fruit and early detection and intervention can end up saving the produce. However, with the vast farming lands, it is nearly impossible for farmers to observe date trees on a frequent basis for early disease detection. In addition, even with human observation the process is prone to human error and increases the date fruit cost. With the recent advances in computer vision, machine learning, drone technology, and other technologies; an integrated solution can be proposed for the automatic detection of date fruit disease. In this paper, a hybrid features based method with the standard classifiers is proposed based on the extraction of L*a*b color features, statistical features, and Discrete Wavelet Transform (DWT) texture features for the early detection and classification of date fruit disease. A dataset was developed for this work consisting of 871 images divided into the following classes; Healthy date, Initial stage of disease, Malnourished date, and Parasite infected. The extracted features were input to common classifiers such as the Random Forest (RF), Multilayer Perceptron (MLP), Na\"ive Bayes (NB), and Fuzzy Decision Trees (FDT). The highest average accuracy was achieved when combining the L*a*b, Statistical, and DWT Features.


An Empirical Bayes Framework for Open-Domain Dialogue Generation

arXiv.org Artificial Intelligence

To engage human users in meaningful conversation, open-domain dialogue agents are required to generate diverse and contextually coherent dialogue. Despite recent advancements, which can be attributed to the usage of pretrained language models, the generation of diverse and coherent dialogue remains an open research problem. A popular approach to address this issue involves the adaptation of variational frameworks. However, while these approaches successfully improve diversity, they tend to compromise on contextual coherence. Hence, we propose the Bayesian Open-domain Dialogue with Empirical Bayes (BODEB) framework, an empirical bayes framework for constructing an Bayesian open-domain dialogue agent by leveraging pretrained parameters to inform the prior and posterior parameter distributions. Empirical results show that BODEB achieves better results in terms of both diversity and coherence compared to variational frameworks.


Cognitive bias in large language models: Cautious optimism meets anti-Panglossian meliorism

arXiv.org Artificial Intelligence

The recent success of large language models gives new urgency to the question of how model performance should be evaluated. In many tasks, models can be evaluated for the accuracy of their outputs. However, models can also be evaluated along other important dimensions. For example, we can assess models for the transparency or interpretability of their judgments (Creel 2020; Vredenburgh 2022). We can also assess models for the presence of problematic biases (Kelly 2023; Johnson 2020). Most work on biases in large language models focuses on a conception of bias closely tied to unfairness, especially as affecting marginalized social groups. However, recent work has alleged that large language models also show a number of classic cognitive biases familiar from work in the psychology of reasoning, behavioral economics, and judgment and decisionmaking (Dasgupta et al. 2022; Lin and Ng 2023; Jones and Steinhardt 2022). This development is exciting because it raises the possibility of using cognitive bias as a novel metric by which to evaluate the performance of large language models.


Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations

arXiv.org Artificial Intelligence

In the growing world of artificial intelligence, federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data. Federated learning lays the groundwork for collaborative research in areas where the data is sensitive. Federated learning has several implications for real-world problems. In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data. This distributed approach enables us to leverage information from multiple sources and gain more diverse insights. This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Specifically, we have presented an extensive review of supervised/unsupervised machine learning algorithms, ensemble methods, meta-heuristic approaches, blockchain technology, and reinforcement learning used in the framework of federated learning, in addition to an overview of federated learning applications. This paper reviews the literature on the components of federated learning and its applications in the last few years. The main purpose of this work is to provide researchers and practitioners with a comprehensive overview of federated learning from the machine learning point of view. A discussion of some open problems and future research directions in federated learning is also provided.


Adaptive Modelling Approach for Row-Type Dependent Predictive Analysis (RTDPA): A Framework for Designing Machine Learning Models for Credit Risk Analysis in Banking Sector

arXiv.org Artificial Intelligence

In many real-world datasets, rows may have distinct characteristics and require different modeling approaches for accurate predictions. In this paper, we propose an adaptive modeling approach for row-type dependent predictive analysis(RTDPA). Our framework enables the development of models that can effectively handle diverse row types within a single dataset. Our dataset from XXX bank contains two different risk categories, personal loan and agriculture loan. each of them are categorised into four classes standard, sub-standard, doubtful and loss. We performed tailored data pre processing and feature engineering to different row types. We selected traditional machine learning predictive models and advanced ensemble techniques. Our findings indicate that all predictive approaches consistently achieve a precision rate of no less than 90%. For RTDPA, the algorithms are applied separately for each row type, allowing the models to capture the specific patterns and characteristics of each row type. This approach enables targeted predictions based on the row type, providing a more accurate and tailored classification for the given dataset.Additionally, the suggested model consistently offers decision makers valuable and enduring insights that are strategic in nature in banking sector.


Fuse It or Lose It: Deep Fusion for Multimodal Simulation-Based Inference

arXiv.org Artificial Intelligence

We present multimodal neural posterior estimation (MultiNPE), a method to integrate heterogeneous data from different sources in simulation-based inference with neural networks. Inspired by advances in attention-based deep fusion learning, it empowers researchers to analyze data from different domains and infer the parameters of complex mathematical models with increased accuracy. We formulate different multimodal fusion approaches for MultiNPE (early, late, and hybrid) and evaluate their performance in three challenging numerical experiments. MultiNPE not only outperforms na\"ive baselines on a benchmark model, but also achieves superior inference on representative scientific models from neuroscience and cardiology. In addition, we systematically investigate the impact of partially missing data on the different fusion strategies. Across our different experiments, late and hybrid fusion techniques emerge as the methods of choice for practical applications of multimodal simulation-based inference.


Predicting the Probability of Collision of a Satellite with Space Debris: A Bayesian Machine Learning Approach

arXiv.org Artificial Intelligence

Space is becoming more crowded in Low Earth Orbit due to increased space activity. Such a dense space environment increases the risk of collisions between space objects endangering the whole space population. Therefore, the need to consider collision avoidance as part of routine operations is evident to satellite operators. Current procedures rely on the analysis of multiple collision warnings by human analysts. However, with the continuous growth of the space population, this manual approach may become unfeasible, highlighting the importance of automation in risk assessment. In 2019, ESA launched a competition to study the feasibility of applying machine learning in collision risk estimation and released a dataset that contained sequences of Conjunction Data Messages (CDMs) in support of real close encounters. The competition results showed that the naive forecast and its variants are strong predictors for this problem, which suggests that the CDMs may follow the Markov property. The proposed work investigates this theory by benchmarking Hidden Markov Models (HMM) in predicting the risk of collision between two resident space objects by using one feature of the entire dataset: the sequence of the probability in the CDMs. In addition, Bayesian statistics are used to infer a joint distribution for the parameters of the models, which allows the development of robust and reliable probabilistic predictive models that can incorporate physical or prior knowledge about the problem within a rigorous theoretical framework and provides prediction uncertainties that nicely reflect the accuracy of the predicted risk. This work shows that the implemented HMM outperforms the naive solution in some metrics, which further adds to the idea that the collision warnings may be Markovian and suggests that this is a powerful method to be further explored.