Goto

Collaborating Authors

Bayesian Learning: Overviews


Amortized Bayesian model comparison with evidential deep learning

arXiv.org Machine Learning

Comparing competing mathematical models of complex natural processes is a shared goal among many branches of science. The Bayesian probabilistic framework offers a principled way to perform model comparison and extract useful metrics for guiding decisions. However, many interesting models are intractable with standard Bayesian methods, as they lack a closed-form likelihood function or the likelihood is computationally too expensive to evaluate. With this work, we propose a novel method for performing Bayesian model comparison using specialized deep learning architectures. Our method is purely simulation-based and circumvents the step of explicitly fitting all alternative models under consideration to each observed dataset. Moreover, it involves no hand-crafted summary statistics of the data and is designed to amortize the cost of simulation over multiple models and observable datasets. This makes the method applicable in scenarios where model fit needs to be assessed for a large number of datasets, so that per-dataset inference is practically infeasible. Finally, we propose a novel way to measure epistemic uncertainty in model comparison problems. We argue that this measure of epistemic uncertainty provides a unique proxy to quantify absolute evidence even in a framework which assumes that the true data-generating model is within a finite set of candidate models.


Moment-Based Domain Adaptation: Learning Bounds and Algorithms

arXiv.org Machine Learning

This thesis contributes to the mathematical foundation of domain adaptation as emerging field in machine learning. In contrast to classical statistical learning, the framework of domain adaptation takes into account deviations between probability distributions in the training and application setting. Domain adaptation applies for a wider range of applications as future samples often follow a distribution that differs from the ones of the training samples. A decisive point is the generality of the assumptions about the similarity of the distributions. Therefore, in this thesis we study domain adaptation problems under as weak similarity assumptions as can be modelled by finitely many moments.


Machine Learning Based Solutions for Security of Internet of Things (IoT): A Survey

arXiv.org Machine Learning

Over the last decade, IoT platforms have been developed into a global giant that grabs every aspect of our daily lives by advancing human life with its unaccountable smart services. Because of easy accessibility and fast-growing demand for smart devices and network, IoT is now facing more security challenges than ever before. There are existing security measures that can be applied to protect IoT. However, traditional techniques are not as efficient with the advancement booms as well as different attack types and their severeness. Thus, a strong-dynamically enhanced and up to date security system is required for next-generation IoT system. A huge technological advancement has been noticed in Machine Learning (ML) which has opened many possible research windows to address ongoing and future challenges in IoT. In order to detect attacks and identify abnormal behaviors of smart devices and networks, ML is being utilized as a powerful technology to fulfill this purpose. In this survey paper, the architecture of IoT is discussed, following a comprehensive literature review on ML approaches the importance of security of IoT in terms of different types of possible attacks. Moreover, ML-based potential solutions for IoT security has been presented and future challenges are discussed.


Natural language processing for word sense disambiguation and information extraction

arXiv.org Artificial Intelligence

This research work deals with Natural Language Processing (NLP) and extraction of essential information in an explicit form. The most common among the information management strategies is Document Retrieval (DR) and Information Filtering. DR systems may work as combine harvesters, which bring back useful material from the vast fields of raw material. With large amount of potentially useful information in hand, an Information Extraction (IE) system can then transform the raw material by refining and reducing it to a germ of original text. A Document Retrieval system collects the relevant documents carrying the required information, from the repository of texts. An IE system then transforms them into information that is more readily digested and analyzed. It isolates relevant text fragments, extracts relevant information from the fragments, and then arranges together the targeted information in a coherent framework. The thesis presents a new approach for Word Sense Disambiguation using thesaurus. The illustrative examples supports the effectiveness of this approach for speedy and effective disambiguation. A Document Retrieval method, based on Fuzzy Logic has been described and its application is illustrated. A question-answering system describes the operation of information extraction from the retrieved text documents. The process of information extraction for answering a query is considerably simplified by using a Structured Description Language (SDL) which is based on cardinals of queries in the form of who, what, when, where and why. The thesis concludes with the presentation of a novel strategy based on Dempster-Shafer theory of evidential reasoning, for document retrieval and information extraction. This strategy permits relaxation of many limitations, which are inherent in Bayesian probabilistic approach.


Sum-product networks: A survey

arXiv.org Artificial Intelligence

A sum-product network (SPN) is a probabilistic model, based on a rooted acyclic directed graph, in which terminal nodes represent univariate probability distributions and non-terminal nodes represent convex combinations (weighted sums) and products of probability functions. They are closely related to probabilistic graphical models, in particular to Bayesian networks with multiple context-specific independencies. Their main advantage is the possibility of building tractable models from data, i.e., models that can perform several inference tasks in time proportional to the number of links in the graph. They are somewhat similar to neural networks and can address the same kinds of problems, such as image processing and natural language understanding. This paper offers a survey of SPNs, including their definition, the main algorithms for inference and learning from data, the main applications, a brief review of software libraries, and a comparison with related models


A Survey on Edge Intelligence

arXiv.org Artificial Intelligence

Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence. The aim of edge intelligence is to enhance the quality and speed of data processing and protect the privacy and security of the data. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey on the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey article provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of the emerging research field and the current state-of-the-art and discuss the important open issues and possible theoretical and technical solutions.


Bayesian Sparsification Methods for Deep Complex-valued Networks

arXiv.org Machine Learning

With continual miniaturization ever more applications of deep learning can be found in embedded systems, where it is common to encounter data with natural complex domain representation. To this end we extend Sparse Variational Dropout to complex-valued neural networks and verify the proposed Bayesian technique by conducting a large numerical study of the performance-compression trade-off of C-valued networks on two tasks: image recognition on MNIST-like and CIFAR10 datasets and music transcription on MusicNet. We replicate the state-of-the-art result by Trabelsi et al. [2018] on MusicNet with a complex-valued network compressed by 50-100x at a small performance penalty.


Julia Language in Machine Learning: Algorithms, Applications, and Open Issues

arXiv.org Machine Learning

Machine learning is driving development across many fields in science and engineering. A simple and efficient programming language could accelerate applications of machine learning in various fields. Currently, the programming languages most commonly used to develop machine learning algorithms include Python, MATLAB, and C/C ++. However, none of these languages well balance both efficiency and simplicity. The Julia language is a fast, easy-to-use, and open-source programming language that was originally designed for high-performance computing, which can well balance the efficiency and simplicity. This paper summarizes the related research work and developments in the application of the Julia language in machine learning. It first surveys the popular machine learning algorithms that are developed in the Julia language. Then, it investigates applications of the machine learning algorithms implemented with the Julia language. Finally, it discusses the open issues and the potential future directions that arise in the use of the Julia language in machine learning.


On Information Plane Analyses of Neural Network Classifiers -- A Review

arXiv.org Machine Learning

We review the current literature concerned with information plane analyses of neural network classifiers. While the underlying information bottleneck theory and the claim that information-theoretic compression is causally linked to generalization are plausible, empirical evidence was found to be both supporting and conflicting. We review this evidence together with a detailed analysis how the respective information quantities were estimated. Our analysis suggests that compression visualized in information planes is not information-theoretic, but is rather compatible with geometric compression of the activations.


Redistribution Systems and PRAM

arXiv.org Artificial Intelligence

Redistribution systems iteratively redistribute mass between groups under the control of rules. PRAM is a framework for building redistribution systems. We discuss the relationships between redistribution systems, agent-based systems, compartmental models and Bayesian models. PRAM puts agent-based models on a sound probabilistic footing by reformulating them as redistribution systems. This provides a basis for integrating agent-based and probabilistic models. \pram/ extends the themes of probabilistic relational models and lifted inference to incorporate dynamical models and simulation. We illustrate PRAM with an epidemiological example.