Goto

Collaborating Authors

 Directed Networks


Gaussian processes with linear operator inequality constraints

arXiv.org Machine Learning

This paper presents an approach for constrained Gaussian Process (GP) regression where we assume that a set of linear transformations of the process are bounded. It is motivated by machine learning applications for high-consequence engineering systems, where this kind of information is often made available from phenomenological knowledge, and the resulting constraints may be essential to achieve the level of confidence needed. We consider a GP $f$ over functions on $\mathcal{X} \subset \mathbb{R}^{n}$ taking values in $\mathbb{R}$, where the process $\mathcal{L}f$ is still Gaussian when $\mathcal{L}$ is a linear operator. Our goal is to model $f$ under the constraint that realizations of $\mathcal{L}f$ are confined to a convex set of functions. In particular we require that $a \leq \mathcal{L}f \leq b$ given two functions $a$ and $b$ where $a < b$ pointwise. This formulation provides a consistent way of encoding multiple linear constraints, such as shape-constraints based on e.g. boundedness, monotonicity or convexity as a relevant example. We adopt the approach of using a sufficiently dense set of virtual observation locations where the constraint is required to hold, and derive the exact posterior for a conjugate likelihood. The results needed for stable numerical implementation are derived, together with an efficient sampling scheme for estimating the posterior process which is exact in the limit. A few numerical examples focusing on noiseless observations are given. This is relevant for computer code emulation and is also more computationally demanding than the alternative scenario with i.i.d. Gaussian noise.


Performance Analysis of Machine Learning Techniques to Predict Diabetes Mellitus

arXiv.org Machine Learning

Diabetes mellitus is a common disease of human body caused by a group of metabolic disorders where the sugar levels over a prolonged period is very high. It affects different organs of the human body which thus harm a large number of the body's system, in particular the blood veins and nerves. Early prediction in such disease can be controlled and save human life. To achieve the goal, this research work mainly explores various risk factors related to this disease using machine learning techniques. Machine learning techniques provide efficient result to extract knowledge by constructing predicting models from diagnostic medical datasets collected from the diabetic patients. Extracting knowledge from such data can be useful to predict diabetic patients. In this work, we employ four popular machine learning algorithms, namely Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbor (KNN) and C4.5 Decision Tree, on adult population data to predict diabetic mellitus. Our experimental results show that C4.5 decision tree achieved higher accuracy compared to other machine learning techniques.


What caused what? A quantitative account of actual causation using dynamical causal networks

arXiv.org Artificial Intelligence

Actual causation is concerned with the question "what caused what?" Consider a transition between two states within a system of interacting elements, such as an artificial neural network, or a biological brain circuit. Which combination of synapses caused the neuron to fire? Which image features caused the classifier to misinterpret the picture? Even detailed knowledge of the system's causal network, its elements, their states, connectivity, and dynamics does not automatically provide a straightforward answer to the "what caused what?" question. Counterfactual accounts of actual causation based on graphical models, paired with system interventions, have demonstrated initial success in addressing specific problem cases in line with intuitive causal judgments. Here, we start from a set of basic requirements for causation (realization, composition, information, integration, and exclusion) and develop a rigorous, quantitative account of actual causation that is generally applicable to discrete dynamical systems. We present a formal framework to evaluate these causal requirements that is based on system interventions and partitions, and considers all counterfactuals of a state transition. This framework is used to provide a complete causal account of the transition by identifying and quantifying the strength of all actual causes and effects linking the two consecutive system states. Finally, we examine several exemplary cases and paradoxes of causation and show that they can be illuminated by the proposed framework for quantifying actual causation.


Beyond the EM Algorithm: Constrained Optimization Methods for Latent Class Model

arXiv.org Machine Learning

Latent class model (LCM), which is a finite mixture of different categorical distributions, is one of the most widely used models in statistics and machine learning fields. Because of its non-continuous nature and the flexibility in shape, researchers in practice areas such as marketing and social sciences also frequently use LCM to gain insights from their data. One likelihood-based method, the Expectation-Maximization (EM) algorithm, is often used to obtain the model estimators. However, the EM algorithm is well-known for its notoriously slow convergence. In this research, we explore alternative likelihood-based methods that can potential remedy the slow convergence of the EM algorithm. More specifically, we regard likelihood-based approach as a constrained nonlinear optimization problem, and apply quasi-Newton type methods to solve them. We examine two different constrained optimization methods to maximize the log likelihood function. We present simulation study results to show that the proposed methods not only converge in less iterations than the EM algorithm but also produce more accurate model estimators.


Robust and Adaptive Planning under Model Uncertainty

arXiv.org Artificial Intelligence

Planning under model uncertainty is a fundamental problem across many applications of decision making and learning. In this paper, we propose the Robust Adaptive Monte Carlo Planning (RAMCP) algorithm, which allows computation of risk-sensitive Bayes-adaptive policies that optimally trade off exploration, exploitation, and robustness. RAMCP formulates the risk-sensitive planning problem as a two-player zero-sum game, in which an adversary perturbs the agent's belief over the models. We introduce two versions of the RAMCP algorithm. The first, RAMCP-F, converges to an optimal risk-sensitive policy without having to rebuild the search tree as the underlying belief over models is perturbed. The second version, RAMCP-I, improves computational efficiency at the cost of losing theoretical guarantees, but is shown to yield empirical results comparable to RAMCP-F. RAMCP is demonstrated on an n-pull multi-armed bandit problem, as well as a patient treatment scenario.


Computational Register Analysis and Synthesis

arXiv.org Artificial Intelligence

The study of register in computational language research has historically been divided into register analysis, seeking to determine the registerial character of a text or corpus, and register synthesis, seeking to generate a text in a desired register. This article surveys the different approaches to these disparate tasks. Register synthesis has tended to use more theoretically articulated notions of register and genre than analysis work, which often seeks to categorize on the basis of intuitive and somewhat incoherent notions of prelabeled 'text types'. I argue that an integration of computational register analysis and synthesis will benefit register studies as a whole, by enabling a new large-scale research program in register studies. It will enable comprehensive global mapping of functional language varieties in multiple languages, including the relationships between them. Furthermore, computational methods together with high coverage systematically collected and analyzed data will thus enable rigorous empirical validation and refinement of different theories of register, which will have also implications for our understanding of linguistic variation in general.


Presence-absence estimation in audio recordings of tropical frog communities

arXiv.org Machine Learning

One noninvasive way to study frog communities is by analyzing long-term samples of acoustic material containing calls. This immense task has been optimized by the development of Machine Learning tools to extract ecological information. We explored a likelihood-ratio audio detector based on Gaussian mixture model classification of 10 frog species, and applied it to estimate presence-absence in audio recordings from an actual amphibian monitoring performed at Yasun ı National Park in the Ecuadorian Amazonia. A modified filter-bank was used to extract 20 cepstral features that model the spectral content of frog calls. Experiments were carried out to investigate the hyperparameters and the minimum frog-call time needed to train an accurate GMM classifier. With 64 Gaussians and 12 seconds of training time, the classifier achieved an average weighted error rate of 0.9% on the 10-fold cross-validation for nine species classification, as compared to 3% with MFCC and 1.8% with PLP features. For testing, 10 GMMs were trained using all the available training-validation dataset to study 23.5 hours in 141, 10-minute long samples of unidentified real-world audio recorded at two frog communities in 2001 with analog equipment. To evaluate automatic presence-absence estimation, we characterized the audio samples with 10 binary variables each corresponding to a frog species, and manually labeled a subset of 18 samples using headphones. The one-vs-all Receiver Operating Characteristics curves were used to tune the likelihood-ratio detector per class in order to set operating points that minimize false positives while still allowing moderately noisy calls to be detected. A recall of 87.5% and precision of 100% with average accuracy of 96.66% suggests good generalization ability of the algorithm, and provides evidence of the validity of this approach Finally, we applied the algorithm to the available corpus, and show its potentiality to gain insights into the temporal reproductive behavior of frogs. Introduction In long term ecological studies, it is important to quantify changes that occur on biodiversity and the ecosystem as a whole. Large scale temporal and spatial studies to understand the natural and anthropogenic induced population dynamics are demanded by the scientific community. In addition, recent anuran population declines around the world have motivated studies to gain an understanding of the phenomenon [1].


SNRA: A Spintronic Neuromorphic Reconfigurable Array for In-Circuit Training and Evaluation of Deep Belief Networks

arXiv.org Machine Learning

Abstract--In this paper, a spintronic neuromorphic reconfigurable Array(SNRA) is developed to fuse together power-efficient probabilistic and infield programmable deterministic computing during both training and evaluation phases of restricted Boltzmann machines(RBMs). First, probabilistic spin logic devices are used to develop an RBM realization which is adapted to construct deep belief networks (DBNs) having one to three hidden layers of size 10 to 800 neurons each. The functionality of our proposed CD hardware implementation is validated using ModelSim simulations. We synthesize the developed Verilog HDL implementation of our proposed test/train control circuitry for various DBN topologies where the maximal RBM dimensions yield resource utilization ranging from 51 to 2,421 lookup tables (LUTs). Next, we leverage spin Hall effect (SHE)-magnetic tunnel junction (MTJ) based nonvolatile LUTs circuits as an alternative for static random access memory (SRAM)-based LUTs storing the deterministic logic configuration to form a reconfigurable fabric. Finally, we compare the performance of our proposed SNRA with SRAMbased configurablefabrics focusing on the area and power consumption induced by the LUTs used to implement both CD and evaluation modes. The results obtained indicate more than 80% reduction in combined dynamic and static power dissipation, while achieving at least 50% reduction in device count.


Tree Tensor Networks for Generative Modeling

arXiv.org Machine Learning

Matrix product states (MPS), a tensor network designed for one-dimensional quantum systems, has been recently proposed for generative modeling of natural data (such as images) in terms of `Born machine'. However, the exponential decay of correlation in MPS restricts its representation power heavily for modeling complex data such as natural images. In this work, we push forward the effort of applying tensor networks to machine learning by employing the Tree Tensor Network (TTN) which exhibits balanced performance in expressibility and efficient training and sampling. We design the tree tensor network to utilize the 2-dimensional prior of the natural images and develop sweeping learning and sampling algorithms which can be efficiently implemented utilizing Graphical Processing Units (GPU). We apply our model to random binary patterns and the binary MNIST datasets of handwritten digits. We show that TTN is superior to MPS for generative modeling in keeping correlation of pixels in natural images, as well as giving better log-likelihood scores in standard datasets of handwritten digits. We also compare its performance with state-of-the-art generative models such as the Variational AutoEncoders, Restricted Boltzmann machines, and PixelCNN. Finally, we discuss the future development of Tensor Network States in machine learning problems.


A Comprehensive guide to Bayesian Convolutional Neural Network with Variational Inference

arXiv.org Machine Learning

Artificial Neural Networks are connectionist systems that perform a given task by learning on examples without having prior knowledge about the task. This is done by finding an optimal point estimate for the weights in every node. Generally, the network using point estimates as weights perform well with large datasets, but they fail to express uncertainty in regions with little or no data, leading to overconfident decisions. In this paper, Bayesian Convolutional Neural Network (BayesCNN) using Variational Inference is proposed, that introduces probability distribution over the weights. Furthermore, the proposed BayesCNN architecture is applied to tasks like Image Classification, Image Super-Resolution and Generative Adversarial Networks. The results are compared to point-estimates based architectures on MNIST, CIFAR-10 and CIFAR-100 datasets for Image CLassification task, on BSD300 dataset for Image Super Resolution task and on CIFAR10 dataset again for Generative Adversarial Network task. BayesCNN is based on Bayes by Backprop which derives a variational approximation to the true posterior. We, therefore, introduce the idea of applying two convolutional operations, one for the mean and one for the variance. Our proposed method not only achieves performances equivalent to frequentist inference in identical architectures but also incorporate a measurement for uncertainties and regularisation. It further eliminates the use of dropout in the model. Moreover, we predict how certain the model prediction is based on the epistemic and aleatoric uncertainties and empirically show how the uncertainty can decrease, allowing the decisions made by the network to become more deterministic as the training accuracy increases. Finally, we propose ways to prune the Bayesian architecture and to make it more computational and time effective.