Goto

Collaborating Authors

 Learning Graphical Models


5 things you need to know about A.I.: Cognitive, neural and deep, oh my!

#artificialintelligence

There's never any shortage of buzzwords in the IT world, but when it comes to A.I., they can be hard to tell apart. Here are five things you need to understand. Artificial intelligence refers to "a broad set of methods, algorithms and technologies that make software'smart' in a way that may seem human-like to an outside observer," said Lynne Parker, director of the division of Information and Intelligent Systems for the National Science Foundation. Machine learning, computer vision, natural language processing, robotics and related topics are all part of A.I., in other words. "Some people may come up with distinctions between the two, but there is not a universal view that the two terms mean anything different," Parker said.


Kernel Approximation Methods for Speech Recognition

arXiv.org Machine Learning

We study large-scale kernel methods for acoustic modeling in speech recognition and compare their performance to deep neural networks (DNNs). We perform experiments on four speech recognition datasets, including the TIMIT and Broadcast News benchmark tasks, and compare these two types of models on frame-level performance metrics (accuracy, cross-entropy), as well as on recognition metrics (word/character error rate). In order to scale kernel methods to these large datasets, we use the random Fourier feature method of Rahimi and Recht (2007). We propose two novel techniques for improving the performance of kernel acoustic models. First, in order to reduce the number of random features required by kernel models, we propose a simple but effective method for feature selection. The method is able to explore a large number of non-linear features while maintaining a compact model more efficiently than existing approaches. Second, we present a number of frame-level metrics which correlate very strongly with recognition performance when computed on the heldout set; we take advantage of these correlations by monitoring these metrics during training in order to decide when to stop learning. This technique can noticeably improve the recognition performance of both DNN and kernel models, while narrowing the gap between them. Additionally, we show that the linear bottleneck method of Sainath et al. (2013) improves the performance of our kernel models significantly, in addition to speeding up training and making the models more compact. Together, these three methods dramatically improve the performance of kernel acoustic models, making their performance comparable to DNNs on the tasks we explored.


Inferring Cognitive Models from Data using Approximate Bayesian Computation

arXiv.org Machine Learning

An important problem for HCI researchers is to estimate the parameter values of a cognitive model from behavioral data. This is a difficult problem, because of the substantial complexity and variety in human behavioral strategies. We report an investigation into a new approach using approximate Bayesian computation (ABC) to condition model parameters to data and prior knowledge. As the case study we examine menu interaction, where we have click time data only to infer a cognitive model that implements a search behaviour with parameters such as fixation duration and recall probability. Our results demonstrate that ABC (i) improves estimates of model parameter values, (ii) enables meaningful comparisons between model variants, and (iii) supports fitting models to individual users. ABC provides ample opportunities for theoretical HCI research by allowing principled inference of model parameter values and their uncertainty.


Approximation and inference methods for stochastic biochemical kinetics - a tutorial review

arXiv.org Machine Learning

Stochastic fluctuations of molecule numbers are ubiquitous in biological systems. Important examples include gene expression and enzymatic processes in living cells. Such systems are typically modelled as chemical reaction networks whose dynamics are governed by the Chemical Master Equation. Despite its simple structure, no analytic solutions to the Chemical Master Equation are known for most systems. Moreover, stochastic simulations are computationally expensive, making systematic analysis and statistical inference a challenging task. Consequently, significant effort has been spent in recent decades on the development of efficient approximation and inference methods. This article gives an introduction to basic modelling concepts as well as an overview of state of the art methods. First, we motivate and introduce deterministic and stochastic methods for modelling chemical networks, and give an overview of simulation and exact solution methods. Next, we discuss several approximation methods, including the chemical Langevin equation, the system size expansion, moment closure approximations, time-scale separation approximations and hybrid methods. We discuss their various properties and review recent advances and remaining challenges for these methods. We present a comparison of several of these methods by means of a numerical case study and highlight some of their respective advantages and disadvantages. Finally, we discuss the problem of inference from experimental data in the Bayesian framework and review recent methods developed the literature. In summary, this review gives a self-contained introduction to modelling, approximations and inference methods for stochastic chemical kinetics.


Manifold Alignment Determination: finding correspondences across different data views

arXiv.org Machine Learning

The approach is capable of learning correspondences between views as well as correspondences between individual data-points. The proposed method requires only a few aligned examples from which it is capable to recover a global alignment through a probabilistic model. The strong, yet flexible regularization provided by the generative model is sufficient to align the views. We provide experiments on both synthetic and real data to highlight the benefit of the proposed approach.


Relaxation of the EM Algorithm via Quantum Annealing for Gaussian Mixture Models

arXiv.org Machine Learning

We propose a modified expectation-maximization algorithm by introducing the concept of quantum annealing, which we call the deterministic quantum annealing expectation-maximization (DQAEM) algorithm. The expectation-maximization (EM) algorithm is an established algorithm to compute maximum likelihood estimates and applied to many practical applications. However, it is known that EM heavily depends on initial values and its estimates are sometimes trapped by local optima. To solve such a problem, quantum annealing (QA) was proposed as a novel optimization approach motivated by quantum mechanics. By employing QA, we then formulate DQAEM and present a theorem that supports its stability. Finally, we demonstrate numerical simulations to confirm its efficiency.


Bayesian Non-Homogeneous Markov Models via Polya-Gamma Data Augmentation with Applications to Rainfall Modeling

arXiv.org Machine Learning

Discrete-time hidden Markov models are a broadly useful class of latent-variable models with applications in areas such as speech recognition, bioinformatics, and climate data analysis. It is common in practice to introduce temporal non-homogeneity into such models by making the transition probabilities dependent on time-varying exogenous input variables via a multinomial logistic parametrization. We extend such models to introduce additional non-homogeneity into the emission distribution using a generalized linear model (GLM), with data augmentation for sampling-based inference. However, the presence of the logistic function in the state transition model significantly complicates parameter inference for the overall model, particularly in a Bayesian context. To address this we extend the recently-proposed Polya-Gamma data augmentation approach to handle non-homogeneous hidden Markov models (NHMMs), allowing the development of an efficient Markov chain Monte Carlo (MCMC) sampling scheme. We apply our model and inference scheme to 30 years of daily rainfall in India, leading to a number of insights into rainfall-related phenomena in the region. Our proposed approach allows for fully Bayesian analysis of relatively complex NHMMs on a scale that was not possible with previous methods. Software implementing the methods described in the paper is available via the R package NHMM.


Improving Sampling from Generative Autoencoders with Markov Chains

arXiv.org Machine Learning

We focus on generative autoencoders, such as variational or adversarial autoencoders, which jointly learn a generative model alongside an inference model. Generative autoencoders are those which are trained to softly enforce a prior on the latent distribution learned by the inference model. We call the distribution to which the inference model maps observed samples, the learned latent distribution, which may not be consistent with the prior. We formulate a Markov chain Monte Carlo (MCMC) sampling process, equivalent to iteratively decoding and encoding, which allows us to sample from the learned latent distribution. Since, the generative model learns to map from the learned latent distribution, rather than the prior, we may use MCMC to improve the quality of samples drawn from the generative model, especially when the learned latent distribution is far from the prior. Using MCMC sampling, we are able to reveal previously unseen differences between generative autoencoders trained either with or without a denoising criterion.


Bayesian System Identification based on Hierarchical Sparse Bayesian Learning and Gibbs Sampling with Application to Structural Damage Assessment

arXiv.org Machine Learning

The focus in this paper is Bayesian system identification based on noisy incomplete modal data where we can impose spatially-sparse stiffness changes when updating a structural model. To this end, based on a similar hierarchical sparse Bayesian learning model from our previous work, we propose two Gibbs sampling algorithms. The algorithms differ in their strategies to deal with the posterior uncertainty of the equation-error precision parameter, but both sample from the conditional posterior probability density functions (PDFs) for the structural stiffness parameters and system modal parameters. The effective dimension for the Gibbs sampling is low because iterative sampling is done from only three conditional posterior PDFs that correspond to three parameter groups, along with sampling of the equation-error precision parameter from another conditional posterior PDF in one of the algorithms where it is not integrated out as a "nuisance" parameter. A nice feature from a computational perspective is that it is not necessary to solve a nonlinear eigenvalue problem of a structural model. The effectiveness and robustness of the proposed algorithms are illustrated by applying them to the IASE-ASCE Phase II simulated and experimental benchmark studies. The goal is to use incomplete modal data identified before and after possible damage to detect and assess spatially-sparse stiffness reductions induced by any damage. Our past and current focus on meeting challenges arising from Bayesian inference of structural stiffness serve to strengthen the capability of vibration-based structural system identification but our methods also have much broader applicability for inverse problems in science and technology where system matrices are to be inferred from noisy partial information about their eigenquantities.


Machine Learning Algorithms for Business Applications - Complete Guide -

#artificialintelligence

With the development of free, open-source machine learning and artificial intelligence tools like Google's TensorFlow and sci-kit learn, as well as "ML-as-a-service" products like Google's Cloud Prediction API and Microsoft's Azure Machine Learning platform, it's never been easier for companies of all sizes to harness the power of data. But machine learning is such a vast, complex field. Where do you start learning how to use it in your business? In this article, we'll survey the current landscape of machine learning algorithms and explain how they work, provide example applications, share how other companies use them, and provide further resources on learning about them. This executive overview will provide the first step in learning how to apply machine learning algorithm(s) to make your business more efficient, more effective, and more profitable.