AITopics

1610.03113

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Paananen, Topi, Piironen, Juho, Andersen, Michael Riis, Vehtari, Aki

Model selection for Gaussian processes utilizing sensitivity of posterior predictive distribution

arXiv.org Machine Learning, Dec-21-2017

We propose two novel methods for simplifying Gaussian process (GP) models by examining the predictions of a full model in the vicinity of the training points and thereby ordering the covariates based on their predictive relevance. Our results on synthetic and real world data sets demonstrate improved variable selection compared to automatic relevance determination (ARD) in terms of consistency and predictive performance. We expect our proposed methods to be useful in interpreting and understanding complex Gaussian process models.

artificial intelligence, covariate, machine learning

1712.08048

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Donner, Christian, Opper, Manfred

Inverse Ising problem in continuous time: A latent variable approach

arXiv.org Machine Learning, Dec-21-2017

In recent years, the inverse Ising problem, i.e. the reconstruction of couplings and external fields of an Ising model from samples of spin configurations, has attracted considerable interest in the physics community [1]. This is due to the fact that Ising models play an important role for data modeling with applications to neural spike data [2, 3], protein structure determination [4], and gene expression analysis [5]. Much effort has been devoted to the development of algorithms for the static inverse Ising problem. This is a nontrivial task, because statistically efficient, likelihood based methods become computationally infeasible due to the intractability of the partition function of the model. Hence one has to resort to either approximate inference methods or to other statistical estimators such as pseudo-likelihood methods [6], or the interaction screening algorithm [7]. The situation is somewhat simpler for the dynamical inverse Ising problem, which has recently attracted attention [8-13]. If one assumes a Markovian dynamics, the exact normalisation of the spin transition probabilities allows for an explicit computation of the likelihood if one has a complete set of observed data over time. Nevertheless, the model parameters enter the likelihood in a fairly complex way, and the application of more advanced statistical approaches such as Bayesian inference again becomes a nontrivial task. This is especially true for the continuous time kinetic Ising model where the spins are governed by Glauber dynamics [14].

artificial intelligence, bayesian inference, machine learning

doi: 10.1103/PhysRevE.96.062104

1709.04495

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Lee, Kevin H., Xue, Lingzhou, Hunter, David R.

Model-Based Clustering of Time-Evolving Networks through Temporal Exponential-Family Random Graph Models

arXiv.org Machine Learning, Dec-20-2017

Dynamic networks are a general language for describing time-evolving complex systems, and discrete time network models provide an emerging statistical technique for various applications. It is a fundamental research question to detect the community structure in time-evolving networks. However, due to significant computational challenges and difficulties in modeling communities of time-evolving networks, there is little progress in the current literature to effectively find communities in time-evolving networks. In this work, we propose a novel model-based clustering framework for time-evolving networks based on discrete time exponential-family random graph models. To choose the number of communities, we use conditional likelihood to construct an effective model selection criterion. Furthermore, we propose an efficient variational expectation-maximization (EM) algorithm to find approximate maximum likelihood estimates of network parameters and mixing proportions. By using variational methods and minorization-maximization (MM) techniques, our method has appealing scalability for large-scale time-evolving networks. The power of our method is demonstrated in simulation studies and empirical applications to international trade networks and the collaboration networks of a large American research university.

artificial intelligence, bayesian inference, machine learning

1712.07325

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Martino, Luca, Elvira, Victor, Camps-Valls, Gustau

The Recycling Gibbs Sampler for Efficient Learning

arXiv.org Machine Learning, Dec-20-2017

Monte Carlo methods are essential tools for Bayesian inference. Gibbs sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively used in signal processing, machine learning, and statistics, employed to draw samples from complicated high-dimensional posterior distributions. The key point for the successful application of the Gibbs sampler is the ability to draw samples efficiently from the full-conditional probability density functions. Since in the general case this is not possible, in order to speed up the convergence of the chain, it is required to generate auxiliary samples whose information is eventually disregarded. In this work, we show that these auxiliary samples can be recycled within the Gibbs estimators, improving their efficiency with no extra cost. This novel scheme arises naturally after pointing out the relationship between the standard Gibbs sampler and the chain rule used for sampling purposes. Numerical simulations involving simple and real inference problems confirm the excellent performance of the proposed scheme in terms of accuracy and computational efficiency. In particular we give empirical evidence of performance in a toy example, inference of Gaussian processes hyperparameters, and learning dependence graphs through regression.
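
As a rough illustration of the standard Gibbs sampler that the abstract starts from (not the recycling scheme the paper proposes), the sketch below alternates draws from the two full conditionals of a bivariate normal. The target distribution, the function names, and the default settings are all assumptions chosen for illustration; the example was picked because its full conditionals are exactly Gaussian and can be sampled directly.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Standard Gibbs sampler for a zero-mean, unit-variance bivariate
    normal with correlation rho (an illustrative target, not from the paper)."""
    rng = random.Random(seed)
    # Both full conditionals share the same standard deviation.
    cond_std = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        # Full conditional x | y ~ N(rho * y, 1 - rho^2)
        x = rng.gauss(rho * y, cond_std)
        # Full conditional y | x ~ N(rho * x, 1 - rho^2)
        y = rng.gauss(rho * x, cond_std)
        if step >= burn_in:
            samples.append((x, y))
    return samples


def empirical_correlation(samples):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
    print("empirical correlation:", empirical_correlation(draws))
```

When a full conditional cannot be sampled directly, it is typically approximated with an inner sampler whose intermediate draws are thrown away; those discarded draws are the auxiliary samples the abstract proposes to reuse.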

artificial intelligence, machine learning, sampler

doi: 10.1016/j.dsp.2017.11.012

1611.07056

Country: Europe > Spain (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Hyperparameters Optimization in Deep Convolutional Neural Network / Bayesian Approach with Gaussian Process Prior

Murugan, Pushparaja

Convolutional neural networks (ConvNets) have been used extensively in many complex machine learning tasks. However, hyperparameter optimization is a crucial step in developing ConvNet architectures, since accuracy and performance rely entirely on the hyperparameters. This multilayered architecture is parameterized by a set of hyperparameters such as the number of convolutional layers, the number of fully connected dense layers and neurons, the dropout probability, and the learning rate. Searching the hyperparameter space is therefore very difficult when building such a complex hierarchical architecture. Many methods have been proposed over the past decade to explore the hyperparameter space and find the optimal set of hyperparameter values. Grid search and random search are reported to be inefficient and extremely expensive because of the large number of hyperparameters in the architecture. Sequential model-based Bayesian optimization is therefore a promising alternative technique for finding the extremum of the unknown cost function. A recent study on Bayesian optimization by Snoek, covering nine convolutional network parameters, achieved the lowest reported error on the CIFAR-10 benchmark. This article is intended to provide an overview of the mathematical concepts behind Bayesian optimization with a Gaussian process prior.

artificial intelligence, machine learning, optimization

1712.07233

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.83)

Pavlichin, Dmitri S., Jiao, Jiantao, Weissman, Tsachy

Approximate Profile Maximum Likelihood

We propose an efficient algorithm for approximate computation of the profile maximum likelihood (PML), a variant of maximum likelihood maximizing the probability of observing a sufficient statistic rather than the empirical sample. The PML has appealing theoretical properties, but is difficult to compute exactly. Inspired by observations gleaned from exactly solvable cases, we look for an approximate PML solution, which, intuitively, clumps comparably frequent symbols into one symbol. This amounts to lower-bounding a certain matrix permanent by summing over a subgroup of the symmetric group rather than the whole group during the computation. We extensively experiment with the approximate solution, and find the empirical performance of our approach is competitive and sometimes significantly better than state-of-the-art performance for various estimation problems.

artificial intelligence, bayesian inference, machine learning

1712.07177

Country: North America > United States (0.67)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)

Mattsson, Per, Zachariah, Dave, Stoica, Petre

Recursive nonlinear-system identification using latent variables

In this paper we develop a method for learning nonlinear systems with multiple outputs and inputs. We begin by modelling the errors of a nominal predictor of the system using a latent variable framework. Then using the maximum likelihood principle we derive a criterion for learning the model. The resulting optimization problem is tackled using a majorization-minimization approach. Finally, we develop a convex majorization technique and show that it enables a recursive identification method. The method learns parsimonious predictive models and is tested on both synthetic and real nonlinear systems.

artificial intelligence, identification, machine learning

1606.04366

Country: Europe > Sweden (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Levy, William B, Berger, Toby, Sungkar, Mustafa

Neural computation from first principles: Using the maximum entropy method to obtain an optimal bits-per-joule neuron

Optimization results are one method for understanding neural computation from Nature's perspective and for defining the physical limits on neuron-like engineering. Earlier work looks at individual properties or performance criteria and occasionally a combination of two, such as energy and information. Here we make use of Jaynes' maximum entropy method and combine a larger set of constraints, possibly dimensionally distinct, each expressible as an expectation. The method identifies a likelihood-function and a sufficient statistic arising from each such optimization. This likelihood is a first-hitting time distribution in the exponential class. Particular constraint sets are identified that, from an optimal inference perspective, justify earlier neurocomputational models. Interactions between constraints, mediated through the inferred likelihood, restrict constraint-set parameterizations, e.g., the energy-budget limits estimation performance which, in turn, matches an axonal communication constraint. Such linkages are, for biologists, experimental predictions of the method. In addition to the related likelihood, at least one type of constraint set implies marginal distributions, and in this case, a Shannon bits/joule statement arises.

artificial intelligence, constraint, machine learning

1606.03063

Country: North America > United States > Virginia (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Riesselman, Adam J., Ingraham, John B., Marks, Debora S.

Deep generative models of genetic variation capture mutation effects

arXiv.org Machine Learning, Dec-18-2017

The functions of proteins and RNAs are determined by a myriad of interactions between their constituent residues, but most quantitative models of how molecular phenotype depends on genotype must approximate this by simple additive effects. While recent models have relaxed this constraint to also account for pairwise interactions, these approaches do not provide a tractable path towards modeling higher-order dependencies. Here, we show how latent variable models with nonlinear dependencies can be applied to capture beyond-pairwise constraints in biomolecules. We present a new probabilistic model for sequence families, DeepSequence, that can predict the effects of mutations across a variety of deep mutational scanning experiments significantly better than site independent or pairwise models that are based on the same evolutionary data. The model, learned in an unsupervised manner solely from sequence information, is grounded with biologically motivated priors, reveals latent organization of sequence families, and can be used to extrapolate to new parts of sequence space.

artificial intelligence, bayesian inference, machine learning

1712.06527

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)