AITopics | Directed Networks

Directed Networks

Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification

Milios, Dimitrios, Camoriano, Raffaello, Michiardi, Pietro, Rosasco, Lorenzo, Filippone, Maurizio

arXiv.org Machine Learning, May-28-2018

In this paper, we study the problem of deriving fast and accurate classification algorithms with uncertainty quantification. Gaussian process classification provides a principled approach, but the corresponding computational burden is hardly sustainable in large-scale problems and devising efficient alternatives is a challenge. In this work, we investigate if and how Gaussian process regression directly applied to the classification labels can be used to tackle this question. While in this case training time is remarkably faster, predictions need to be calibrated for classification and uncertainty estimation. To this aim, we propose a novel approach based on interpreting the labels as the output of a Dirichlet distribution. Extensive experimental results show that the proposed approach provides essentially the same accuracy and uncertainty quantification as Gaussian process classification while requiring only a fraction of the computational resources.
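The idea of treating labels as Dirichlet outputs can be illustrated with a small sketch. The abstract does not give the construction, so everything below is an assumption: each one-hot label is turned into Dirichlet concentration parameters, each component is viewed as a Gamma(alpha, 1) variable, and that Gamma is approximated by a log-normal through moment matching. The result is a regression target and a per-point noise variance in log space, which a standard Gaussian process regressor could consume. The function name and the `alpha_eps` default are hypothetical.

```python
import math

def dirichlet_label_transform(labels, num_classes, alpha_eps=0.01):
    """Map integer class labels to log-space regression targets and variances.

    Assumption (not stated in the abstract): the observed class gets
    concentration 1 + alpha_eps, every other class gets alpha_eps, and each
    Gamma(alpha, 1) marginal is moment-matched by a log-normal.
    """
    targets, variances = [], []
    for y in labels:
        row_targets, row_variances = [], []
        for k in range(num_classes):
            alpha = alpha_eps + (1.0 if k == y else 0.0)
            # Log-normal variance matching Gamma(alpha, 1): exp(s2) = 1 + 1/alpha
            s2 = math.log(1.0 / alpha + 1.0)
            # Log-normal location so that the mean equals alpha
            mu = math.log(alpha) - s2 / 2.0
            row_targets.append(mu)
            row_variances.append(s2)
        targets.append(row_targets)
        variances.append(row_variances)
    return targets, variances
```

For example, `dirichlet_label_transform([0, 2], num_classes=3)` returns two rows of three targets each; the observed class has a larger target and a smaller noise variance than the other classes.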

artificial intelligence, ece 0, machine learning, (18 more...)

arXiv.org Machine Learning

1805.10915

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Parallel Weight Consolidation: A Brain Segmentation Case Study

McClure, Patrick, Zheng, Charles, Pereira, Francisco, Kaczmarzyk, Jakub, Rogers-Lee, John, Nielson, Dylan, Bandettini, Peter

arXiv.org Machine Learning, May-28-2018

Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed datasets and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce parallel weight consolidation (PWC), a continual learning method to consolidate the weights of neural networks trained in parallel on independent datasets. We perform a brain segmentation case study using PWC to consolidate several dilated convolutional neural networks trained in parallel on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that PWC led to increased performance on held-out test sets from the different sites, as well as on a very large and completely independent multi-site dataset. This demonstrates the feasibility of PWC for combining the knowledge learned by networks trained on different datasets.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

1805.10863

Country: North America > United States (0.47)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.88)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Bayesian Learning with Wasserstein Barycenters

Rios, Gonzalo, Backhoff-Veraguas, Julio, Fontbona, Joaquin, Tobar, Felipe

arXiv.org Machine Learning, May-28-2018

In this work we introduce a novel paradigm for Bayesian learning based on optimal transport theory. Namely, we propose to use the Wasserstein barycenter of the posterior law on models, as an alternative to the maximum a posteriori estimator (MAP) and Bayes predictive distributions. We exhibit conditions granting the existence and consistency of this estimator, discuss some of its basic and specific properties, and propose a numerical approximation relying on standard posterior sampling in general finite-dimensional parameter spaces. We thus also contribute to the recent blooming of applications of optimal transport theory in machine learning, beyond the discrete and semidiscrete settings so far considered. Advantages of the proposed estimator are discussed and illustrated with numerical simulations.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1805.10833

Country:

North America > United States (0.28)
Europe > Austria > Vienna (0.24)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Frequentists Fight Back

@machinelearnbot, May-27-2018, 21:16:37 GMT

Frequentist-leaning statisticians have numerous responses to Bayesian criticisms that may not be widely known. Broadly speaking, these rebuttals assert that Bayesian criticisms of Frequentist approaches rely on circular arguments, are self-refuting, rest mostly on semantics, or are mainly of interest to academics and irrelevant in practice. Below, I've briefly summarized the ones I'm aware of from memory and in my own words. The meaning of the term is often unclear. Is it objective Bayes, subjective Bayes, approximate Bayes, empirical Bayes, or all of the above?

artificial intelligence, bayesian inference, machine learning, (15 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Boosting Uncertainty Estimation for Deep Neural Classifiers

Geifman, Yonatan, Uziel, Guy, El-Yaniv, Ran

arXiv.org Machine Learning, May-27-2018

We consider the problem of uncertainty estimation in the context of (non-Bayesian) deep neural classification. All current methods are based on extracting uncertainty signals from a trained network optimized to solve the classification problem at hand. We demonstrate that such techniques tend to misestimate instances whose predictions are supposed to be highly confident. This deficiency is an artifact of the training process with SGD-like optimizers. Based on this observation, we develop an uncertainty estimation algorithm that "peels away" highly confident points sequentially and estimates their confidence using earlier snapshots of the trained model, before their uncertainty estimates are jittered. We present extensive experiments indicating that the proposed algorithm provides uncertainty estimates that are consistently better than the best known methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1805.08206

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Maximizing acquisition functions for Bayesian optimization

Wilson, James T., Hutter, Frank, Deisenroth, Marc Peter

arXiv.org Machine Learning, May-25-2018

Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide the search process. Fully maximizing acquisition functions produces the Bayes' decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We present two modern approaches for maximizing acquisition functions that exploit key properties thereof, namely the differentiability of Monte Carlo integration and the submodularity of parallel querying.
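As a concrete instance of an acquisition function, the sketch below computes expected improvement (EI) at a single query point whose posterior is Gaussian, once in closed form and once by Monte Carlo integration over reparameterized samples `mu + sigma * eps`. This is only the single-point case, chosen for illustration; it is not the parallel setting or the authors' method, and the function names are hypothetical.

```python
import math
import random

def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        # Degenerate posterior: the improvement is deterministic.
        return max(0.0, mu - best)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def mc_expected_improvement(mu, sigma, best, num_samples=100_000, seed=0):
    """Monte Carlo EI using reparameterized draws mu + sigma * eps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)  # standard normal base sample
        total += max(0.0, mu + sigma * eps - best)
    return total / num_samples
```

Because each sample is a deterministic function of `mu` and `sigma` once `eps` is fixed, the Monte Carlo estimate can be differentiated with respect to the posterior parameters, which is one reading of the "differentiability of Monte Carlo integration" mentioned in the abstract.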

artificial intelligence, machine learning, optimization, (16 more...)

arXiv.org Machine Learning

1805.10196

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Bayesian estimation for large scale multivariate Ornstein-Uhlenbeck model of brain connectivity

Insabato, Andrea, Cunningham, John P., Gilson, Matthieu

arXiv.org Machine Learning, May-25-2018

Estimation of reliable whole-brain connectivity is a crucial step towards the use of connectivity information in quantitative approaches to the study of neuropsychiatric disorders. When estimating brain connectivity, a challenge is imposed by the paucity of time samples and the large dimensionality of the measurements. Bayesian estimation methods for network models offer a number of advantages in this context but are not commonly employed. Here we compare three different estimation methods for the multivariate Ornstein-Uhlenbeck model, which has recently gained some popularity for characterizing whole-brain connectivity. We first show that a Bayesian estimation of model parameters assuming uniform priors is equivalent to an application of the method of moments. Then, using synthetic data, we show that the Bayesian estimate scales poorly with the number of nodes in the network as compared to an iterative Lyapunov optimization. In particular, when the network size is on the order of that used for whole-brain studies (about 100 nodes), the Bayesian method needs about eight times more time samples than the Lyapunov method in order to achieve similar estimation accuracy. We also show that the higher estimation accuracy of the Lyapunov method is reflected in a much better classification of individuals based on the estimated connectivity from a real dataset of BOLD fMRI. Finally, we show that the poor accuracy of the Bayesian method is due to numerical errors, when the imaginary part of the connectivity estimate gets large compared to its real part.
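For orientation, the standard relations behind a multivariate Ornstein-Uhlenbeck model can be written as follows. This is a textbook sketch under stated assumptions (zero-mean stationary process, stable drift matrix); the symbols $A$, $\Sigma$, $Q_0$, and $Q_\tau$ are chosen here for illustration and need not match the paper's notation or its exact estimators.

```latex
\begin{align}
  dx(t) &= A\,x(t)\,dt + dW(t), \qquad \operatorname{Cov}[dW(t)] = \Sigma\,dt, \\
  0 &= A\,Q_0 + Q_0 A^{\top} + \Sigma, \qquad Q_0 = \mathbb{E}\!\left[x(t)\,x(t)^{\top}\right], \\
  Q_\tau &= \mathbb{E}\!\left[x(t+\tau)\,x(t)^{\top}\right] = e^{A\tau} Q_0
  \;\Longrightarrow\; A = \tfrac{1}{\tau}\,\log\!\left(Q_\tau Q_0^{-1}\right).
\end{align}
```

The second line is the Lyapunov equation linking connectivity and noise to the zero-lag covariance. The last line is a moment-based estimate that relies on a matrix logarithm; when empirical covariances are noisy, that logarithm can acquire a sizeable imaginary part, which plausibly relates to the numerical issue described at the end of the abstract.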

artificial intelligence, connectivity, machine learning, (18 more...)

arXiv.org Machine Learning

1805.1005

Country: North America > United States > New York (0.15)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Comparative Study of Classification Techniques in Data Mining Algorithms

@machinelearnbot, May-24-2018, 22:22:13 GMT

Classification is used to determine which group each data instance in a given dataset belongs to. It assigns data to different classes according to some constraints. Several major kinds of classification algorithms, including C4.5, ID3, the k-nearest neighbor classifier, Naive Bayes, SVM, and ANN, are used for classification. Generally, a classification technique follows one of three approaches: statistical, machine learning, or neural network. Considering these approaches, this paper provides an inclusive survey of different classification algorithms and their features and limitations.

algorithm, artificial intelligence, machine learning, (16 more...)

@machinelearnbot

Country: Asia > India (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

New Insights into Bootstrapping for Bandits

Vaswani, Sharan, Kveton, Branislav, Wen, Zheng, Rao, Anup, Schmidt, Mark, Abbasi-Yadkori, Yasin

arXiv.org Machine Learning, May-24-2018

We investigate the use of bootstrapping in the bandit setting. We first show that the commonly used non-parametric bootstrapping (NPB) procedure can be provably inefficient and establish a near-linear lower bound on the regret incurred by it under the bandit model with Bernoulli rewards. We show that NPB with an appropriate amount of forced exploration can result in sub-linear albeit sub-optimal regret. As an alternative to NPB, we propose a weighted bootstrapping (WB) procedure. For Bernoulli rewards, WB with multiplicative exponential weights is mathematically equivalent to Thompson sampling (TS) and results in near-optimal regret bounds. Similarly, in the bandit setting with Gaussian rewards, we show that WB with additive Gaussian weights achieves near-optimal regret. Beyond these special cases, we show that WB leads to better empirical performance than TS for several reward distributions bounded on $[0,1]$. For the contextual bandit setting, we give practical guidelines that make bootstrapping simple and efficient to implement and result in good empirical performance on real-world datasets.
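The Thompson sampling baseline that the abstract compares against can be sketched for the Bernoulli bandit as follows. This is the standard Beta-Bernoulli version with a uniform Beta(1, 1) prior, not the weighted bootstrapping procedure proposed in the paper; the function name, the simulated `true_means`, and the seed are assumptions made for illustration.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling on a synthetic bandit.

    true_means are the hidden success probabilities of each arm and are
    used only to simulate rewards.
    """
    rng = random.Random(seed)
    num_arms = len(true_means)
    successes = [0] * num_arms
    failures = [0] * num_arms
    total_reward = 0
    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(num_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(num_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    pulls = [successes[a] + failures[a] for a in range(num_arms)]
    return pulls, total_reward
```

Calling `thompson_sampling([0.1, 0.9], horizon=2000)` returns the number of pulls per arm and the cumulative reward; over a long enough horizon most pulls should concentrate on the arm with the higher mean.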

artificial intelligence, machine learning, procedure, (17 more...)

arXiv.org Machine Learning

1805.09793

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Towards Robust Evaluations of Continual Learning

Farquhar, Sebastian, Gal, Yarin

arXiv.org Machine Learning, May-24-2018

Continual learning experiments used in current deep learning papers do not faithfully assess fundamental challenges of learning continually, masking weak points of the suggested approaches instead. We study gaps in such existing evaluations, proposing essential experimental evaluations that are more representative of continual learning's challenges, and suggest a re-prioritization of research efforts in the field. We show that current approaches fail with our new evaluations and, to analyse these failures, we propose a variational loss which unifies many existing solutions to continual learning under a Bayesian framing, as either 'prior-focused' or 'likelihood-focused'. We show that while prior-focused approaches such as EWC and VCL perform well on existing evaluations, they perform dramatically worse when compared to likelihood-focused approaches on other simple tasks.

artificial intelligence, machine learning, mnist, (18 more...)

arXiv.org Machine Learning

1805.09733

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
