AITopics | van der Wilk, Mark

Collaborating Authors

van der Wilk, Mark

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Invariant Weights in Neural Networks

van der Ouderaa, Tycho F. A., van der Wilk, Mark

arXiv.org Artificial IntelligenceAug-2-2022

Assumptions about invariances or symmetries in data can significantly increase the predictive power of statistical models. Many commonly used models in machine learning are constraint to respect certain symmetries in the data, such as translation equivariance in convolutional neural networks, and incorporation of new symmetry types is actively being studied. Yet, efforts to learn such invariances from the data itself remains an open research problem. It has been shown that marginal likelihood offers a principled way to learn invariances in Gaussian Processes. We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks resulting in naturally higher performing models.

artificial intelligence, invariance, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2202.12439

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Memory Safe Computations with XLA Compiler

Artemev, Artem, Roeder, Tilman, van der Wilk, Mark

arXiv.org Machine LearningJun-28-2022

Software packages like TensorFlow and PyTorch are designed to support linear algebra operations, and their speed and usability determine their success. However, by prioritising speed, they often neglect memory requirements. As a consequence, the implementations of memory-intensive algorithms that are convenient in terms of software design can often not be run for large problems due to memory overflows. Memory-efficient solutions require complex programming approaches with significant logic outside the computational framework. This impairs the adoption and use of such algorithms. To address this, we developed an XLA compiler extension that adjusts the computational data-flow representation of an algorithm according to a user-specified memory limit. We show that k-nearest neighbour and sparse Gaussian process regression methods can be run at a much larger scale on a single device, where standard implementations would have failed. Our approach leads to better use of hardware resources. We believe that further focus on removing memory constraints at a compiler level will widen the range of machine learning methods that can be developed in the future.

artificial intelligence, implementation, machine learning, (16 more...)

arXiv.org Machine Learning

2206.14148

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.34)

Add feedback

Barely Biased Learning for Gaussian Process Regression

Burt, David R., Artemev, Artem, van der Wilk, Mark

arXiv.org Machine LearningSep-20-2021

Recent work in scalable approximate Gaussian process regression has discussed a bias-variance-computation trade-off when estimating the log marginal likelihood. We suggest a method that adaptively selects the amount of computation to use when estimating the log marginal likelihood so that the bias of the objective function is guaranteed to be small. While simple in principle, our current implementation of the method is not competitive computationally with existing approximations.

artificial intelligence, log marginal likelihood, machine learning, (12 more...)

arXiv.org Machine Learning

2109.09417

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Modeling & Simulation (0.64)

Add feedback

A Bayesian Approach to Invariant Deep Neural Networks

Mourdoukoutas, Nikolaos, Federici, Marco, Pantalos, Georges, van der Wilk, Mark, Fortuin, Vincent

arXiv.org Machine LearningJul-20-2021

Contributions We propose a method to learn such weight-sharing schemes from data. As a proof of concept, we focus on being invariant We propose a novel Bayesian neural network architecture to two types of transformations applied on images, that can learn invariances from data namely rotations and flips. However, our algorithm can be alone by inferring a posterior distribution over applied to any other choice of symmetry, as long as the corresponding different weight-sharing schemes. We show that weight-sharing scheme is available. Apart from our model outperforms other non-invariant architectures, achieving good performance during inference, our model is when trained on datasets that contain able to learn such invariances from data. This is achieved by specific invariances. The same holds true when specifying a probability distribution over the weight-sharing no data augmentation is performed.

deep learning, invariance, neural network, (17 more...)

arXiv.org Machine Learning

2107.09301

Country:

North America > United States > New York (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.83)

Add feedback

Last Layer Marginal Likelihood for Invariance Learning

Schwöbel, Pola Elisabeth, Jørgensen, Martin, Ober, Sebastian W., van der Wilk, Mark

arXiv.org Machine LearningJun-14-2021

Data augmentation is often used to incorporate inductive biases into models. Traditionally, these are hand-crafted and tuned with cross validation. The Bayesian paradigm for model selection provides a path towards end-to-end learning of invariances using only the training data, by optimising the marginal likelihood. We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer, a model for which the marginal likelihood can be computed. Experimentally, we improve performance by learning appropriate invariances in standard benchmarks, the low data regime and in a medical imaging task. Optimisation challenges for invariant Deep Kernel Gaussian processes are identified, and a systematic analysis is presented to arrive at a robust training scheme. We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions than before, thereby overcoming some of the training challenges that existed with previous approaches.

deep learning, likelihood, neural network, (19 more...)

arXiv.org Machine Learning

2106.07512

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Data augmentation in Bayesian neural networks and the cold posterior effect

Nabarro, Seth, Ganev, Stoil, Garriga-Alonso, Adrià, Fortuin, Vincent, van der Wilk, Mark, Aitchison, Laurence

arXiv.org Machine LearningJun-10-2021

Data augmentation is a highly effective approach for improving performance in deep neural networks. The standard view is that it creates an enlarged dataset by adding synthetic data, which raises a problem when combining it with Bayesian inference: how much data are we really conditioning on? This question is particularly relevant to recent observations linking data augmentation to the cold posterior effect. We investigate various principled ways of finding a log-likelihood for augmented datasets. Our approach prescribes augmenting the same underlying image multiple times, both at test and train-time, and averaging either the logits or the predictive probabilities. Empirically, we observe the best performance with averaging probabilities. While there are interactions with the cold posterior effect, neither averaging logits or averaging probabilities eliminates it.

augmentation, deep learning, neural network, (15 more...)

arXiv.org Machine Learning

2106.05586

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

BNNpriors: A library for Bayesian neural network inference with different prior distributions

Fortuin, Vincent, Garriga-Alonso, Adrià, van der Wilk, Mark, Aitchison, Laurence

arXiv.org Machine LearningMay-14-2021

Bayesian neural networks have shown great promise in many applications where calibrated uncertainty estimates are crucial and can often also lead to a higher predictive performance. However, it remains challenging to choose a good prior distribution over their weights. While isotropic Gaussian priors are often chosen in practice due to their simplicity, they do not reflect our true prior beliefs well and can lead to suboptimal performance. Our new library, BNNpriors, enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks with a wide range of predefined priors, including heavy-tailed ones, hierarchical ones, and mixture priors. Moreover, it follows a modular approach that eases the design and implementation of new custom priors. It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks and will hopefully catalyze future research as well as practical applications in this area.

deep learning, library, neural network, (13 more...)

arXiv.org Machine Learning

doi: 10.1016/j.simpa.2021.100079

2105.06964

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Add feedback

Deep Neural Networks as Point Estimates for Deep Gaussian Processes

Dutordoir, Vincent, Hensman, James, van der Wilk, Mark, Ek, Carl Henrik, Ghahramani, Zoubin, Durrande, Nicolas

arXiv.org Machine LearningMay-10-2021

Bayesian inference has the potential to improve deep neural networks (DNNs) by providing 1) uncertainty estimates for robust prediction and downstream decision-making, and 2) an objective function (the marginal likelihood) for hyperparameter selection [MacKay, 1992a; 1992b; 2003]. The recent success of deep learning [Krizhevsky et al., 2012; Vaswani et al., 2017; Schrittwieser et al., 2020] has renewed interest in large-scale Bayesian Neural Networks (BNNs) as well, with effort mainly focused on obtaining useful uncertainty estimates [Blundell et al., 2015; Kingma et al., 2015; Gal and Ghahramani, 2016]. Despite already providing usable uncertainty estimates, there is significant evidence that current approximations to the uncertainty on neural network weights can still be significantly improved [Hron et al., 2018; Foong et al., 2020]. The accuracy of the uncertainty approximation is also linked to the quality of the marginal likelihood estimate [Blei et al., 2017]. Since hyperparameter learning using the marginal likelihood fails for most common approximations [e.g., Blundell et al., 2015], the accuracy of the uncertainty estimates is also questionable. Damianou and Lawrence [2013] used Gaussian processes [Rasmussen and Williams, 2006] as layers to create a different Bayesian analogue to a DNN: the Deep Gaussian process (DGP). Gaussian processes (GPs) are a different representation of a single layer neural network, which is promising because it allows high-quality approximations to uncertainty [Titsias, 2009; Burt et al., 2019].

deep learning, gaussian process, neural network, (17 more...)

arXiv.org Machine Learning

2105.04504

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

GPflux: A Library for Deep Gaussian Processes

Dutordoir, Vincent, Salimbeni, Hugh, Hambro, Eric, McLeod, John, Leibfried, Felix, Artemev, Artem, van der Wilk, Mark, Hensman, James, Deisenroth, Marc P., John, ST

arXiv.org Machine LearningApr-12-2021

We introduce GPflux, a Python library for Bayesian deep learning with a strong emphasis on deep Gaussian processes (DGPs). Implementing DGPs is a challenging endeavour due to the various mathematical subtleties that arise when dealing with multivariate Gaussian distributions and the complex bookkeeping of indices. To date, there are no actively maintained, open-sourced and extendable libraries available that support research activities in this area. GPflux aims to fill this gap by providing a library with state-of-the-art DGP algorithms, as well as building blocks for implementing novel Bayesian and GP-based hierarchical models and inference schemes. GPflux is compatible with and built on top of the Keras deep learning eco-system. This enables practitioners to leverage tools from the deep learning community for building and training customised Bayesian models, and create hierarchical models that consist of Bayesian and standard neural network layers in a single coherent framework. GPflux relies on GPflow for most of its GP objects and operations, which makes it an efficient, modular and extensible library, while having a lean codebase.

artificial intelligence, deep gaussian process, neural network, (3 more...)

arXiv.org Machine Learning

2104.05674

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Add feedback

The Promises and Pitfalls of Deep Kernel Learning

Ober, Sebastian W., Rasmussen, Carl E., van der Wilk, Mark

arXiv.org Machine LearningFeb-24-2021

Deep kernel learning and related techniques promise to combine the representational power of neural networks with the reliable uncertainty estimates of Gaussian processes. One crucial aspect of these models is an expectation that, because they are treated as Gaussian process models optimized using the marginal likelihood, they are protected from overfitting. However, we identify pathological behavior, including overfitting, on a simple toy example. We explore this pathology, explaining its origins and considering how it applies to real datasets. Through careful experimentation on UCI datasets, CIFAR-10, and the UTKFace dataset, we find that the overfitting from overparameterized deep kernel learning, in which the model is "somewhat Bayesian", can in certain scenarios be worse than that from not being Bayesian at all. However, we find that a fully Bayesian treatment of deep kernel learning can rectify this overfitting and obtain the desired performance improvements over standard neural networks and Gaussian processes.

dataset, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

2102.12108

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.64)

Add feedback