
 Murray, Iain


Lossless compression with state space models using bits back coding

arXiv.org Artificial Intelligence

We generalize the 'bits back with ANS' method to time-series models with a latent Markov structure. This family of models includes hidden Markov models (HMMs), linear Gaussian state space models (LGSSMs) and many more. We provide experimental evidence that our method is effective for small scale models, and discuss its applicability to larger scale settings such as video compression. Recent work by Townsend et al. (2019) shows the existence of a practical method, called 'bits back with ANS' (BB-ANS), for doing lossless compression with a latent variable model, at rates close to the negative variational free energy of the model (this quantity bounds the model's marginal log-likelihood and is often referred to as the 'evidence lower bound', or ELBO). BB-ANS depends on a last-in-first-out (LIFO) source coding algorithm called Asymmetric Numeral Systems (ANS; Duda, 2009), and also uses an idea called bits back coding (Wallace, 1990; Hinton & van Camp, 1993).
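
To make the rate claim concrete, here is the standard bits-back accounting (a sketch in generic notation, not text from the paper): the sender first decodes a latent z from the ANS stack using q(z|x), recovering bits, then encodes x under p(x|z) and z under p(z), so the expected net message length per datapoint is

```latex
% Expected net BB-ANS message length per datapoint (in nats):
% encode x and z, minus the bits recovered by decoding z from the stack.
\mathbb{E}_{q(z \mid x)}\big[ -\log p(x \mid z) - \log p(z) + \log q(z \mid x) \big]
  = -\mathrm{ELBO}(x)
```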


Density Deconvolution with Normalizing Flows

arXiv.org Machine Learning

Density deconvolution is the task of estimating a probability density function given only noise-corrupted samples. We can fit a Gaussian mixture model to the underlying density by maximum likelihood if the noise is normally distributed, but would like to exploit the superior density estimation performance of normalizing flows and allow for arbitrary noise distributions. Since both adjustments lead to an intractable likelihood, we resort to amortized variational inference. We demonstrate some problems involved in this approach; however, experiments on real data show that flows can already outperform Gaussian mixtures for density deconvolution.
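
For concreteness, a sketch of the setup in generic notation (the paper's own notation may differ): the observed data are draws from the model density convolved with a noise distribution, and the intractable likelihood is bounded with an amortized variational posterior,

```latex
% Noise-corrupted observations y, clean latent x ~ p_flow:
p(y) = \int p_{\mathrm{noise}}(y \mid x)\, p_{\mathrm{flow}}(x)\, \mathrm{d}x,
\qquad
\log p(y) \;\ge\; \mathbb{E}_{q(x \mid y)}\big[
  \log p_{\mathrm{noise}}(y \mid x) + \log p_{\mathrm{flow}}(x) - \log q(x \mid y) \big].
```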


Diverse Ensembles Improve Calibration

arXiv.org Machine Learning

Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched. Training an ensemble of models and averaging their predictions can help alleviate these issues. We propose a simple technique to improve calibration, using a different data augmentation for each ensemble member. We additionally use the idea of `mixing' un-augmented and augmented inputs to improve calibration when test and training distributions are the same. These simple techniques improve calibration and accuracy over strong baselines on the CIFAR10 and CIFAR100 benchmarks, and on out-of-domain data from their corrupted versions.
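
A minimal sketch of the core idea, assuming placeholder `make_model`, `augmentations` (one per member), and a standard data `loader`; this illustrates per-member augmentation and probability averaging, not the paper's exact training recipe:

```python
import torch
import torch.nn.functional as F

def train_member(model, augment, loader, epochs=1, lr=0.1):
    # Each ensemble member sees the data through its own augmentation.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(augment(x)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def ensemble_predict(models, x):
    # Average probabilities (not logits) across members, as is standard
    # for deep ensembles.
    probs = torch.stack([F.softmax(m(x), dim=-1) for m in models])
    return probs.mean(dim=0)

# Usage sketch:
# members = [train_member(make_model(), aug, loader) for aug in augmentations]
```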


Ordering Dimensions with Nested Dropout Normalizing Flows

arXiv.org Machine Learning

The latent space of normalizing flows must be of the same dimensionality as their output space. This constraint presents a problem if we want to learn low-dimensional, semantically meaningful representations. Recent work has provided compact representations by fitting flows constrained to manifolds, but has not defined a density off that manifold. In this work we consider flows with full support in data space, but with ordered latent variables. As in PCA, the leading latent dimensions define a sequence of manifolds that lie close to the data. We note a trade-off between the flow likelihood and the quality of the ordering, depending on the parameterization of the flow.
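
A sketch of the nested dropout mask that induces such an ordering (the geometric truncation distribution follows Rippel et al.'s original nested dropout; the paper's exact choice may differ):

```python
import torch

def nested_dropout_mask(batch_size, dim, p=0.1):
    # Sample a truncation index b for each example, then keep only the
    # first b latent dimensions. Later dimensions are dropped more often,
    # so earlier dimensions must carry more information (a PCA-like order).
    b = torch.distributions.Geometric(p).sample((batch_size,))
    b = b.long().clamp(max=dim - 1) + 1           # truncation in {1, ..., dim}
    idx = torch.arange(dim).unsqueeze(0)          # (1, dim)
    return (idx < b.unsqueeze(1)).float()         # (batch_size, dim) 0/1 mask
```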


The Gaussian Process Density Sampler

Neural Information Processing Systems

We present the Gaussian Process Density Sampler (GPDS), an exchangeable generative model for use in nonparametric Bayesian density estimation. Samples drawn from the GPDS are consistent with exact, independent samples from a fixed density function that is a transformation of a function drawn from a Gaussian process prior. Our formulation allows us to infer an unknown density from data using Markov chain Monte Carlo, which gives samples from the posterior distribution over density functions and from the predictive distribution on data space. We compare this density modeling technique to several existing techniques on a toy problem and a skull-reconstruction task.
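
A sketch of the generative form (my recollection of the construction; notation is illustrative): a GP draw, squashed to (0,1), modulates a simple base density, and exact samples follow from an idealized rejection scheme,

```latex
% GPDS density: a GP draw g modulates a base density \pi via a squashing
% function \Phi (e.g. a logistic sigmoid); Z_g is the normalizer.
f(x) = \frac{1}{Z_g}\,\Phi\big(g(x)\big)\,\pi(x),
\qquad g \sim \mathcal{GP}(m, K).
% Idealized sampler: propose x ~ \pi, accept with probability \Phi(g(x)).
```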


CloudLSTM: A Recurrent Neural Model for Spatiotemporal Point-cloud Stream Forecasting

arXiv.org Machine Learning

This paper introduces CloudLSTM, a new branch of recurrent neural network models tailored to forecasting over data streams generated by geospatial point-cloud sources. We design a Dynamic Convolution (D-Conv) operator as the core component of CloudLSTMs, which performs convolution operations directly over point-clouds, extracting local spatial features from sets of neighboring points that surround different elements of the input. This maintains the permutation invariance of sequence-to-sequence learning frameworks, while enabling learnable neighboring correlations at each time step -- an important aspect in spatiotemporal predictive learning. The D-Conv operator resolves the grid-structural data requirements of existing spatiotemporal forecasting models (e.g. ConvLSTM) and can be easily plugged into traditional LSTM architectures with sequence-to-sequence learning and attention mechanisms. As a case study, we perform antenna-level forecasting of the data traffic generated by mobile services, demonstrating that the proposed CloudLSTM achieves state-of-the-art performance with measurement datasets collected in operational metropolitan-scale mobile network deployments.
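
An illustrative nearest-neighbour convolution in the spirit of D-Conv (a simplification, not the paper's exact operator; names and shapes are assumptions):

```python
import torch

def dconv(points, feats, weight, K=8):
    # points: (N, 2) coordinates; feats: (N, C_in); weight: (K * C_in, C_out).
    # For each point, gather its K nearest neighbours (including itself,
    # at distance zero) and apply a shared linear map to the concatenated
    # neighbour features, giving a convolution-like local operator that
    # needs no grid structure.
    d = torch.cdist(points, points)                  # pairwise distances (N, N)
    idx = d.topk(K, largest=False).indices           # K nearest neighbours (N, K)
    gathered = feats[idx].reshape(len(points), -1)   # (N, K * C_in)
    return gathered @ weight                         # (N, C_out)
```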


Neural Spline Flows

arXiv.org Machine Learning

Flows based on either coupling or autoregressive transforms both offer exact density evaluation and sampling, but rely on the parameterization of an easily invertible elementwise transformation, whose choice determines the flexibility of these models. Building upon recent work, we propose a fully-differentiable module based on monotonic rational-quadratic splines, which enhances the flexibility of both coupling and autoregressive transforms while retaining analytic invertibility. We demonstrate that neural spline flows improve density estimation, variational inference, and generative modeling of images.
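
For reference, the monotonic rational-quadratic interpolant of Gregory and Delbourgo (1982) that the paper builds on, written for one bin (my transcription; see the paper for the exact parameterization):

```latex
% Knots (x^{(k)}, y^{(k)}), bin slope
% s_k = (y^{(k+1)} - y^{(k)}) / (x^{(k+1)} - x^{(k)}),
% positive derivatives d_k at the knots, and
% \xi = (x - x^{(k)}) / (x^{(k+1)} - x^{(k)}) the position within the bin:
g(x) = y^{(k)} +
  \frac{(y^{(k+1)} - y^{(k)})\,\big[ s_k \xi^2 + d_k\, \xi (1 - \xi) \big]}
       {s_k + \big[ d_{k+1} + d_k - 2 s_k \big]\, \xi (1 - \xi)}
% Inverting g reduces to solving a quadratic in \xi,
% which is the source of the analytic invertibility.
```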


Cubic-Spline Flows

arXiv.org Machine Learning

A normalizing flow models a complex probability density as an invertible transformation of a simple density. The invertibility means that we can evaluate densities and generate samples from a flow. In practice, autoregressive flow-based models are slow to invert, making either density estimation or sample generation slow. Flows based on coupling transforms are fast for both tasks, but have previously performed less well at density estimation than autoregressive flows. We stack a new coupling transform, based on monotonic cubic splines, with LU-decomposed linear layers. The resulting cubic-spline flow retains an exact one-pass inverse, can be used to generate high-quality images, and closes the gap with autoregressive flows on a suite of density-estimation tasks.
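
A generic coupling transform, to make the one-pass invertibility concrete (a sketch; `param_net`, `spline`, and `spline_inverse` are placeholders, not the paper's code):

```python
import torch

def coupling_forward(x, param_net, spline):
    # The first half of the dimensions passes through unchanged and
    # parameterizes an elementwise monotonic spline applied to the rest.
    x1, x2 = x.chunk(2, dim=-1)
    y2, logdet = spline(x2, param_net(x1))
    return torch.cat([x1, y2], dim=-1), logdet

def coupling_inverse(y, param_net, spline_inverse):
    # Inversion is a single pass: y1 equals x1, so the same spline
    # parameters can be recomputed and inverted elementwise.
    y1, y2 = y.chunk(2, dim=-1)
    x2, logdet = spline_inverse(y2, param_net(y1))
    return torch.cat([y1, x2], dim=-1), logdet
```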


Dynamic Evaluation of Transformer Language Models

arXiv.org Machine Learning

This research note combines two methods that have recently improved the state of the art in language modeling: Transformers and dynamic evaluation. Transformers use stacked layers of self-attention that allow them to capture long range dependencies in sequential data. Dynamic evaluation fits models to the recent sequence history, allowing them to assign higher probabilities to reoccurring sequential patterns. By applying dynamic evaluation to Transformer-XL models, we improve the state of the art on enwik8 from 0.99 to 0.94 bits/char, text8 from 1.08 to 1.04 bits/char, and WikiText-103 from 18.3 to 16.4 perplexity points. Language modeling is a commonly used machine learning benchmark with applications to speech recognition, machine translation, text generation, and unsupervised learning in natural language processing tasks.
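
A minimal sketch of dynamic evaluation (plain SGD on each evaluated segment; the paper's update rule is more elaborate, with decay back toward the pretrained weights):

```python
import torch
import torch.nn.functional as F

def dynamic_eval(model, segments, lr=1e-4):
    # Score each segment with the current weights, then take a gradient
    # step on that segment so the model adapts to the recent history
    # before scoring the next segment.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss, total_tokens = 0.0, 0
    for inputs, targets in segments:
        logits = model(inputs)                        # (batch, time, vocab)
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               targets.reshape(-1))
        total_loss += loss.item() * targets.numel()
        total_tokens += targets.numel()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return total_loss / total_tokens                  # average nats per token
```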


BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning

arXiv.org Machine Learning

Multi-task learning allows the sharing of useful information between multiple related tasks. In natural language processing several recent approaches have successfully leveraged unsupervised pre-training on large amounts of data to perform well on various tasks, such as those in the GLUE benchmark. These results are based on fine-tuning on each task separately. We explore the multi-task learning setting for the recent BERT model on the GLUE benchmark, and how to best add task-specific parameters to a pre-trained BERT network, with a high degree of parameter sharing between tasks. We introduce new adaptation modules, PALs or `projected attention layers', which use a low-dimensional multi-head attention mechanism, based on the idea that it is important to include layers with inductive biases useful for the input domain. By using PALs in parallel with BERT layers, we match the performance of fine-tuned BERT on the GLUE benchmark with roughly 7 times fewer parameters, and obtain state-of-the-art results on the Recognizing Textual Entailment dataset.
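
A sketch of a projected attention layer (sizes are illustrative, not the paper's exact configuration): multi-head self-attention in a low-dimensional projected space, run in parallel with the shared BERT layer and added to its output:

```python
import torch
import torch.nn as nn

class PAL(nn.Module):
    """Sketch of a projected attention layer: project down, attend, project up."""
    def __init__(self, hidden=768, low=192, heads=12):
        super().__init__()
        self.down = nn.Linear(hidden, low)
        self.attn = nn.MultiheadAttention(low, heads, batch_first=True)
        self.up = nn.Linear(low, hidden)

    def forward(self, h):
        z = self.down(h)                # (batch, seq, low)
        z, _ = self.attn(z, z, z)       # low-dimensional self-attention
        return self.up(z)               # back to BERT's hidden size

# Usage sketch: output = bert_layer(h) + pal(h), with one small PAL per task
# and the BERT layer's weights shared across all tasks.
```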