AITopics | Smith, Lewis

Collaborating Authors

Smith, Lewis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Dictionary Learning with Gated Sparse Autoencoders

Rajamanoharan, Senthooran, Conmy, Arthur, Smith, Lewis, Lieberum, Tom, Varma, Vikrant, Kramár, János, Shah, Rohin, Nanda, Neel

arXiv.org Artificial IntelligenceApr-30-2024

Recent work has found that sparse autoencoders (SAEs) are an effective technique for unsupervised discovery of interpretable features in language models' (LMs) activations, by finding sparse, linear reconstructions of LM activations. We introduce the Gated Sparse Autoencoder (Gated SAE), which achieves a Pareto improvement over training with prevailing methods. In SAEs, the L1 penalty used to encourage sparsity introduces many undesirable biases, such as shrinkage -- systematic underestimation of feature activations. The key insight of Gated SAEs is to separate the functionality of (a) determining which directions to use and (b) estimating the magnitudes of those directions: this enables us to apply the L1 penalty only to the former, limiting the scope of undesirable side effects. Through training SAEs on LMs of up to 7B parameters we find that, in typical hyper-parameter ranges, Gated SAEs solve shrinkage, are similarly interpretable, and require half as many firing features to achieve comparable reconstruction fidelity.

artificial intelligence, machine learning, sae, (19 more...)

arXiv.org Artificial Intelligence

2404.16014

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Can convolutional ResNets approximately preserve input distances? A frequency analysis perspective

Smith, Lewis, van Amersfoort, Joost, Huang, Haiwen, Roberts, Stephen, Gal, Yarin

arXiv.org Machine LearningJun-17-2021

ResNets constrained to be bi-Lipschitz, that is, approximately distance preserving, have been a crucial component of recently proposed techniques for deterministic uncertainty quantification in neural models. We show that theoretical justifications for recent regularisation schemes trying to enforce such a constraint suffer from a crucial flaw -- the theoretical link between the regularisation scheme used and bi-Lipschitzness is only valid under conditions which do not hold in practice, rendering existing theory of limited use, despite the strong empirical performance of these models. We provide a theoretical explanation for the effectiveness of these regularisation schemes using a frequency analysis perspective, showing that under mild conditions these schemes will enforce a lower Lipschitz bound on the low-frequency projection of images. We then provide empirical evidence supporting our theoretical claims, and perform further experiments which demonstrate that our broader conclusions appear to hold when some of the mathematical assumptions of our proof are relaxed, corresponding to the setup used in prior work. In addition, we present a simple constructive algorithm to search for counter examples to the distance preservation condition, and discuss possible implications of our theory for future model design.

artificial intelligence, convolution, neural network, (20 more...)

arXiv.org Machine Learning

2106.02469

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Improving Deterministic Uncertainty Estimation in Deep Learning for Classification and Regression

van Amersfoort, Joost, Smith, Lewis, Jesson, Andrew, Key, Oscar, Gal, Yarin

arXiv.org Machine LearningFeb-22-2021

We propose a new model that estimates uncertainty in a single forward pass and works on both classification and regression problems. Our approach combines a bi-Lipschitz feature extractor with an inducing point approximate Gaussian process, offering robust and principled uncertainty estimation. This can be seen as a refinement of Deep Kernel Learning (DKL), with our changes allowing DKL to match softmax neural networks accuracy. Our method overcomes the limitations of previous work addressing deterministic uncertainty quantification, such as the dependence of uncertainty on ad hoc hyper-parameters. Our method matches SotA accuracy, 96.2% on CIFAR-10, while maintaining the speed of softmax models, and provides uncertainty estimates that outperform previous single forward pass uncertainty models. Finally, we demonstrate our method on a recently introduced benchmark for uncertainty in regression: treatment deferral in causal models for personalized medicine.

approximation, deep learning, normalization, (17 more...)

arXiv.org Machine Learning

2102.11409

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Semi-supervised Learning of Galaxy Morphology using Equivariant Transformer Variational Autoencoders

Nishikawa-Toomey, Mizu, Smith, Lewis, Gal, Yarin

arXiv.org Machine LearningNov-17-2020

The growth in the number of galaxy images is much faster than the speed at which these galaxies can be labelled by humans. However, by leveraging the information present in the ever growing set of unlabelled images, semi-supervised learning could be an effective way of reducing the required labelling and increasing classification accuracy. We develop a Variational Autoencoder (VAE) with Equivariant Transformer layers with a classifier network from the latent space. We show that this novel architecture leads to improvements in accuracy when used for the galaxy morphology classification task on the Galaxy Zoo data set. In addition we show that pre-training the classifier network as part of the VAE using the unlabelled data leads to higher accuracy with fewer labels compared to exiting approaches. This novel VAE has the potential to automate galaxy morphology classification with reduced human labelling efforts.

inductive learning, labelled data, neural network, (17 more...)

arXiv.org Machine Learning

2011.08714

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.77)

Add feedback

Simple and Scalable Epistemic Uncertainty Estimation Using a Single Deep Deterministic Neural Network

van Amersfoort, Joost, Smith, Lewis, Teh, Yee Whye, Gal, Yarin

arXiv.org Machine LearningMar-4-2020

We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass. Our approach, deterministic uncertainty quantification (DUQ), builds upon ideas of RBF networks. We scale training in these with a novel loss function and centroid updating scheme. By enforcing detectability of changes in the input using a gradient penalty, we are able to reliably detect out of distribution data. Our uncertainty quantification scales well to large datasets, and using a single model, we improve upon or match Deep Ensembles on notable difficult dataset pairs such as FashionMNIST vs. MNIST, and CIFAR-10 vs. SVHN, while maintaining competitive accuracy.

deep learning, neural network, penalty, (20 more...)

arXiv.org Machine Learning

2003.02037

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Flood Detection On Low Cost Orbital Hardware

Mateo-Garcia, Gonzalo, Oprea, Silviu, Smith, Lewis, Veitch-Michaelis, Josh, Schumann, Guy, Gal, Yarin, Baydin, Atılım Güneş, Backes, Dietmar

arXiv.org Machine LearningOct-14-2019

Satellite imaging is a critical technology for monitoring and responding to natural disasters such as flooding. Despite the capabilities of modern satellites, there is still much to be desired from the perspective of first response organisations like UNICEF. Two main challenges are rapid access to data, and the ability to automatically identify flooded regions in images. We describe a prototypical flood segmentation system, identifying cloud, water and land, that could be deployed on a constellation of small satellites, performing processing on board to reduce downlink bandwidth by 2 orders of magnitude. We target PhiSat-1, part of the FSSCAT mission, which is planned to be launched by the European Space Agency (ESA) near the start of 2020 as a proof of concept for this new technology.

deep learning, neural network, satellite, (19 more...)

arXiv.org Machine Learning

1910.03019

Country: Europe > United Kingdom > England (0.15)

Genre: Research Report (0.50)

Industry:

Government (0.88)
Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study

Gal, Yarin, Smith, Lewis

arXiv.org Machine LearningJun-5-2018

We prove that idealised discriminative Bayesian neural networks, capturing perfect epistemic uncertainty, cannot have adversarial examples: Techniques for crafting adversarial examples will necessarily fail to generate perturbed images which fool the classifier. This suggests why MC dropout-based techniques have been observed to be fairly effective against adversarial examples. We support our claims mathematically and empirically. We experiment with HMC on synthetic data derived from MNIST for which we know the ground truth image density, showing that near-perfect epistemic uncertainty correlates to density under image manifold, and that adversarial images lie off the manifold. Using our new-found insights we suggest a new attack for MC dropout-based models by looking for imperfections in uncertainty estimation, and also suggest a mitigation. Lastly, we demonstrate our mitigation on a cats-vs-dogs image classification task with a VGG13 variant.

adversarial example, bayesian inference, neural network, (18 more...)

arXiv.org Machine Learning

1806.00667

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Understanding Measures of Uncertainty for Adversarial Example Detection

Smith, Lewis, Gal, Yarin

arXiv.org Machine LearningMar-22-2018

Measuring uncertainty is a promising technique for detecting adversarial examples, crafted inputs on which the model predicts an incorrect class with high confidence. But many measures of uncertainty exist, including predictive en- tropy and mutual information, each capturing different types of uncertainty. We study these measures, and shed light on why mutual information seems to be effective at the task of adversarial example detection. We highlight failure modes for MC dropout, a widely used approach for estimating uncertainty in deep models. This leads to an improved understanding of the drawbacks of current methods, and a proposal to improve the quality of uncertainty estimates using probabilistic model ensembles. We give illustrative experiments using MNIST to demonstrate the intuition underlying the different measures of uncertainty, as well as experiments on a real world Kaggle dogs vs cats classification dataset.

adversarial example, artificial intelligence, neural network, (15 more...)

arXiv.org Machine Learning

1803.08533

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback