
Collaborating Authors: Bronskill, John


JoLT: Joint Probabilistic Predictions on Tabular Data Using LLMs

arXiv.org Machine Learning

We introduce a simple method for probabilistic predictions on tabular data based on Large Language Models (LLMs) called JoLT (Joint LLM Process for Tabular data). JoLT uses the in-context learning capabilities of LLMs to define joint distributions over tabular data conditioned on user-specified side information about the problem, exploiting the vast repository of latent problem-relevant knowledge encoded in LLMs. JoLT defines joint distributions for multiple target variables with potentially heterogeneous data types without any data conversion, data preprocessing, special handling of missing data, or model training, making it accessible and efficient for practitioners. Our experiments show that JoLT outperforms competitive methods on low-shot single-target and multi-target tabular classification and regression tasks. Furthermore, we show that JoLT can automatically handle missing data and perform data imputation by leveraging textual side information. We argue that due to its simplicity and generality, JoLT is an effective approach for a wide variety of real prediction problems.
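
As a rough illustration of the in-context approach described above (not the paper's actual prompt format or API), the sketch below serializes labelled rows and user-supplied side information into a prompt, leaves the target cell of a test row blank, and turns repeated LLM samples into an empirical predictive distribution. The `query_llm` callable is an assumed stand-in for whatever sampling interface is available.

```python
from collections import Counter

def serialize_row(row, target_col):
    """Render one table row as text, leaving the target cell to be filled in."""
    parts = [f"{k}: {v}" for k, v in row.items() if k != target_col]
    return ", ".join(parts) + f", {target_col}:"

def jolt_style_predict(side_info, train_rows, test_row, target_col,
                       query_llm, n_samples=50):
    # Side information and labelled rows form the in-context prompt.
    prompt = side_info.strip() + "\n"
    for row in train_rows:
        prompt += serialize_row(row, target_col) + f" {row[target_col]}\n"
    prompt += serialize_row(test_row, target_col)
    # Repeated sampling of the missing cell gives an empirical distribution.
    samples = [query_llm(prompt).strip() for _ in range(n_samples)]
    counts = Counter(samples)
    return {value: count / n_samples for value, count in counts.items()}
```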


LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language

arXiv.org Machine Learning

Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses. Moreover, the expertise needed to integrate this prior knowledge into probabilistic modeling typically limits the application of these models to specialists. Our goal is to build a regression model that can process numerical data and make probabilistic predictions at arbitrary locations, guided by natural language text which describes a user's prior knowledge. Large Language Models (LLMs) provide a useful starting point for designing such a tool since they 1) provide an interface where users can incorporate expert insights in natural language and 2) provide an opportunity for leveraging latent problem-relevant knowledge encoded in LLMs that users may not have themselves. We start by exploring strategies for eliciting explicit, coherent numerical predictive distributions from LLMs. We examine these joint predictive distributions, which we call LLM Processes, over arbitrarily-many quantities in settings such as forecasting, multi-dimensional regression, black-box optimization, and image modeling. We investigate the practical details of prompting to elicit coherent predictive distributions, and demonstrate their effectiveness at regression. Finally, we demonstrate the ability to usefully incorporate text into numerical predictions, improving predictive performance and giving quantitative structure that reflects qualitative descriptions. This lets us begin to explore the rich, grounded hypothesis space that LLMs implicitly encode.
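
A minimal sketch of the elicitation idea, assuming a generic `query_llm(prompt)` sampling function rather than any specific model or the prompt formats studied in the paper: a natural-language problem description and observed (x, y) pairs are placed in the prompt, and repeated numeric completions at a query location are summarized into a predictive distribution.

```python
import numpy as np

def llm_process_predict(description, xs, ys, x_query, query_llm, n_samples=30):
    # Text description plus observed pairs condition the model.
    prompt = description.strip() + "\n"
    for x, y in zip(xs, ys):
        prompt += f"{x:.3f}, {y:.3f}\n"
    prompt += f"{x_query:.3f},"                  # ask for the y value at x_query
    samples = []
    for _ in range(n_samples):
        completion = query_llm(prompt)           # assumed sampling interface
        try:
            samples.append(float(completion.strip().splitlines()[0].rstrip(",")))
        except (ValueError, IndexError):
            continue                             # ignore non-numeric completions
    samples = np.asarray(samples)
    # Summarize the empirical predictive distribution at x_query.
    return samples.mean(), np.quantile(samples, [0.1, 0.5, 0.9])
```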


SportsNGEN: Sustained Generation of Multi-player Sports Gameplay

arXiv.org Artificial Intelligence

We present SportsNGEN, a transformer-decoder-based model trained on sports player and ball tracking sequences that is capable of generating realistic and sustained gameplay. We train and evaluate SportsNGEN on a large database of professional tennis tracking data and demonstrate that, by combining the generated simulations with a shot classifier and logic to start and end rallies, the system is capable of simulating an entire tennis match. In addition, a generic version of SportsNGEN can be customized to a specific player by fine-tuning on match data that includes that player. We show that our model is well calibrated and can be used to derive insights for coaches and broadcasters by evaluating counterfactual or "what if" options. Finally, we show qualitative results indicating that the same approach works for football.
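
The rollout logic described above might look roughly like the sketch below. It is illustrative only: `model` is assumed to be a trained transformer decoder mapping a sequence of player/ball state frames to next-frame predictions, and `shot_classifier` is an assumed function labelling each generated frame so that simple logic can end a rally.

```python
import torch

@torch.no_grad()
def simulate_rally(model, shot_classifier, initial_states, max_steps=500):
    states = list(initial_states)                      # list of (state_dim,) tensors
    for _ in range(max_steps):
        context = torch.stack(states).unsqueeze(0)     # (1, T, state_dim)
        next_state = model(context)[:, -1].squeeze(0)  # autoregressive next-frame prediction
        states.append(next_state)
        shot = shot_classifier(next_state)             # e.g. "in_play", "winner", "error"
        if shot in ("winner", "error"):                # rally-ending events
            break
    return torch.stack(states)                         # generated tracking sequence
```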


On the Efficacy of Differentially Private Few-shot Image Classification

arXiv.org Machine Learning

There has been significant recent progress in training differentially private (DP) models which achieve accuracy that approaches the best non-private models. These DP models are typically pretrained on large public datasets and then fine-tuned on private downstream datasets that are relatively large and similar in distribution to the pretraining data. However, in many applications including personalization and federated learning, it is crucial to perform well (i) in the few-shot setting, as obtaining large amounts of labeled data may be problematic; and (ii) on datasets from a wide variety of domains for use in various specialist settings. To understand under which conditions few-shot DP can be effective, we perform an exhaustive set of experiments that reveals how the accuracy and vulnerability to attack of few-shot DP image classification models are affected as the number of shots per class, privacy level, model architecture, downstream dataset, and subset of learnable parameters in the model vary. We show that to achieve DP accuracy on par with non-private models, the shots per class must be increased as the privacy level increases. We also show that learning parameter-efficient FiLM adapters under DP is competitive with learning just the final classifier layer or learning all of the network parameters. Finally, we evaluate DP federated learning systems and establish state-of-the-art performance on the challenging FLAIR benchmark.
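
For concreteness, here is a minimal DP-SGD step in the spirit of the fine-tuning described above, assuming only the adapter (e.g. FiLM) parameters are left trainable. It uses microbatch-size-1 per-example gradients, omits privacy accounting, and is not the authors' implementation.

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y, optimizer,
                clip_norm=1.0, noise_multiplier=1.0):
    params = [p for p in model.parameters() if p.requires_grad]  # e.g. FiLM adapters only
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(batch_x, batch_y):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))       # per-example clipping
        for s, g in zip(summed, grads):
            s.add_(g, alpha=scale)
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm
        p.grad = (s + noise) / len(batch_x)                      # noisy averaged gradient
    optimizer.step()
    optimizer.zero_grad()
```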


FiT: Parameter Efficient Few-shot Transfer Learning for Personalized and Federated Image Classification

arXiv.org Machine Learning

Modern deep learning systems are increasingly deployed in situations such as personalization and federated learning where it is necessary to support i) learning on small amounts of data, and ii) communication-efficient distributed training protocols. In this work, we develop FiLM Transfer (FiT), which fulfills these requirements in the image classification setting by combining ideas from transfer learning (fixed pretrained backbones and fine-tuned FiLM adapter layers) and meta-learning (automatically configured Naive Bayes classifiers and episodic training) to yield parameter-efficient models with superior classification accuracy at low-shot. The resulting parameter efficiency is key for enabling few-shot learning, inexpensive model updates for personalization, and communication-efficient federated learning. We experiment with FiT on a wide range of downstream datasets and show that it achieves better classification accuracy than the leading Big Transfer (BiT) algorithm at low-shot and achieves state-of-the-art accuracy on the challenging VTAB-1k benchmark, with fewer than 1% of the updateable parameters. Finally, we demonstrate the parameter efficiency and superior accuracy of FiT in distributed low-shot applications, including model personalization and federated learning, where model update size is an important performance metric.
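
To make the two main ingredients concrete, a simplified sketch is given below: a FiLM layer (per-channel scale and shift, the kind of lightweight adapter fine-tuned on top of a frozen backbone) and a classifier head configured directly from support-set features. The head here is a plain class-mean construction, a stand-in for the Naive Bayes classifier the abstract refers to; dimensions and details are assumptions.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Per-channel scale and shift applied to backbone feature maps."""
    def __init__(self, num_channels):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_channels))
        self.beta = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x):                                   # x: (B, C, H, W)
        return self.gamma.view(1, -1, 1, 1) * x + self.beta.view(1, -1, 1, 1)

def class_mean_head(features, labels, num_classes):
    """Configure a simple head from the support set: one mean vector per class.
    At test time, a query feature f can be scored by -||f - mean_c||^2."""
    means = torch.zeros(num_classes, features.shape[1])
    for c in range(num_classes):
        means[c] = features[labels == c].mean(dim=0)
    return means
```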


Adversarial Attacks are a Surprisingly Strong Baseline for Poisoning Few-Shot Meta-Learners

arXiv.org Artificial Intelligence

This paper examines the robustness of deployed few-shot meta-learning systems when they are fed an imperceptibly perturbed few-shot dataset. We attack amortized meta-learners, which allows us to craft colluding sets of inputs that are tailored to fool the system's learning algorithm when used as training data. Jointly crafted adversarial inputs might be expected to synergistically manipulate a classifier, allowing for very strong data-poisoning attacks that would be hard to detect. We show that in a white box setting, these attacks are very successful and can cause the target model's predictions to become worse than chance. However, in opposition to the well-known transferability of adversarial examples in general, the colluding sets do not transfer well to different classifiers. We explore two hypotheses to explain this: "overfitting" by the attack, and mismatch between the model on which the attack is generated and that to which the attack is transferred. Regardless of the mitigation strategies suggested by these hypotheses, the colluding inputs transfer no better than adversarial inputs that are generated independently in the usual way.
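
The white-box attack described above can be sketched as a PGD-style ascent on the query loss, perturbing the whole support set jointly. The `meta_learner(support_x, support_y, query_x)` call is an assumed interface returning query logits from a single differentiable forward pass (the amortized setting); the step size and L-infinity budget below are illustrative defaults, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def poison_support_set(meta_learner, support_x, support_y, query_x, query_y,
                       epsilon=8 / 255, steps=20, step_size=2 / 255):
    delta = torch.zeros_like(support_x, requires_grad=True)
    for _ in range(steps):
        logits = meta_learner(support_x + delta, support_y, query_x)
        loss = F.cross_entropy(logits, query_y)               # loss on clean query points
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()                  # ascend the query loss
            delta.clamp_(-epsilon, epsilon)                   # keep perturbation imperceptible
            delta.copy_((support_x + delta).clamp(0, 1) - support_x)  # stay in image range
    return (support_x + delta).detach()                       # colluding poisoned support set
```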


Memory Efficient Meta-Learning with Large Images

arXiv.org Machine Learning

Meta-learning approaches to few-shot classification are computationally efficient at test time, requiring just a few optimization steps or a single forward pass to learn a new task, but they remain highly memory-intensive to train. This limitation arises because a task's entire support set, which can contain up to 1000 images, must be processed before an optimization step can be taken. Harnessing the performance gains offered by large images thus requires either parallelizing the meta-learner across multiple GPUs, which may not be available, or making trade-offs between task and image size when memory constraints apply. We improve on both options by proposing LITE, a general and memory-efficient episodic training scheme that enables meta-training on large tasks composed of large images on a single GPU. We achieve this by observing that the gradients for a task can be decomposed into a sum of gradients over the task's training images. This enables us to perform a forward pass on a task's entire training set but realize significant memory savings by back-propagating through only a random subset of these images, which we show is an unbiased approximation of the full gradient. We use LITE to train meta-learners and demonstrate new state-of-the-art accuracy on the real-world ORBIT benchmark and on 3 of the 4 parts of the challenging VTAB+MD benchmark relative to leading meta-learners. LITE also enables meta-learners to be competitive with transfer learning approaches but at a fraction of the test-time computational cost, thus serving as a counterpoint to the recent narrative that transfer learning is all you need for few-shot classification.
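
A minimal PyTorch-style sketch of that observation (assuming a generic `backbone` producing flat embedding vectors; not the authors' code): the full support set is embedded so the task loss is unchanged, but the autograd graph is kept for only a random subset, whose backward contribution is rescaled so the backbone gradient remains unbiased in expectation.

```python
import torch

def lite_embed(backbone, support_images, keep=8):
    n = support_images.shape[0]
    if keep >= n:                                              # small task: back-propagate everything
        return backbone(support_images)
    idx = torch.randperm(n)
    grad_idx, nograd_idx = idx[:keep], idx[keep:]

    z_grad = backbone(support_images[grad_idx])                # graph kept for `keep` images
    z_grad.register_hook(lambda g: g * (n / keep))             # rescale for an unbiased gradient
    with torch.no_grad():
        z_nograd = backbone(support_images[nograd_idx])        # no graph for the rest

    # Reassemble the embeddings in their original order; the task loss sees all of them.
    z_shuffled = torch.cat([z_grad, z_nograd], dim=0)
    inverse = torch.empty_like(idx)
    inverse[idx] = torch.arange(n)
    return z_shuffled[inverse]
```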


Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes

arXiv.org Machine Learning

The goal of this paper is to design image classification systems that, after an initial multi-task training phase, can automatically adapt to new tasks encountered at test time. We introduce a conditional neural process-based approach to the multi-task classification setting for this purpose, and establish connections to the meta-learning and few-shot learning literature. The resulting approach, called CNAPs, comprises a classifier whose parameters are modulated by an adaptation network that takes the current task's dataset as input. We demonstrate that CNAPs achieves state-of-the-art results on the challenging Meta-Dataset benchmark, indicating high-quality transfer learning. We show that the approach is robust, avoiding both over-fitting in low-shot regimes and under-fitting in high-shot regimes. Timing experiments reveal that CNAPs is computationally efficient at test time, as it does not involve gradient-based adaptation. Finally, we show that trained models are immediately deployable to continual learning and active learning, where they can outperform existing approaches that do not leverage transfer learning.
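
The sketch below is a toy version of the modulation idea, not the paper's architecture: a small network pools the task's (embedded) support set into a task representation and emits modulation parameters for the classifier, so adapting to a new task is a single forward pass with no gradient steps. The layer sizes and mean pooling are assumptions.

```python
import torch
import torch.nn as nn

class AdaptationNetwork(nn.Module):
    """Map a task's support-set features to per-feature modulation parameters."""
    def __init__(self, feat_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                 nn.Linear(feat_dim, 2 * feat_dim))

    def forward(self, support_features):              # (N, feat_dim)
        task_repr = support_features.mean(dim=0)      # permutation-invariant pooling
        gamma, beta = self.net(task_repr).chunk(2)
        return gamma, beta                            # applied as gamma * features + beta
```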


Decision-Theoretic Meta-Learning: Versatile and Efficient Amortization of Few-Shot Learning

arXiv.org Machine Learning

This paper develops a general framework for data-efficient and versatile deep learning. The new framework comprises three elements: 1) Discriminative probabilistic models from multi-task learning that leverage shared statistical information across tasks. 2) A novel Bayesian decision-theoretic approach to meta-learning probabilistic inference across many tasks. 3) A fast, flexible, and simple-to-train amortization network that can automatically generalize and extrapolate to a wide range of settings. The VERSA algorithm, a particular instance of the framework, is evaluated on a suite of supervised few-shot learning tasks. VERSA achieves state-of-the-art performance in one-shot learning on Omniglot and miniImageNet, and produces compelling results on a one-shot ShapeNet view reconstruction challenge.
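
As a simplified picture of the amortization element (a deterministic stand-in: the paper treats the classifier weights probabilistically, and the layer sizes here are assumptions), a shared network can map each class's pooled support features directly to that class's linear-classifier weights, so a new few-shot task is handled with forward passes only.

```python
import torch
import torch.nn as nn

class WeightAmortizer(nn.Module):
    """Map pooled per-class support features to that class's classifier weights."""
    def __init__(self, feat_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                 nn.Linear(feat_dim, feat_dim + 1))  # weight vector + bias

    def forward(self, support_features, support_labels, num_classes):
        weights, biases = [], []
        for c in range(num_classes):
            pooled = support_features[support_labels == c].mean(dim=0)
            out = self.net(pooled)
            weights.append(out[:-1])
            biases.append(out[-1])
        # Query logits are then computed as features @ W.T + b.
        return torch.stack(weights), torch.stack(biases)
```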