Williams, Christopher
Score-Optimal Diffusion Schedules
Williams, Christopher, Campbell, Andrew, Doucet, Arnaud, Syed, Saifuddin
Denoising diffusion models (DDMs) generate a path of probability distributions interpolating between a reference Gaussian distribution and a data distribution by incrementally injecting noise into the data. To numerically simulate the sampling process, a discretisation schedule from the reference back towards clean data must be chosen. An appropriate discretisation schedule is crucial to obtain high-quality samples. However, beyond hand-crafted heuristics, a general method for choosing this schedule remains elusive. This paper presents a novel algorithm for adaptively selecting an optimal discretisation schedule with respect to a cost that we derive. Our cost measures the work done by the simulation procedure to transport samples from one point in the diffusion path to the next. Our method does not require hyperparameter tuning and adapts to the dynamics and geometry of the diffusion path. Our algorithm only involves the evaluation of the estimated Stein score, making it scalable to existing pre-trained models at inference time and online during training. We find that our learned schedule recovers performant schedules previously only discovered through manual search and obtains competitive FID scores on image datasets.
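To illustrate the kind of procedure the abstract describes, here is a minimal sketch of equal-cost schedule placement driven only by evaluations of an estimated score. The cost functional (mean squared score norm), the function names, and the grid sizes are placeholder assumptions for illustration, not the cost derived in the paper.

```python
import numpy as np

def adaptive_schedule(score_fn, x_samples, n_steps, t_min=1e-3, t_max=1.0, n_grid=200):
    """Sketch of adaptive discretisation for a diffusion sampler.

    Assumptions (not from the paper): score_fn(x, t) returns the estimated
    Stein score for a batch x at time t, and the mean squared score norm is
    used as a stand-in for the per-time transport cost.
    """
    ts = np.linspace(t_min, t_max, n_grid)
    # Local cost density estimated from score evaluations on a coarse grid.
    cost = np.array([np.mean(np.sum(score_fn(x_samples, t) ** 2, axis=-1)) for t in ts])
    # Cumulative cost along the path, normalised to [0, 1].
    cum = np.concatenate([[0.0], np.cumsum(0.5 * (cost[1:] + cost[:-1]) * np.diff(ts))])
    cum /= cum[-1]
    # Place time points so that every step performs (approximately) equal work.
    targets = np.linspace(0.0, 1.0, n_steps + 1)
    return np.interp(targets, cum, ts)
```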
Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection
Dauncey, Sam, Holmes, Chris, Williams, Christopher, Falck, Fabian
Likelihood-based deep generative models such as score-based diffusion models and variational autoencoders are state-of-the-art machine learning models approximating high-dimensional distributions of data such as images, text, or audio. One of many downstream tasks they can be naturally applied to is out-of-distribution (OOD) detection. However, seminal work by Nalisnick et al., which we reproduce, showed that deep generative models consistently infer higher log-likelihoods for OOD data than for the data they were trained on, marking an open problem. In this work, we analyse using the gradient of a data point with respect to the parameters of the deep generative model for OOD detection, based on the simple intuition that OOD data should have larger gradient norms than training data. We formalise measuring the size of the gradient as approximating the Fisher information metric. We show that the Fisher information matrix (FIM) has large absolute diagonal values, motivating the use of chi-square-distributed, layer-wise gradient norms as features. We combine these features to make a simple, model-agnostic and hyperparameter-free method for OOD detection which estimates the joint density of the layer-wise gradient norms for a given data point. We find that these layer-wise gradient norms are weakly correlated, rendering their combined usage informative, and prove that the layer-wise gradient norms satisfy the principle of (data representation) invariance. Our empirical results indicate that this method outperforms the Typicality test for most deep generative models and image dataset pairings.
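A minimal sketch of the gradient-norm idea: compute per-layer squared gradient norms of the model's negative log-likelihood, then fit a simple density over those features on training data and score new points with it. The per-parameter-tensor layer grouping, the diagonal-Gaussian density, and all names here are simplifying assumptions rather than the paper's exact construction.

```python
import torch

def layerwise_grad_norms(model, nll_fn, x):
    """Per-layer squared gradient norms of the negative log-likelihood.

    Assumptions: nll_fn(model, x) returns a scalar negative log-likelihood for
    one input x; 'layer' here means one entry per parameter tensor, a
    simplification of the paper's grouping.
    """
    model.zero_grad()
    nll_fn(model, x).backward()
    return torch.stack([p.grad.pow(2).sum()
                        for p in model.parameters() if p.grad is not None])

def fit_gradient_norm_score(train_features, eps=1e-6):
    """Fit a diagonal Gaussian to log gradient-norm features gathered on
    training data; return a callable giving an (unnormalised) log-density
    OOD score for new feature vectors."""
    logf = train_features.clamp_min(eps).log()
    mu, sigma = logf.mean(dim=0), logf.std(dim=0) + eps
    def score(features):
        z = (features.clamp_min(eps).log() - mu) / sigma
        return -0.5 * (z ** 2).sum()
    return score
```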
A Unified Framework for U-Net Design and Analysis
Williams, Christopher, Falck, Fabian, Deligiannidis, George, Holmes, Chris, Doucet, Arnaud, Syed, Saifuddin
U-Nets are a go-to, state-of-the-art neural architecture across numerous tasks for continuous signals on a square, such as images and Partial Differential Equations (PDEs); however, their design and architecture are understudied. In this paper, we provide a framework for designing and analysing general U-Net architectures. We present theoretical results which characterise the role of the encoder and decoder in a U-Net, their high-resolution scaling limits and their conjugacy to ResNets via preconditioning. We propose Multi-ResNets, U-Nets with a simplified, wavelet-based encoder without learnable parameters. Further, we show how to design novel U-Net architectures which encode function constraints, natural bases, or the geometry of the data. In diffusion models, our framework enables us to identify that high-frequency information is dominated by noise exponentially faster, and show how U-Nets with average pooling exploit this. In our experiments, we demonstrate how Multi-ResNets achieve competitive and often superior performance compared to classical U-Nets in image segmentation, PDE surrogate modelling, and generative modelling with diffusion models. Our U-Net framework paves the way to study the theoretical properties of U-Nets and design natural, scalable neural architectures for a multitude of problems beyond the square.
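As a rough illustration of a Multi-ResNet-style design, the sketch below pairs a parameter-free, wavelet-like encoder (average pooling to coarser resolutions) with a small learnable decoder. Channel counts, depth, the single input channel, and the use of plain convolutions are illustrative assumptions, not the architecture used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarEncoder(nn.Module):
    """Parameter-free encoder: average pooling projects onto coarser scales."""
    def forward(self, x):
        # Return the input together with two coarser resolutions.
        return [x, F.avg_pool2d(x, 2), F.avg_pool2d(x, 4)]

class TinyMultiResNet(nn.Module):
    """Minimal sketch: non-learnable multi-resolution encoder, learnable decoder.
    Assumes single-channel inputs with side lengths divisible by 4."""
    def __init__(self, ch=16):
        super().__init__()
        self.encode = HaarEncoder()
        self.coarse = nn.Conv2d(1, ch, 3, padding=1)
        self.mid = nn.Conv2d(ch + 1, ch, 3, padding=1)
        self.fine = nn.Conv2d(ch + 1, 1, 3, padding=1)

    def forward(self, x):
        fine, mid, coarse = self.encode(x)
        h = F.relu(self.coarse(coarse))
        h = F.interpolate(h, scale_factor=2)
        h = F.relu(self.mid(torch.cat([h, mid], dim=1)))
        h = F.interpolate(h, scale_factor=2)
        return self.fine(torch.cat([h, fine], dim=1))
```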
A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs
Falck, Fabian, Williams, Christopher, Danks, Dominic, Deligiannidis, George, Yau, Christopher, Holmes, Chris, Doucet, Arnaud, Willetts, Matthew
U-Net architectures are ubiquitous in state-of-the-art deep learning; however, their regularisation properties and relationship to wavelets are understudied. In this paper, we formulate a multi-resolution framework which identifies U-Nets as finite-dimensional truncations of models on an infinite-dimensional function space. We provide theoretical results which prove that average pooling corresponds to projection within the space of square-integrable functions and show that U-Nets with average pooling implicitly learn a Haar wavelet basis representation of the data. We then leverage our framework to identify state-of-the-art hierarchical VAEs (HVAEs), which have a U-Net architecture, as a type of two-step forward Euler discretisation of multi-resolution diffusion processes which flow from a point mass, introducing sampling instabilities. We also demonstrate that HVAEs learn a representation of time which allows for improved parameter efficiency through weight-sharing. We use this observation to achieve state-of-the-art HVAE performance with half the number of parameters of existing models, exploiting the properties of our continuous-time formulation.
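A small numerical sanity check of the pooling/wavelet correspondence mentioned above: 2x average pooling of a 1-D signal agrees with the coarse Haar scaling coefficients up to a constant factor. This toy check is ours, not an experiment from the paper.

```python
import numpy as np

# 2x average pooling of a 1-D signal vs. Haar approximation (scaling) coefficients.
x = np.random.randn(8)
avg_pool = x.reshape(-1, 2).mean(axis=1)
haar_scaling = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # coarse Haar coefficients
# Average pooling is the Haar projection up to the 1/sqrt(2) normalisation.
assert np.allclose(avg_pool, haar_scaling / np.sqrt(2.0))
```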
A Generative Model for Parts-based Object Segmentation
Eslami, S., Williams, Christopher
The Shape Boltzmann Machine (SBM) has recently been introduced as a state-of-the-art model of foreground/background object shape. We extend the SBM to account for the foreground object's parts. Our model, the Multinomial SBM (MSBM), can capture both local and global statistics of part shapes accurately. We combine the MSBM with an appearance model to form a fully generative model of images of objects. Parts-based image segmentations are obtained simply by performing probabilistic inference in the model. We apply the model to two challenging datasets which exhibit significant shape and appearance variability, and find that it obtains results that are comparable to the state-of-the-art.
Multi-task Gaussian Process Learning of Robot Inverse Dynamics
Williams, Christopher, Klanke, Stefan, Vijayakumar, Sethu, Chai, Kian M.
The inverse dynamics problem for a robotic manipulator is to compute the torques needed at the joints to drive it along a given trajectory; it is beneficial to be able to learn this function for adaptive control. A given robot manipulator will often need to be controlled while holding different loads in its end effector, giving rise to a multi-task learning problem. We show how the structure of the inverse dynamics problem gives rise to a multi-task Gaussian process prior over functions, where the inter-task similarity depends on the underlying dynamic parameters. Experiments demonstrate that this multi-task formulation generally improves performance over either learning only on single tasks or pooling the data over all tasks.
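A hedged sketch of a multi-task GP covariance of the kind described: the covariance between task i at input x and task j at input x' factorises into an inter-task term and a shared input kernel. Building the task covariance from per-load parameter vectors is shown only as one plausible choice; the RBF kernel, names, and lengthscale are assumptions.

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    """Shared squared-exponential input kernel (an illustrative choice)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def multitask_kernel(X1, tasks1, X2, tasks2, task_cov, lengthscale=1.0):
    """Multi-task covariance K((x, i), (x', j)) = Kf[i, j] * k_x(x, x').

    Assumption: task_cov (Kf) encodes inter-task similarity; for inverse
    dynamics it could, for example, be built as Kf = P @ P.T from a matrix P
    whose rows are per-load dynamic-parameter vectors.
    """
    return task_cov[np.ix_(tasks1, tasks2)] * rbf(X1, X2, lengthscale)
```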
Multi-task Gaussian Process Prediction
Bonilla, Edwin V., Chai, Kian M., Williams, Christopher
In this paper we investigate multi-task learning in the context of Gaussian Processes (GP). We propose a model that learns a shared covariance function on input-dependent features and a "free-form" covariance matrix over tasks. This allows for good flexibility when modelling inter-task dependencies while avoiding the need for large amounts of data for training. We show that under the assumption of noise-free observations and a block design, predictions for a given task only depend on its target values and therefore a cancellation of inter-task transfer occurs. We evaluate the benefits of our model on two practical applications: a compiler performance prediction problem and an exam score prediction task. Additionally, we make use of GP approximations and properties of our model in order to provide scalability to large data sets.
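A minimal sketch of prediction under this kind of model, using the Kronecker structure of the covariance (free-form task matrix Kf times shared input kernel Kx). The interface and the naive dense solve are illustrative assumptions and ignore the approximations the paper uses for scalability.

```python
import numpy as np

def multitask_gp_predict(Kx, Kf, Y, noise, kx_star, task):
    """Predictive mean for one task at one test input under a block design.

    Model assumption (following the abstract): cov(y_i(x), y_j(x')) =
    Kf[i, j] * Kx(x, x') plus i.i.d. observation noise. Y is (n_points,
    n_tasks); kx_star holds Kx(x*, X) for the test input x*.
    """
    n, m = Y.shape
    K = np.kron(Kf, Kx) + noise * np.eye(n * m)   # full (nm x nm) covariance
    k_star = np.kron(Kf[task], kx_star)           # cross-covariance for (x*, task)
    alpha = np.linalg.solve(K, Y.T.reshape(-1))   # targets stacked task-major
    return k_star @ alpha
```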
Factorial Switching Kalman Filters for Condition Monitoring in Neonatal Intensive Care
Williams, Christopher, Quinn, John, McIntosh, Neil
The observed physiological dynamics of an infant receiving intensive care are affected by many possible factors, including interventions to the baby, the operation of the monitoring equipment and the state of health. The Factorial Switching Kalman Filter can be used to infer the presence of such factors from a sequence of observations, and to estimate the true values where these observations have been corrupted. We apply this model to clinical time series data and show it to be effective in identifying a number of artifactual and physiological patterns.
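A simplified sketch of one filtering step in a switching Kalman filter, collapsing the per-mode posteriors back to a single Gaussian. The factorial structure over several factors is flattened into a single switch variable here, and the mode parameters and names are placeholders, not the paper's model.

```python
import numpy as np

def switching_kf_step(mu, P, y, modes, prior_w):
    """One collapsed filtering step of a switching Kalman filter.

    Simplifying assumptions: each entry of `modes` is a tuple (A, Q, C, R) of
    dynamics/observation matrices for one discrete regime (e.g. a probe
    dropout or a physiological event) and `prior_w` are prior mode weights.
    """
    means, covs, logw = [], [], []
    for (A, Q, C, R), pw in zip(modes, prior_w):
        m_pred, P_pred = A @ mu, A @ P @ A.T + Q              # predict
        S = C @ P_pred @ C.T + R                              # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)                   # Kalman gain
        r = y - C @ m_pred                                    # innovation
        means.append(m_pred + K @ r)
        covs.append((np.eye(len(mu)) - K @ C) @ P_pred)
        logw.append(np.log(pw) - 0.5 * (r @ np.linalg.solve(S, r)
                                        + np.log(np.linalg.det(S))))
    w = np.exp(np.array(logw) - max(logw))
    w /= w.sum()
    # Collapse the mode-conditional posteriors to one Gaussian (moment matching).
    mu_new = sum(wk * mk for wk, mk in zip(w, means))
    P_new = sum(wk * (Pk + np.outer(mk - mu_new, mk - mu_new))
                for wk, Pk, mk in zip(w, covs, means))
    return mu_new, P_new, w
```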
Using the Equivalent Kernel to Understand Gaussian Process Regression
Sollich, Peter, Williams, Christopher
The equivalent kernel [1] is a way of understanding how Gaussian process regression works for large sample sizes based on a continuum limit. In this paper we show (1) how to approximate the equivalent kernel of the widely-used squared exponential (or Gaussian) kernel and related kernels, and (2) how analysis using the equivalent kernel helps to understand the learning curves for Gaussian processes.
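For reference, the continuum-limit expression commonly used in this setting (notation ours, not copied from the paper): the equivalent kernel acts as a low-pass filter whose Fourier transform is the prior power spectrum divided by the spectrum plus a noise-to-sample-density term.

```latex
% Equivalent kernel in the Fourier domain (continuum limit; notation ours):
% S_f is the power spectrum of the GP covariance function, \sigma_n^2 the noise
% variance, and \rho the density of input samples.
\tilde{h}(s) = \frac{S_f(s)}{S_f(s) + \sigma_n^2/\rho},
\qquad
S_f(s) = (2\pi\ell^2)^{D/2} \exp\!\left(-2\pi^2 \ell^2 \lVert s\rVert^2\right)
\quad \text{for the squared-exponential kernel } k(r) = \exp\!\left(-\lVert r\rVert^2/(2\ell^2)\right).
```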