

What would make the UK a better place to live? A new project aims to find out

BBC News

People across the UK are being urged to share their vision for how their community and country's future should look, as part of a major new research project. The National Conversation is being launched with voice notes submitted by high-profile figures, including former footballer Gary Lineker, Chief Rabbi Sir Ephraim Mirvis, and broadcaster Mariella Frostrup. Participants will be asked to complete a survey carried out by researchers from the University of Oxford and leave a 60-second voice note. AI models will then be used to analyse thousands of responses to map what could bring us together.


Supplementary Material for "Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations"

Neural Information Processing Systems

Potential negative societal impacts. Although our work improves the performance of text-video retrieval, it may also reduce the difficulty of cross-modal retrieval of sensitive information on the network, which may raise challenges for protecting information security. Limitations of our work. Iterative approaches are sensitive to initialization and to parameters such as the dimensions and the number of subspaces. In our work, although we use the L2 normalization operation to limit the value range of the parameters, the EM algorithm [3] may still converge to poor solutions. The choice of the number of subspaces also has a relatively significant impact on model performance.



Domain Invariant Representation Learning with Domain Density Transformations

Neural Information Processing Systems

Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to target domains. To tackle this problem, a predominant domain generalization approach is to learn some domain-invariant information for the prediction task, aiming at a good generalization across domains. In this paper, we propose a theoretically grounded method to learn a domain-invariant representation by enforcing the representation network to be invariant under all transformation functions among domains. We next introduce the use of generative adversarial networks to learn such domain transformations in a possible implementation of our method in practice. We demonstrate the effectiveness of our method on several widely used datasets for the domain generalization problem, on all of which we achieve competitive results with state-of-the-art models.


Robot Talk Episode 153 – Origami-inspired robots, with Chenying Liu

Robohub

Claire chatted to Chenying Liu from the University of Oxford about how a robot's physical form can actively contribute to sensing, processing, decision-making, and movement. Chenying Liu is a Junior Research Fellow and an Associate Member of Faculty in the Department of Engineering Science at the University of Oxford. She leads an independent research programme focused on embodied physical intelligence, exploring how robot design can integrate geometry, materials, and control to enhance autonomy and robustness. Her work aims to develop more efficient and resilient robotic systems by embedding intelligence directly into their physical structures. Robot Talk is a weekly podcast that explores the exciting world of robotics, artificial intelligence and autonomous machines.


Beyond Expected Information Gain: Stable Bayesian Optimal Experimental Design with Integral Probability Metrics and Plug-and-Play Extensions

arXiv.org Machine Learning

Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. BOED traditionally selects designs by maximizing expected information gain (EIG), commonly defined through the Kullback-Leibler (KL) divergence. However, classical evaluation of EIG often involves challenging nested expectations, and even advanced variational methods leave the underlying log-density-ratio objective unchanged. As a result, support mismatch, tail underestimation, and rare-event sensitivity remain intrinsic concerns for KL-based BOED. To address these fundamental bottlenecks, we introduce an IPM-based BOED framework that replaces density-based divergences with integral probability metrics (IPMs), including the Wasserstein distance, Maximum Mean Discrepancy, and Energy Distance, resulting in a highly flexible plug-and-play BOED framework. We establish theoretical guarantees showing that IPM-based utilities provide stronger geometry-aware stability under surrogate-model error and prior misspecification than classical EIG-based utilities. We also validate the proposed framework empirically, demonstrating that IPM-based designs yield highly concentrated credible sets. Furthermore, by extending the same sample-based BOED template in a plug-and-play manner to geometry-aware discrepancies beyond the IPM class, illustrated by a neural optimal transport estimator, we achieve accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and advanced variational methods fail.
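
As a rough illustration of why IPMs suit a sample-based setting, the sketch below estimates one of the metrics named above, the squared Maximum Mean Discrepancy, directly from two sets of samples with no density evaluation. It is not the paper's BOED utility: the Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions chosen for this example.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)). Kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth)
            + mean_kernel(ys, ys, bandwidth))


if __name__ == "__main__":
    prior_samples = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]
    posterior_samples = [(3.0, 3.0), (4.0, 3.5), (3.5, 4.0)]
    print(mmd_squared(prior_samples, posterior_samples))
```

Because the estimate only needs samples, it stays finite even when the two distributions have disjoint support, which is the situation where a KL-based quantity becomes problematic.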


Learning to Emulate Chaos: Adversarial Optimal Transport Regularization

arXiv.org Machine Learning

Chaos arises in many complex dynamical systems, from weather to power grids, but is difficult to accurately model using data-driven emulators, including neural operator architectures. For chaotic systems, the inherent sensitivity to initial conditions makes exact long-term forecasts theoretically infeasible, meaning that traditional squared-error losses often fail when trained on noisy data. Recent work has focused on training emulators to match the statistical properties of chaotic attractors by introducing regularization based on handcrafted local features and summary statistics, as well as learned statistics extracted from a diverse dataset of trajectories. In this work, we propose a family of adversarial optimal transport objectives that jointly learn high-quality summary statistics and a physically consistent emulator. We theoretically analyze and experimentally validate a Sinkhorn divergence formulation (2-Wasserstein) and a WGAN-style dual formulation (1-Wasserstein). Our experiments across a variety of chaotic systems, including systems with high-dimensional chaotic attractors, show that emulators trained with our approach exhibit significantly improved long-term statistical fidelity.
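
For readers unfamiliar with the Sinkhorn divergence mentioned above, the following is a minimal sketch of the debiased entropic optimal transport quantity between two small point clouds with uniform weights. It covers only the divergence itself, not the adversarial training or learned summary statistics the abstract describes. The squared Euclidean cost, the regularisation strength `eps`, the iteration count, and all function names are assumptions for illustration.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def entropic_ot(xs, ys, eps=0.5, iters=200):
    """Entropy-regularised OT value between uniform point clouds, via log-domain Sinkhorn."""
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)
    # Squared Euclidean ground cost (an assumption; other costs are possible).
    cost = [[sum((p - q) ** 2 for p, q in zip(x, y)) for y in ys] for x in xs]
    f, g = [0.0] * n, [0.0] * m
    for _ in range(iters):
        # Alternating updates of the two dual potentials.
        f = [-eps * _logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
             for i in range(n)]
        g = [-eps * _logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
             for j in range(m)]
    # Dual objective evaluated at the current potentials.
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=0.5, iters=200):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (entropic_ot(xs, ys, eps, iters)
            - 0.5 * entropic_ot(xs, xs, eps, iters)
            - 0.5 * entropic_ot(ys, ys, eps, iters))


if __name__ == "__main__":
    emulated = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]
    reference = [(3.0, 3.0), (4.0, 3.5), (3.5, 4.0)]
    print(sinkhorn_divergence(emulated, reference))
```

Subtracting the two self-transport terms removes the bias that entropic regularisation introduces, so the divergence is zero when both point clouds coincide.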


What Will It Take to Get A.I. Out of Schools?

The New Yorker

The tech world assumes that A.I.-aided education is necessary and inevitable. A growing number of parents, educators, and cognitive scientists say the opposite. I don't like A.I., and I am raising my children not to like it. I've been telling them for years now that chatbots are manipulative and dangerous, that A.I.-image generators are loosening our collective grip on reality, that large language models are built atop industrial-scale intellectual-property theft. At times, I find myself speaking with my kids about A.I. in the same terms that we might discuss a creepy neighbor who lives down the block: avoid eye contact, cross the street when you walk past his house, and, when in doubt, call on a trusted adult. Yes, I, too, have suspected that the creepy neighbor walks on cloven hooves inside his Yeezy Boosts, but he probably isn't going anywhere--in fact, he keeps buying up properties around town--so just try your best not to engage. Somehow, I was not prepared for the creepy neighbor to start hanging around my kids' schools; somehow, I thought we had until high school.


Adaptive Neural Compilation

Neural Information Processing Systems

This paper proposes an adaptive neural-compilation framework to address the problem of learning efficient programs. Traditional code optimisation strategies used in compilers are based on applying a pre-specified set of transformations that make the code faster to execute without changing its semantics. In contrast, our work involves adapting programs to make them more efficient while considering correctness only on a target input distribution. Our approach is inspired by recent work on differentiable representations of programs. We show that it is possible to compile programs written in a low-level language to a differentiable representation. We also show how programs in this representation can be optimised to make them efficient on a target input distribution. Experimental results demonstrate that our approach enables learning specifically-tuned algorithms for given data distributions with a high success rate.


Gaussian Processes for Survival Analysis

Neural Information Processing Systems

We introduce a semi-parametric Bayesian model for survival analysis. The model is centred on a parametric baseline hazard, and uses a Gaussian process to model variations away from it nonparametrically, as well as dependence on covariates. As opposed to many other methods in survival analysis, our framework does not impose unnecessary constraints on the hazard rate or on the survival function. Furthermore, our model handles left, right and interval censoring mechanisms common in survival analysis. We propose an MCMC algorithm to perform inference and an approximation scheme based on random Fourier features to make computations faster. We report experimental results on synthetic and real data, showing that our model performs better than competing models such as Cox proportional hazards, ANOVA-DDP and random survival forests.
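
The random Fourier feature idea mentioned above can be sketched in a few lines: a finite feature map whose inner products approximate a stationary kernel, so that Gaussian process computations can work with a fixed number of features instead of a full kernel matrix. This is a generic sketch, not the paper's approximation scheme. The RBF kernel, the `lengthscale` parameter, the seed, and the function names are assumptions for illustration.

```python
import math
import random


def sample_rff_params(input_dim, num_features, lengthscale=1.0, seed=0):
    """Draw frequencies from the RBF kernel's spectral density and uniform phases."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(input_dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return weights, phases


def rff_map(x, weights, phases):
    """Feature map z(x) such that z(x) . z(y) approximates the RBF kernel k(x, y)."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, phases)]


def rbf_kernel(x, y, lengthscale=1.0):
    """Exact RBF kernel, used here only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


if __name__ == "__main__":
    weights, phases = sample_rff_params(input_dim=2, num_features=4000)
    x, y = (0.3, -0.2), (1.0, 0.4)
    approx = sum(a * b for a, b in zip(rff_map(x, weights, phases),
                                       rff_map(y, weights, phases)))
    print(approx, rbf_kernel(x, y))
```

The approximation error shrinks roughly with the square root of the number of features, so the feature count trades accuracy against computation.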