Learning to target with network interference

arXiv.org Machine Learning

This paper studies adaptive targeting under network interference in a bandit setting, where treatments applied to one individual may affect others through spillover effects. We consider a linear model in a sparse regime, where each individual's outcome can be affected by at most a few others. We first establish a regret lower bound showing that ignoring the network structure and reducing the problem to a standard linear bandit inevitably leads to inefficient learning, particularly in large populations. To understand how structural information can be leveraged, we analyze regimes with varying levels of knowledge of the interference structure: (1) full support knowledge, (2) knowledge of the column support sizes, and (3) no prior knowledge. For each regime, we establish regret lower bounds characterizing the fundamental limits of learning, and develop algorithms that achieve near-optimal regret. Together, our results provide a unified view of how knowledge of the interference structure governs the efficiency of online learning under interference, and offer practical adaptive targeting algorithms in each setting. Numerical experiments on synthetic and real-world data demonstrate the practical benefits of our algorithms.


Basketball-playing robot built by sixth-formers wins tech competition

BBC News

Meet the UK's very own LeBron James... but not as you know it. Look out LeBron James and Michael Jordan, there's a new basketball champ around. But it was made in Lisburn rather than Los Angeles or Chicago. The name 25416 may not appear on many replica vests, but it can shoot hoops like no-one else. And the basketball-playing robot won a school in Lisburn first prize at the UK-wide First Tech Challenge robotics competition. The team of sixth-formers from Friends' School came top of 48 schools from across the UK at the competition held in London's Copper Box Arena. "Going down and working on it with my friends is honestly one of the highlights of my last year in school," he said.


Distribution-free root cause analysis

arXiv.org Machine Learning

We study distribution-free root cause analysis in multi-stream data, where an evolving underlying system is observed through multiple data streams that may each undergo distributional changes at unknown timepoints. In such settings, the stream exhibiting the earliest change provides a natural starting point for investigating the underlying cause, which we refer to as the root-cause index. Leveraging conformal $p$-values, we propose a novel framework, Conformal Root Cause Analysis (CROC), which constructs finite-sample valid confidence sets for the root-cause index under minimal assumptions: the data streams are independent, and within each stream the pre- and post-change observations are sampled exchangeably from arbitrary and unknown distributions. We further establish a universality property, showing that any distribution-free method for root cause localization can be represented within the CROC framework. In addition, under mild regularity conditions and principled score design, our method yields asymptotically sharp confidence sets that efficiently isolate the root cause. We further extend CROC to efficiently handle cross-stream dependence when present. Extensive simulations demonstrate accurate localization of the root stream, supporting our theoretical guarantees.
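As a rough illustration of the conformal $p$-values the abstract builds on, here is a minimal sketch of the standard split-conformal $p$-value. It is not the CROC procedure itself; the function name `conformal_p_value` and the convention that larger scores are more anomalous are assumptions made for this example.

```python
def conformal_p_value(calibration_scores, test_score):
    """Standard conformal p-value (illustrative sketch, not CROC itself).

    Assumes larger scores mean "more unusual". If the test score is
    exchangeable with the calibration scores, the returned value is
    super-uniform, i.e. P(p <= alpha) <= alpha.
    """
    n = len(calibration_scores)
    # Count calibration scores at least as extreme as the test score.
    at_least_as_extreme = sum(1 for s in calibration_scores if s >= test_score)
    # The +1 terms account for the test point itself.
    return (1 + at_least_as_extreme) / (n + 1)


if __name__ == "__main__":
    calibration = [1.0, 2.0, 3.0, 4.0]
    print(conformal_p_value(calibration, 5.0))  # test score exceeds all calibration scores
    print(conformal_p_value(calibration, 0.0))  # test score is below all calibration scores
```

A small $p$-value suggests the test observation does not look exchangeable with the calibration sample, which is the kind of evidence a change-detection method can aggregate across streams.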


Logging Policy Design for Off-Policy Evaluation

arXiv.org Machine Learning

Off-policy evaluation (OPE) estimates the value of a target treatment policy (e.g., a recommender system) using data collected by a different logging policy. It enables high-stakes experimentation without live deployment, yet in practice accuracy depends heavily on the logging policy used to collect data for computing the estimate. We study how to design logging policies that minimize OPE error for given target policies. We characterize a fundamental reward-coverage tradeoff: concentrating probability mass on high-reward actions reduces variance but risks missing signal on actions the target policy may take. We propose a unifying framework for logging policy design and derive optimal policies in canonical informational regimes where the target policy and reward distribution are (i) known, (ii) unknown, and (iii) partially known through priors or noisy estimates at logging time. Our results provide actionable guidance for firms choosing among multiple candidate recommendation systems. We demonstrate the importance of treatment selection when gathering data for OPE, and describe theoretically optimal approaches when this is a firm's primary objective. We also distill practical design principles for selecting logging policies when operational constraints prevent implementing the theoretical optimum.
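To make the dependence on the logging policy concrete, the sketch below implements inverse propensity scoring (IPS), a standard OPE estimator. The abstract does not say which estimator the paper uses, so IPS is an assumption here, as are the function name `ips_estimate` and the log format of `(action, reward, logging_prob)` tuples.

```python
def ips_estimate(logs, target_probs):
    """Inverse propensity scoring estimate of a target policy's value.

    Illustrative sketch for a context-free setting (an assumption).
    logs: list of (action, reward, logging_prob) tuples, where
          logging_prob is the probability the logging policy gave
          to the logged action.
    target_probs: dict mapping action -> probability under the target policy.
    """
    if not logs:
        raise ValueError("need at least one logged interaction")
    total = 0.0
    for action, reward, logging_prob in logs:
        if logging_prob <= 0.0:
            raise ValueError("logged actions must have positive logging probability")
        # Reweight each reward by how much more (or less) likely the
        # target policy is to take the logged action.
        weight = target_probs.get(action, 0.0) / logging_prob
        total += weight * reward
    return total / len(logs)


if __name__ == "__main__":
    logs = [(0, 1.0, 0.5), (1, 0.0, 0.5), (0, 1.0, 0.5), (1, 0.0, 0.5)]
    print(ips_estimate(logs, {0: 1.0, 1: 0.0}))
```

The weights grow as the logging probability of an action shrinks, so a logging policy that rarely plays actions the target policy favors inflates the estimator's variance.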


Why autism pioneer Uta Frith wants to dismantle the spectrum

New Scientist

Uta Frith seems remarkably cheerful and content for someone who's spent six decades trying and failing to get to grips with her life's obsession. "Very little has stood the test of time," she tells me as we sit down in her living room in a leafy estate in Harrow-on-the-Hill, London. Around us, high-ceilinged walls papered in a luxurious red print are barely visible between rammed bookshelves, several model brains and a collection of abstract art. Frith has been searching for the mechanisms that underpin the enigmatic condition of autism ever since she first met profoundly autistic children in the late 1960s. "We could identify them intuitively, but not really scientifically - and I have to say that this is, unfortunately, still the case." Still, Frith's influence on our ever-shifting understanding of autism has been monumental.


Why the Future of College Could Look Like OnlyFans

The New Yorker

Universities have become generic, one professor and former dean argues. In the A.I. era, students may demand something they can't get elsewhere. Last week, I asked whether, as a forty-six-year-old father of two, I should keep contributing to my children's college funds, or if perhaps some combination of anti-establishment fervor, A.I., and a shifting economy could save me some money. I don't have a particularly good answer yet, at least not one good enough to inspire the purchase of a midlife-crisis car, my son's and daughter's futures be damned. But, after wrestling with that query in Part 1 of what will be a series of articles, I think there may be a better one to ask. The question is not, I think, "How will A.I. change higher education?" I wanted to talk with someone who stood outside the polite consensus which holds that college as we know it will survive, if only because, as I wrote last week, humans will always want to differentiate their children from other people's children.


Optimal Regret for Single Index Bandits

arXiv.org Machine Learning

We study the $\textit{single-index bandit}$ problem, where rewards depend on an unknown one-dimensional projection of high-dimensional contexts through an unknown reward function. This model extends linear and generalized linear bandits to a nonparametric setting, and is particularly relevant when the reward function is not known in advance. While optimal regret guarantees are known for monotone reward functions, the general non-monotone case remains poorly understood, with the best known bound being $\tilde{\mathcal{O}}(T^{3/4})$ (under standard boundedness and Lipschitz assumptions on the reward function [Kang et al., 2025]). We close this gap by establishing the optimal regret for general single-index bandits. We propose a simple two-phase algorithm, namely, Zoomed Single Index Bandit with Upper Confidence Bound ($\texttt{ZoomSIB-UCB}$), that first estimates the projection direction via a normalized Stein estimator, then reduces the problem to a one-dimensional bandit via discretization, and finally applies UCB. This approach achieves a regret of $\tilde{\mathcal{O}}(T^{2/3})$, and improves significantly upon prior work without any additional assumptions. We also prove a matching minimax lower bound of $\tilde{\Omega}(T^{2/3})$, showing that the upper bound is essentially tight. Our upper and lower bounds together provide a sharp characterization of the regret in single-index bandits. Moreover, the empirical results further demonstrate the effectiveness and robustness of our approach.
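The second phase described above, discretizing a one-dimensional index and running UCB over the resulting bins, can be sketched roughly as follows. This is not the authors' ZoomSIB-UCB: it omits the Stein-estimator phase, uses the textbook UCB1 index, and the names `ucb_on_grid` and `reward_fn`, the interval $[-1, 1]$, and the bin count are all assumptions for illustration.

```python
import math


def ucb_on_grid(reward_fn, horizon, num_bins, lo=-1.0, hi=1.0):
    """Run UCB1 over the centers of equal-width bins on [lo, hi].

    Illustrative sketch only: each bin center is treated as one arm,
    and reward_fn(x) returns the (possibly noisy) reward at index x.
    Returns the bin centers and how often each bin was played.
    """
    width = (hi - lo) / num_bins
    centers = [lo + (i + 0.5) * width for i in range(num_bins)]
    counts = [0] * num_bins
    sums = [0.0] * num_bins

    def index(i, t):
        # Empirical mean plus the UCB1 exploration bonus.
        return sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])

    for t in range(1, horizon + 1):
        if t <= num_bins:
            arm = t - 1  # play every bin once before using the index
        else:
            arm = max(range(num_bins), key=lambda i: index(i, t))
        reward = reward_fn(centers[arm])
        counts[arm] += 1
        sums[arm] += reward
    return centers, counts


if __name__ == "__main__":
    # Hypothetical non-monotone reward with a single peak at x = 0.4.
    def peak(x):
        return max(0.0, 1.0 - abs(x - 0.4))

    centers, counts = ucb_on_grid(peak, horizon=2000, num_bins=5)
    print(list(zip(centers, counts)))
```

The number of bins trades discretization error against the cost of exploring more arms; balancing the two is the usual route to rates of the $T^{2/3}$ type for Lipschitz rewards.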


Black-box optimization of noisy functions with unknown smoothness

arXiv.org Machine Learning

We study the problem of black-box optimization of a function $f$ of any dimension, given function evaluations perturbed by noise. The function is assumed to be locally smooth around one of its global optima, but this smoothness is unknown. Our contribution is an adaptive optimization algorithm, POO or parallel optimistic optimization, that is able to deal with this setting. POO performs almost as well as the best known algorithms requiring the knowledge of the smoothness. Furthermore, POO works for a larger class of functions than what was previously considered, especially for functions that are difficult to optimize, in a very precise sense. We provide a finite-time analysis of POO's performance, which shows that its error after $n$ evaluations is at most a factor of $\sqrt{\ln n}$ away from the error of the best known optimization algorithms using the knowledge of the smoothness.



or Sound Symbolism in Vision and Language Models

Neural Information Processing Systems

Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism. Among the many dimensions of meaning, sound symbolism is particularly salient and well-demonstrated with regard to cross-modal associations between language and the visual domain. In this work, we address the question of whether sound symbolism is reflected in vision-and-language models such as CLIP and Stable Diffusion. Using zero-shot knowledge probing to investigate the inherent knowledge of these models, we find strong evidence that they do show this pattern, paralleling the well-known kiki-bouba effect in psycholinguistics. Our work provides a novel method for demonstrating sound symbolism and understanding its nature using computational tools. Our code will be made publicly available.