Deconfounding Scores and Representation Learning for Causal Effect Estimation with Weak Overlap

arXiv.org Machine Learning

Overlap, also known as positivity, is a key condition for causal treatment effect estimation. Many popular estimators suffer from high variance and become brittle when features differ strongly across treatment groups. This is especially challenging in high dimensions: the curse of dimensionality can make overlap implausible. To address this, we propose a class of feature representations called deconfounding scores, which preserve both identification and the target of estimation; the classical propensity and prognostic scores are two special cases. We characterize the problem of finding a representation with better overlap as minimizing an overlap divergence under a deconfounding score constraint. We then derive closed-form expressions for a class of deconfounding scores under a broad family of generalized linear models with Gaussian features and show that prognostic scores are overlap-optimal within this class. We conduct extensive experiments to assess this behavior empirically.


Phase transition on a context-sensitive random language model with short range interactions

arXiv.org Machine Learning

Since the random language model was proposed by E. DeGiuli [Phys. Rev. Lett. 122, 128301], language models have been investigated intensively from the viewpoint of statistical mechanics. Recently, the existence of a Berezinskii--Kosterlitz--Thouless transition was numerically demonstrated in models with long-range interactions between symbols. In statistical mechanics, it has long been known that long-range interactions can induce phase transitions. Therefore, it has remained unclear whether phase transitions observed in language models originate from genuinely linguistic properties that are absent in conventional spin models. In this study, we construct a random language model with short-range interactions and numerically investigate its statistical properties. Our model belongs to the class of context-sensitive grammars in the Chomsky hierarchy and allows explicit reference to contexts. We find that a phase transition occurs even when the model refers only to contexts whose length remains constant with respect to the sentence length. This result indicates that finite-temperature phase transitions in language models are genuinely induced by the intrinsic nature of language, rather than by long-range interactions.


Optimistic Actor-Critic with Parametric Policies for Linear Markov Decision Processes

arXiv.org Machine Learning

Although actor-critic methods have been successful in practice, their theoretical analyses have several limitations. Specifically, existing theoretical work either sidesteps the exploration problem by making strong assumptions or analyzes impractical methods with complicated algorithmic modifications. Moreover, the actor-critic methods analyzed for linear MDPs often employ natural policy gradient and construct "implicit" policies without explicit parameterization. Such policies are computationally expensive to sample from, making the environment interactions inefficient. To that end, we focus on finite-horizon linear MDPs and propose an optimistic actor-critic framework that uses parametric log-linear policies. In particular, we introduce a tractable $\textit{logit-matching}$ regression objective for the actor. For the critic, we use approximate Thompson sampling via Langevin Monte Carlo to obtain optimistic value estimates. We prove that the resulting algorithm achieves $\widetilde{\mathcal{O}}(\epsilon^{-4})$ and $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity in the on-policy and off-policy settings, respectively. Our results match prior theoretical work in achieving the state-of-the-art sample complexity, while our algorithm is more aligned with practice.


Michael Pollan: 'Consciousness is really under siege'

New Scientist

A psychedelic experience set author Michael Pollan on a quest to understand consciousness in his new book A World Appears. Michael Pollan: "Psychedelics have a way of smudging the windshield of experience." Author Michael Pollan has tackled plants, food and psychedelics in bestselling books including The Omnivore's Dilemma and How to Change Your Mind. Now, he has taken on the thorny problem of consciousness. In his latest book, Pollan charts the work of scientists and philosophers, weaving in literary perspectives along the way. He spoke to New Scientist about the value of writing a book where you know less at the end than before you started.


The first quantum computer to break encryption is now shockingly close

New Scientist

A quantum computer capable of breaking the encryption that secures the internet now seems to be just around the corner. Stunning revelations from two research teams outline how it could happen, with one suggesting that the current largest quantum machine is already more than halfway towards the size needed. The two studies concern an encryption technique built around the elliptic curve discrete logarithm problem (ECDLP). The particulars of how this mathematical problem is solved made it a good candidate for encrypting data and led to its widespread adoption for securing lots of internet communication, including bank transactions, and nearly every major cryptocurrency, including bitcoin. It is extremely difficult for conventional computers to crack ECDLP-based encryption, but since the 1990s researchers have known that quantum computers wouldn't have the same trouble.


The best new science-fiction books of April 2026

New Scientist

A collection of stories set in George R. R. Martin's universe and a novel from author James S. A. Corey are among the science-fiction books we're looking forward to this month. I am currently reading the science-fiction classic by Kim Stanley Robinson with the New Scientist Book Club (it's our April read). It's fantastic, so any other trips to the Red Planet are very welcome from my perspective, and I'm looking forward to Charlotte Robinson's thriller. Elsewhere in this month's science fiction, there's horror in space from S. A. Barnes, some resurrected Neanderthals from Douglas Preston and his daughter Aletheia Preston, and ghosts in AI-generated videos from Max Lury. Something for all tastes, I'd say. This near-future space-thriller follows a one-way mission to Mars, as well as the disappearance of a programmer in Hong Kong, who leaves nothing behind but a cryptic warning. As the Argo spaceship heads towards Mars, the crew realise they are being sabotaged.


Refined Detection for Gumbel Watermarking

arXiv.org Machine Learning

We propose a simple detection mechanism for the Gumbel watermarking scheme proposed by Aaronson (2022). The new mechanism is proven to be near-optimal in a problem-dependent sense among all model-agnostic watermarking schemes under the assumption that the next-token distribution is sampled i.i.d.
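
For orientation, the baseline Gumbel scheme that this abstract builds on can be sketched as follows. This is a minimal illustration of the standard sum-of-log statistic usually attributed to Aaronson, not the refined detector proposed in the paper. The SHA-256 keyed hash, the context width `k=3`, the fixed Zipf-like next-token distribution, and all function and key names are assumptions made for this example only.

```python
import hashlib
import numpy as np

def keyed_uniforms(key, context, vocab_size):
    """Pseudo-random uniforms in [0, 1), one per token, fixed by (key, context)."""
    digest = hashlib.sha256(f"{key}|{context}".encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    return np.random.default_rng(seed).random(vocab_size)

def generate(key, probs, length, k=3):
    """Sample tokens with the Gumbel trick: pick argmax_i u_i^(1/p_i)."""
    tokens = [0] * k  # hypothetical fixed prompt of k tokens
    for _ in range(length):
        u = keyed_uniforms(key, tuple(tokens[-k:]), len(probs))
        # argmax of log(u_i) / p_i equals argmax of u_i^(1/p_i)
        tokens.append(int(np.argmax(np.log(u) / probs)))
    return tokens

def detection_score(key, tokens, vocab_size, k=3):
    """Baseline statistic: sum of -log(1 - u) at the observed tokens.

    Without the watermark (or with the wrong key) each term behaves like an
    Exp(1) draw, so the sum is close to the number of scored tokens.
    """
    score = 0.0
    for t in range(k, len(tokens)):
        u = keyed_uniforms(key, tuple(tokens[t - k:t]), vocab_size)
        score += -np.log(1.0 - u[tokens[t]])
    return score

# Toy stand-in for a language model: one fixed Zipf-like distribution.
vocab_size = 50
ranks = np.arange(1, vocab_size + 1)
probs = (1.0 / ranks) / np.sum(1.0 / ranks)

text = generate("secret-key", probs, length=200)
score_true_key = detection_score("secret-key", text, vocab_size)
score_wrong_key = detection_score("other-key", text, vocab_size)
```

Scoring the same text with the generating key should give a clearly larger value than scoring it with an unrelated key, which is the gap a detector thresholds on. A real model would supply a different next-token distribution at every step.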


mlr3mbo: Bayesian Optimization in R

arXiv.org Machine Learning

We present mlr3mbo, a comprehensive and modular toolbox for Bayesian optimization in R. mlr3mbo supports single- and multi-objective optimization, multi-point proposals, batch and asynchronous parallelization, input and output transformations, and robust error handling. While it can be used for many standard Bayesian optimization variants in applied settings, researchers can also construct custom BO algorithms from its flexible building blocks. In addition to an introduction to the software, its design principles, and its building blocks, the paper presents two extensive empirical evaluations of the software on the surrogate-based benchmark suite YAHPO Gym. To identify robust default configurations for both numeric and mixed-hierarchical optimization regimes, and to gain further insights into the respective impacts of individual settings, we run a coordinate descent search over the mlr3mbo configuration space and analyze its results. Furthermore, we demonstrate that mlr3mbo achieves state-of-the-art performance by benchmarking it against a wide range of optimizers, including HEBO, SMAC3, Ax, and Optuna.


Unbounded Density Ratio Estimation and Its Application to Covariate Shift Adaptation

arXiv.org Machine Learning

This paper focuses on the problem of unbounded density ratio estimation -- an understudied yet critical challenge in statistical learning -- and its application to covariate shift adaptation. Much of the existing literature assumes that the density ratio is either uniformly bounded or unbounded but known exactly. These conditions are often violated in practice, creating a gap between theoretical guarantees and real-world applicability. In contrast, this work directly addresses unbounded density ratios and integrates them into importance weighting for effective covariate shift adaptation. We propose a three-step estimation method that leverages unlabeled data from both the source and target distributions: (1) estimating a relative density ratio; (2) applying a truncation operation to control its unboundedness; and (3) transforming the truncated estimate back into the standard density ratio. The estimated density ratio is then employed as importance weights for regression under covariate shift. We establish rigorous, non-asymptotic convergence guarantees for both the proposed density ratio estimator and the resulting regression function estimator, demonstrating optimal or near-optimal convergence rates. Our findings offer new theoretical insights into density ratio estimation and learning under covariate shift, extending classical learning theory to more practical and challenging scenarios.
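
The three steps above can be sketched on a toy problem. This is a rough illustration under stated assumptions, not the paper's estimator: it assumes the common relative density ratio $r_\alpha = p_t / (\alpha p_t + (1-\alpha) p_s)$, which is bounded by $1/\alpha$ and inverts to $r = (1-\alpha) r_\alpha / (1 - \alpha r_\alpha)$. It also plugs in known Gaussian densities where the paper would estimate $r_\alpha$ from unlabeled samples, and the truncation level `cap` is a hypothetical choice.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def relative_ratio(p_target, p_source, alpha):
    """Step 1: relative density ratio, bounded above by 1 / alpha."""
    return p_target / (alpha * p_target + (1.0 - alpha) * p_source)

def truncate(r_alpha, cap):
    """Step 2: clip the relative ratio at a level strictly below 1 / alpha."""
    return np.minimum(r_alpha, cap)

def to_standard_ratio(r_alpha, alpha):
    """Step 3: map the relative ratio back to the standard ratio p_t / p_s."""
    return (1.0 - alpha) * r_alpha / (1.0 - alpha * r_alpha)

alpha = 0.1
cap = 0.9 / alpha  # hypothetical truncation level (assumption)

x = np.linspace(-3.0, 6.0, 10)
p_s = gaussian_pdf(x, mean=0.0, std=1.0)  # source density
p_t = gaussian_pdf(x, mean=1.0, std=2.0)  # wider target: p_t / p_s is unbounded

r_alpha = relative_ratio(p_t, p_s, alpha)  # stand-in for an estimated ratio
r_trunc = truncate(r_alpha, cap)
weights = to_standard_ratio(r_trunc, alpha)  # importance weights for regression
```

Where the truncation is inactive the weights equal the true ratio `p_t / p_s`. In the tail they are capped at `(1 - alpha) * cap / (1 - alpha * cap)`, which is 81 for these settings, whereas the untruncated ratio exceeds a million at `x = 6`.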


Kinetic Langevin Splitting Schemes for Constrained Sampling

arXiv.org Machine Learning

Constrained sampling is an important and challenging task in computational statistics, concerned with generating samples from a distribution under certain constraints. There are numerous types of algorithm aimed at this task, ranging from general Markov chain Monte Carlo to unadjusted Langevin methods. In this article we propose a series of new sampling algorithms based on the latter, specifically kinetic Langevin dynamics. Our algorithms are motivated by advanced numerical methods known as splitting schemes, including the BU and BAO families. Their advantage lies in their favorable strong order (bias) rates and computational efficiency. In particular, we provide a number of theoretical insights, including Wasserstein contraction and convergence results. We demonstrate favorable results, such as improved complexity bounds over existing non-splitting methodologies. Our results are verified through numerical experiments on a range of models with constraints, including a toy example and Bayesian linear regression.