

Conditional Counterfactual Mean Embeddings: Doubly Robust Estimation and Learning Rates

arXiv.org Machine Learning

A complete understanding of heterogeneous treatment effects involves characterizing the full conditional distribution of potential outcomes. To this end, we propose the Conditional Counterfactual Mean Embeddings (CCME), a framework that embeds conditional distributions of counterfactual outcomes into a reproducing kernel Hilbert space (RKHS). Under this framework, we develop a two-stage meta-estimator for CCME that accommodates any RKHS-valued regression in each stage. Based on this meta-estimator, we develop three practical CCME estimators: (1) Ridge Regression estimator, (2) Deep Feature estimator that parameterizes the feature map by a neural network, and (3) Neural-Kernel estimator that performs RKHS-valued regression, with the coefficients parameterized by a neural network. We provide finite-sample convergence rates for all estimators, establishing that they possess the double robustness property. Our experiments demonstrate that our estimators accurately recover distributional features including multimodal structure of conditional counterfactual distributions.


Principles of Lipschitz continuity in neural networks

arXiv.org Machine Learning

Deep learning has achieved remarkable success across a wide range of domains, significantly expanding the frontiers of what is achievable in artificial intelligence. Yet, despite these advances, critical challenges remain -- most notably, ensuring robustness to small input perturbations and generalization to out-of-distribution data. These challenges underscore the need to understand the fundamental principles that govern robustness and generalization. Among the theoretical tools available, Lipschitz continuity plays a pivotal role: it quantifies the worst-case sensitivity of a network's outputs to small input perturbations. While its importance is widely acknowledged, prior research has predominantly focused on empirical regularization approaches based on Lipschitz constraints, leaving the underlying principles less explored. This thesis seeks to advance a principled understanding of Lipschitz continuity in neural networks within the paradigm of machine learning, examined from two complementary perspectives: an internal perspective -- focusing on the temporal evolution of Lipschitz continuity in neural networks during training (i.e., training dynamics); and an external perspective -- investigating how Lipschitz continuity modulates the behavior of neural networks with respect to features in the input data, particularly its role in governing frequency signal propagation (i.e., modulation of frequency signal propagation).
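To make "worst-case sensitivity" concrete, the following is a minimal sketch, not taken from the thesis. It assumes a small ReLU network with hypothetical random weights and measures distances in the infinity norm. Because ReLU is 1-Lipschitz, the product of the layers' induced infinity norms (maximum absolute row sums) is an upper bound on the network's Lipschitz constant, and the sketch compares that bound with the largest output-to-input change ratio seen on random nearby pairs.

```python
import random

def linf_operator_norm(W):
    # Induced infinity norm of a matrix: maximum absolute row sum.
    return max(sum(abs(w) for w in row) for row in W)

def relu_mlp(x, weights):
    # Bias-free MLP with ReLU after every layer except the last.
    h = list(x)
    for i, W in enumerate(weights):
        h = [sum(w * v for w, v in zip(row, h)) for row in W]
        if i < len(weights) - 1:
            h = [max(0.0, v) for v in h]
    return h

def lipschitz_upper_bound(weights):
    # ReLU is 1-Lipschitz, so the product of layer norms bounds the whole network.
    bound = 1.0
    for W in weights:
        bound *= linf_operator_norm(W)
    return bound

def linf_dist(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

rng = random.Random(0)
# Hypothetical weights for illustration: a 3 -> 4 -> 2 network.
weights = [
    [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)],
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)],
]
bound = lipschitz_upper_bound(weights)

# Empirical sensitivity: largest observed ratio of output change to input change.
worst_ratio = 0.0
for _ in range(1000):
    x = [rng.uniform(-1, 1) for _ in range(3)]
    y = [xi + rng.uniform(-1e-2, 1e-2) for xi in x]
    d_in = linf_dist(x, y)
    if d_in > 0:
        d_out = linf_dist(relu_mlp(x, weights), relu_mlp(y, weights))
        worst_ratio = max(worst_ratio, d_out / d_in)

print(f"upper bound: {bound:.3f}, worst observed ratio: {worst_ratio:.3f}")
```

The observed ratio can sit well below the bound; norm-product bounds of this kind are often loose, which is one reason the quantity is hard to study through constraints alone.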


A principled framework for uncertainty decomposition in TabPFN

arXiv.org Machine Learning

TabPFN is a transformer that achieves state-of-the-art performance on supervised tabular tasks by amortizing Bayesian prediction into a single forward pass. However, there is currently no method for uncertainty decomposition in TabPFN. Because it behaves, in an idealised limit, as a Bayesian in-context learner, we cast the decomposition challenge as a Bayesian predictive inference (BPI) problem. The main computational tool in BPI, predictive Monte Carlo, is challenging to apply here as it requires simulating unmodeled covariates. We therefore pursue the asymptotic alternative, filling a gap in the theory for supervised settings by proving a predictive CLT under quasi-martingale conditions. We derive variance estimators determined by the volatility of predictive updates along the context. The resulting credible bands are fast to compute, target epistemic uncertainty, and achieve near-nominal frequentist coverage. For classification, we further obtain an entropy-based uncertainty decomposition.


GeoIB: Geometry-Aware Information Bottleneck via Statistical-Manifold Compression

arXiv.org Machine Learning

Information Bottleneck (IB) is widely used, but in deep learning, it is usually implemented through tractable surrogates, such as variational bounds or neural mutual information (MI) estimators, rather than directly controlling the MI I(X;Z) itself. The looseness of these bounds and the estimator-dependent bias can make IB "compression" only indirectly controlled and optimization fragile. We revisit the IB problem through the lens of information geometry and propose a Geometric Information Bottleneck (GeoIB) that dispenses with MI estimation. We show that I(X;Z) and I(Z;Y) admit exact projection forms as minimal Kullback-Leibler (KL) distances from the joint distributions to their respective independence manifolds. Guided by this view, GeoIB controls information compression with two complementary terms: (i) a distribution-level Fisher-Rao (FR) discrepancy, which matches KL to second order and is reparameterization-invariant; and (ii) a geometry-level Jacobian-Frobenius (JF) term that provides a local capacity-type upper bound on I(X;Z) by penalizing pullback volume expansion of the encoder. We further derive a natural-gradient optimizer consistent with the FR metric and prove that the standard additive natural-gradient step is first-order equivalent to the geodesic update. In extensive experiments on popular datasets, GeoIB achieves a better trade-off between prediction accuracy and compression ratio in the information plane than mainstream IB baselines. GeoIB improves invariance and optimization stability by unifying distributional and geometric regularization under a single bottleneck multiplier. The source code of GeoIB is released at "https://anonymous.4open.science/r/G-IB-0569".
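The projection form of I(X;Z) mentioned in the abstract can be checked directly in the discrete case. The derivation below is the standard identity, written under the assumption of finite alphabets; the product distribution q_X ⊗ q_Z and its symbols are introduced here for illustration and are not taken from the paper, whose exact statement may be more general.

```latex
\begin{align}
\mathrm{KL}\!\left(p_{XZ} \,\middle\|\, q_X \otimes q_Z\right)
  &= \sum_{x,z} p_{XZ}(x,z)\,\log\frac{p_{XZ}(x,z)}{q_X(x)\,q_Z(z)} \\
  &= \sum_{x,z} p_{XZ}(x,z)\,\log\frac{p_{XZ}(x,z)}{p_X(x)\,p_Z(z)}
     + \sum_{x} p_X(x)\,\log\frac{p_X(x)}{q_X(x)}
     + \sum_{z} p_Z(z)\,\log\frac{p_Z(z)}{q_Z(z)} \\
  &= I(X;Z) + \mathrm{KL}\!\left(p_X \,\middle\|\, q_X\right)
            + \mathrm{KL}\!\left(p_Z \,\middle\|\, q_Z\right)
  \;\ge\; I(X;Z).
\end{align}
```

Both extra KL terms are non-negative and vanish exactly when q_X = p_X and q_Z = p_Z, so the minimum over the independence manifold is attained at the product of the marginals and equals I(X;Z). The same argument applies to I(Z;Y).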


Condemnation of Elon Musk's AI chatbot reached 'tipping point' after French raid, Australia's eSafety chief says

The Guardian

Australia's eSafety commissioner has welcomed the global regulatory focus on Elon Musk's X after this week's raid in France. The eSafety commissioner, Julie Inman Grant, says global regulatory focus on Elon Musk's X has reached a "tipping point" after a raid of the company's offices in France this week. The raid on Tuesday was part of an investigation that included alleged offences of complicity in the possession and organised distribution of child abuse images, violation of image rights through sexualised deepfakes, and denial of crimes against humanity. A number of other countries - including the UK and Australia - and the EU have launched investigations in the past few weeks into X after its AI chatbot, Grok, was used to mass-produce sexualised images of women and children in response to user requests.


Generator-based Graph Generation via Heat Diffusion

arXiv.org Machine Learning

Graph generative modelling has become an essential task due to the wide range of applications in chemistry, biology, social networks, and knowledge representation. In this work, we propose a novel framework for generating graphs by adapting the Generator Matching (arXiv:2410.20587) paradigm to graph-structured data. We leverage the graph Laplacian and its associated heat kernel to define a continuous-time diffusion on each graph. The Laplacian serves as the infinitesimal generator of this diffusion, and its heat kernel provides a family of conditional perturbations of the initial graph. A neural network is trained to match this generator by minimising a Bregman divergence between the true generator and a learnable surrogate. Once trained, the surrogate generator is used to simulate a time-reversed diffusion process to sample new graph structures. Our framework unifies and generalises existing diffusion-based graph generative models, injecting domain-specific inductive bias via the Laplacian, while retaining the flexibility of neural approximators. Experimental studies demonstrate that our approach captures structural properties of real and synthetic graphs effectively.
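As a rough illustration of the heat kernel the abstract refers to, the sketch below computes exp(-tL) for a small undirected graph using only the standard library. It assumes the combinatorial Laplacian L = D - A, no self-loops, and the sign convention du/dt = -Lu; the paper's exact construction, and everything about the learned generator, is not reproduced here. The helper names are hypothetical.

```python
def laplacian(adj):
    # Combinatorial Laplacian L = D - A for an adjacency matrix without self-loops.
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def heat_kernel(L, t, terms=30, squarings=6):
    # Approximates exp(-t L) with a truncated Taylor series plus scaling and squaring.
    n = len(L)
    scale = t / (2 ** squarings)
    M = [[-scale * L[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # After this step, term holds M^k / k!.
        term = [[v / k for v in row] for row in matmul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

# Path graph on three nodes: 0 - 1 - 2.
path3 = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
H = heat_kernel(laplacian(path3), t=0.5)

# Row i of H is the distribution of heat at time t that started as a unit mass on node i.
for row in H:
    print([round(v, 4) for v in row])
```

Each row sums to one because the Laplacian annihilates the constant vector, and as t grows every row approaches the uniform distribution on a connected graph. In production code, a library matrix exponential or an eigendecomposition would replace the hand-rolled series.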


Online Conformal Prediction via Universal Portfolio Algorithms

arXiv.org Machine Learning

Online conformal prediction (OCP) seeks prediction intervals that achieve long-run $1-α$ coverage for arbitrary (possibly adversarial) data streams, while remaining as informative as possible. Existing OCP methods often require manual learning-rate tuning to work well, and may also require algorithm-specific analyses. Here, we develop a general regret-to-coverage theory for interval-valued OCP based on the $(1-α)$-pinball loss. Our first contribution is to identify \emph{linearized regret} as a key notion, showing that controlling it implies coverage bounds for any online algorithm. This relies on a black-box reduction that depends only on the Fenchel conjugate of an upper bound on the linearized regret. Building on this theory, we propose UP-OCP, a parameter-free method for OCP, via a reduction to a two-asset portfolio selection problem, leveraging universal portfolio algorithms. We show strong finite-time bounds on the miscoverage of UP-OCP, even for polynomially growing predictions. Extensive experiments support that UP-OCP delivers consistently better size/coverage trade-offs than prior online conformal baselines.
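For context, the sketch below shows the $(1-α)$-pinball loss and the familiar online subgradient update on it, which is the kind of learning-rate-dependent baseline the abstract contrasts with. It is not UP-OCP: the fixed step size `lr`, the synthetic score stream, and the function names are assumptions made for illustration.

```python
import random

def pinball_loss(q, s, alpha):
    # (1 - alpha)-pinball loss of threshold q on score s.
    tau = 1.0 - alpha
    return tau * max(s - q, 0.0) + (1.0 - tau) * max(q - s, 0.0)

def online_quantile_tracker(scores, alpha, lr, q0=0.0):
    # Online subgradient descent on the pinball loss:
    # raise q by lr * (1 - alpha) after a miss, lower it by lr * alpha after a cover.
    q = q0
    thresholds, errors = [], []
    for s in scores:
        thresholds.append(q)
        err = 1.0 if s > q else 0.0   # 1 means the score fell outside the set
        errors.append(err)
        q += lr * (err - alpha)
    return thresholds, errors

rng = random.Random(0)
# Hypothetical nonconformity scores, e.g. absolute residuals of some predictor.
scores = [abs(rng.gauss(0.0, 1.0)) for _ in range(5000)]
thresholds, errors = online_quantile_tracker(scores, alpha=0.1, lr=0.05)
miscoverage = sum(errors) / len(errors)
avg_loss = sum(pinball_loss(q, s, 0.1) for q, s in zip(thresholds, scores)) / len(scores)
print(f"long-run miscoverage: {miscoverage:.3f} (target 0.1), mean pinball loss: {avg_loss:.3f}")
```

Summing the update shows that the long-run miscoverage differs from α by (q_final - q0) / (lr · T), so coverage follows whenever the threshold stays bounded. How tight the resulting intervals are still depends on `lr`, which is the tuning burden a parameter-free method aims to remove.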


Watch an albatross give its brand-new chick a very careful cleanup

Popular Science

The massive seabirds' powerful beaks can be surprisingly gentle when preening their babies. As thousands of birds nest in the warm sun of Midway Atoll, some tend to their new chicks. In a video posted by Friends of Midway Atoll (FOMA), one of the newest Mōlī (Laysan albatross) chicks gets a careful "beak preen" from its parent. According to FOMA, their beaks are essential survival tools, but can also be used with "precision and gentleness, applying only the pressure needed to tend to a fragile chick."


Teen discovers Australia's oldest dinosaur fossil--almost 70 years ago

Popular Science

An early sauropodomorph likely made the 230-million-year-old footprint. In 1958, an Australian teenager named Bruce Runnegar uncovered a mysterious dinosaur footprint during a visit to a quarry with school friends. He kept the fossil for years, eventually becoming a paleontologist himself. Over six decades later, the prehistoric print is now ready for its close-up.


Viral AI personal assistant seen as step change – but experts warn of risks

The Guardian

One OpenClaw user said he recently allowed the bot to delete 75,000 of his old emails. OpenClaw is billed as 'the AI that actually does things' and needs almost no input to potentially wreak havoc. A new viral AI personal assistant will handle your email inbox, trade away your entire stock portfolio and text your wife "good morning" and "goodnight" on your behalf. OpenClaw, formerly known as Moltbot, and before that known as Clawdbot (until the AI firm Anthropic requested it rebrand due to similarities with its own product Claude), bills itself as "the AI that actually does things": a personal assistant that takes instructions via messaging apps such as WhatsApp or Telegram. Developed last November, it now has nearly 600,000 downloads and has gone viral among a niche ecosystem of the AI obsessed who say it represents a step change in the capabilities of AI agents, or even an "AGI moment" - that is, a revelation of generally intelligent AI. "It only does exactly what you tell it to do and exactly what you give it access to," said Ben Yorke, who works with the AI vibe trading platform Starchild and recently allowed the bot to delete, he claims, 75,000 of his old emails while he was in the shower.