neighborhood


Power-one sequential tests exist for weakly compact $\mathscr P$ against $\mathscr P^c$

Ram, Ashwin, Ramdas, Aaditya

arXiv.org Machine Learning

Suppose we observe data from a distribution $P$ and we wish to test the composite null hypothesis that $P\in\mathscr P$ against a composite alternative $P\in \mathscr Q\subseteq \mathscr P^c$. Herbert Robbins and coauthors pointed out around 1970 that, while no batch test can have a level $\alpha\in(0,1)$ and power equal to one, sequential tests can be constructed with this fantastic property. Since then, and especially in the last decade, a plethora of sequential tests have been developed for a wide variety of settings. However, the literature has not yet provided a clean and general answer as to when such power-one sequential tests exist. This paper provides a remarkably general sufficient condition (that we also prove is not necessary). Focusing on i.i.d. laws in Polish spaces without any further restriction, we show that there exists a level-$\alpha$ sequential test for any weakly compact $\mathscr P$, that is power-one against $\mathscr P^c$ (or any subset thereof). We show how to aggregate such tests into an $e$-process for $\mathscr P$ that increases to infinity under $\mathscr P^c$. We conclude by building an $e$-process that is asymptotically relatively growth rate optimal against $\mathscr P^c$, an extremely powerful result.
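The flavor of a power-one sequential test can be seen in a classic Robbins-style betting construction (a textbook special case, not the paper's general construction for weakly compact nulls): a nonnegative wealth process is a supermartingale under the null, so by Ville's inequality it exceeds $1/\alpha$ with probability at most $\alpha$, yet it grows to infinity under a fixed alternative. The bounded null $E[X]\le 0$, the bet size `lam`, and the data stream below are all illustrative assumptions.

```python
import random

def e_process_test(stream, alpha=0.05, lam=0.1, max_n=10000):
    """Sequential level-alpha test of H0: E[X] <= 0 for X in [-1, 1].

    Wealth E_t = prod_i (1 + lam * x_i) is a nonnegative supermartingale
    under H0, so P(sup_t E_t >= 1/alpha) <= alpha (Ville's inequality).
    Under a fixed alternative with E[X] > 0 the wealth grows exponentially,
    giving power one. Illustrative sketch only.
    """
    wealth, t = 1.0, 0
    for t, x in enumerate(stream, start=1):
        wealth *= 1.0 + lam * x
        if wealth >= 1.0 / alpha:
            return ("reject", t)       # stop: e-process crossed 1/alpha
        if t >= max_n:
            break
    return ("fail to reject", t)

random.seed(0)
# Alternative holds: uniform(-0.8, 1.0) has mean 0.1 > 0, so the test
# should eventually stop and reject.
alt_stream = (random.uniform(-0.8, 1.0) for _ in iter(int, 1))
decision, stopping_time = e_process_test(alt_stream)
```

Note the one-sided guarantee: the test never needs to stop under the null, which is exactly how it escapes the batch-testing impossibility mentioned in the abstract.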


Graph-Informed Adversarial Modeling: Infimal Subadditivity of Interpolative Divergences

Birmpa, Panagiota, Hall, Eric Joseph

arXiv.org Machine Learning

We study adversarial learning when the target distribution factorizes according to a known Bayesian network. For interpolative divergences, including $(f,\Gamma)$-divergences, we prove a new infimal subadditivity principle showing that, under suitable conditions, a global variational discrepancy is controlled by an average of family-level discrepancies aligned with the graph. In an additive regime, the surrogate is exact. This closes a theoretical gap in the literature; existing subadditivity results justify graph-informed adversarial learning for classical discrepancies, but not for interpolative divergences, where the usual factorization argument breaks down. In turn, we provide a justification for replacing a standard, graph-agnostic GAN with a monolithic discriminator by a graph-informed GAN (GiGAN) with localized family-level discriminators, without requiring the optimizer itself to factorize according to the graph. We also obtain parallel results for integral probability metrics and proximal optimal transport divergences, identify natural discriminator classes for which the theory applies, and present experiments showing improved stability and structural recovery relative to graph-agnostic baselines.
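To make the surrogate concrete, the sketch below compares a global discrepancy between two samples to the average of family-level discrepancies aligned with a chain-structured Bayesian network. It uses an RBF MMD (an integral probability metric, one of the classes the paper treats in parallel) rather than an $(f,\Gamma)$-divergence, and the chain network, sample sizes, and shift are invented for the example.

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD with an RBF kernel; an IPM-type
    stand-in for the interpolative divergences in the paper."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
n = 500

# Chain Bayesian network X1 -> X2 -> X3 for both "data" P and "model" Q.
def sample(shift):
    x1 = rng.normal(shift, 1, n)
    x2 = 0.5 * x1 + rng.normal(0, 1, n)
    x3 = 0.5 * x2 + rng.normal(0, 1, n)
    return np.stack([x1, x2, x3], axis=1)

P, Q = sample(0.0), sample(0.5)

global_mmd2 = rbf_mmd2(P, Q)
# Family-level discrepancies aligned with the graph: one per node
# together with its parents, each seen by a localized discriminator.
families = [[0], [0, 1], [1, 2]]
family_mmd2 = [rbf_mmd2(P[:, f], Q[:, f]) for f in families]
surrogate = float(np.mean(family_mmd2))
```

Each family term only looks at low-dimensional marginals, which is the practical payoff of the graph-informed setup: localized discriminators in place of one monolithic high-dimensional one.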


Statistical Guarantees for Distributionally Robust Optimization with Optimal Transport and OT-Regularized Divergences

Birrell, Jeremiah, Shen, Xiaoxi

arXiv.org Machine Learning

We study finite-sample statistical performance guarantees for distributionally robust optimization (DRO) with optimal transport (OT) and OT-regularized divergence model neighborhoods. Specifically, we derive concentration inequalities for supervised learning via DRO-based adversarial training, as commonly employed to enhance the adversarial robustness of machine learning models. Our results apply to a wide range of OT cost functions, beyond the $p$-Wasserstein case studied by previous authors. In particular, our results are the first to: 1) cover soft-constraint norm-ball OT cost functions; soft-constraint costs have been shown empirically to enhance robustness when used in adversarial training, 2) apply to the combination of adversarial sample generation and adversarial reweighting that is induced by using OT-regularized $f$-divergence model neighborhoods; the added reweighting mechanism has also been shown empirically to further improve performance. In addition, even in the $p$-Wasserstein case, our bounds exhibit better behavior as a function of the DRO neighborhood size than previous results when applied to the adversarial setting.
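The DRO inner maximization behind adversarial training can be sketched as projected gradient ascent on the loss inside an $\ell_2$ ball around each sample (the hard-constraint OT cost; a soft-constraint cost would subtract a transport penalty from the objective instead of projecting). The data, radius, and step sizes below are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def grad_loss(w, X, y):
    """Gradient of the logistic loss wrt the inputs X (labels y in {-1,+1})."""
    s = -y * (X @ w)                       # negative margin per sample
    sig = 1.0 / (1.0 + np.exp(-s))         # d loss / d margin magnitude
    return (-y * sig)[:, None] * w[None, :]

def inner_max(w, X, y, eps=0.3, steps=10, lr=0.1):
    """Approximate the DRO inner sup by perturbing each sample within an
    l2 ball of radius eps (hard-constraint OT cost). Illustrative sketch."""
    delta = np.zeros_like(X)
    for _ in range(steps):
        delta += lr * grad_loss(w, X + delta, y)   # ascend the loss
        norms = np.linalg.norm(delta, axis=1, keepdims=True)
        delta *= np.minimum(1.0, eps / np.maximum(norms, 1e-12))  # project
    return X + delta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = np.ones(5)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=200))
w = w_true.copy()

X_adv = inner_max(w, X, y)

def mean_loss(Xm):
    return np.log1p(np.exp(-y * (Xm @ w))).mean()
```

The OT-regularized $f$-divergence neighborhoods in the abstract would additionally reweight the perturbed samples; this sketch shows only the sample-perturbation half of that combination.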


On the Asymptotics of Self-Supervised Pre-training: Two-Stage M-Estimation and Representation Symmetry

Tinati, Mohammad, Tu, Stephen

arXiv.org Machine Learning

Self-supervised pre-training, where large corpora of unlabeled data are used to learn representations for downstream fine-tuning, has become a cornerstone of modern machine learning. While a growing body of theoretical work has begun to analyze this paradigm, existing bounds leave open the question of how sharp the current rates are, and whether they accurately capture the complex interaction between pre-training and fine-tuning. In this paper, we address this gap by developing an asymptotic theory of pre-training via two-stage M-estimation. A key challenge is that the pre-training estimator is often identifiable only up to a group symmetry, a feature common in representation learning that requires careful treatment. We address this issue using tools from Riemannian geometry to study the intrinsic parameters of the pre-training representation, which we link with the downstream predictor through a notion of orbit-invariance, precisely characterizing the limiting distribution of the downstream test risk. We apply our main result to several case studies, including spectral pre-training, factor models, and Gaussian mixture models, and obtain substantial improvements in problem-specific factors over prior art when applicable.
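The two-stage structure, and the group symmetry that makes the first stage only partially identifiable, can be seen in a toy example: spectral pre-training (PCA) learns a basis only up to an orthogonal transform, yet the downstream least-squares predictor is invariant along that orbit. The dimensions, data, and PCA-plus-regression pipeline below are invented for illustration, not the paper's estimators.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stage 1 (pre-training): spectral representation from unlabeled data.
# A k-dim PCA basis is identifiable only up to an orthogonal transform,
# the kind of group symmetry the asymptotic theory must quotient out.
X_unlab = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 10))
k = 3
_, _, Vt = np.linalg.svd(X_unlab - X_unlab.mean(0), full_matrices=False)
B = Vt[:k].T                     # 10 x k representation map

# Stage 2 (fine-tuning): least squares on the learned features.
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

def fit_predict(basis):
    Z = X @ basis
    theta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ theta

pred = fit_predict(B)
# Orbit-invariance: rotating the basis by any orthogonal R leaves the
# downstream predictions (and hence the test risk) unchanged.
R, _ = np.linalg.qr(rng.normal(size=(k, k)))
pred_rot = fit_predict(B @ R)
```

The invariance holds because the rotated features span the same column space, so the fitted values are the same projection of $y$; this is the orbit-invariance notion the abstract links to the downstream predictor.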


Discriminative Gaifman Models

Niepert, Mathias

Neural Information Processing Systems

Gaifman models learn feature representations bottom up from representations of locally connected and bounded-size regions of knowledge bases (KBs). Considering local and bounded-size neighborhoods of knowledge bases renders logical inference and learning tractable, mitigates the problem of overfitting, and facilitates weight sharing. Gaifman models sample neighborhoods of knowledge bases so as to make the learned relational models more robust to missing objects and relations, which is a common situation in open-world KBs. We present the core ideas of Gaifman models and apply them to large-scale relational learning problems. We also discuss the ways in which Gaifman models relate to some existing relational machine learning approaches.
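A minimal sketch of the bounded-neighborhood idea behind Gaifman models, on an invented toy knowledge base (not the authors' implementation): gather the radius-$r$ neighborhood of an entity, then subsample it to a fixed size so learning stays tractable and robust to missing facts.

```python
import random
from collections import deque

def sample_neighborhood(adj, root, radius=1, max_size=4, rng=None):
    """Sample a bounded-size r-neighborhood of `root` in a knowledge graph
    given as an adjacency dict. Bounding depth and size keeps inference
    and learning tractable; subsampling makes the model robust to missing
    objects and relations. Illustrative sketch."""
    rng = rng or random.Random(0)
    seen, frontier = {root}, deque([(root, 0)])
    while frontier:                       # breadth-first up to `radius` hops
        node, d = frontier.popleft()
        if d == radius:
            continue
        for nb in adj.get(node, []):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, d + 1))
    pool = sorted(seen - {root})
    return [root] + rng.sample(pool, min(max_size - 1, len(pool)))

# Tiny KB: entities linked by relations (edges), hypothetical names.
adj = {
    "london": ["uk", "thames", "heathrow"],
    "uk": ["london", "europe"],
    "thames": ["london"],
    "heathrow": ["london"],
    "europe": ["uk"],
}
hood = sample_neighborhood(adj, "london", radius=1, max_size=3)
```

A discriminative model would then score candidate facts from features of many such sampled neighborhoods rather than from the full knowledge base.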


Hassan Took a Bike Ride. Now He's One of the Thousands Missing in Gaza

WIRED

In a place denied access to basic forensic technology, and where people disappear into Israeli detention, the fate of thousands remains unknown. One of them is an autistic teenager. In the early morning dark, Abeer Skaik turned to her husband, Ali Al-Qatta, and said that today would be the day they would find their son. Ali nodded in silence, and she handed him the stack of flyers. Each bore a photograph of 16-year-old Hassan smiling widely, his shoulders loose, wearing a plain red T-shirt. He is looking directly at the camera, unguarded. At the top of the page, in large letters, Abeer had written a single word in bold red ink: an appeal. Abeer watched as Ali stepped into a car with a few close friends and drove away. They started the 30-kilometer trip south, from al-Tuffah, east of Gaza City, to the European Hospital in Khan Younis. They had heard that a group of people detained by Israel, including children, would be released there. The gate was already crowded. Families stood shoulder to shoulder, wrapped in blankets against the cold, clutching photographs and ID cards. Ali distributed the flyers among his friends. When the buses of released detainees arrived, he and the others moved slowly through the narrow gaps between clusters of people. Some of those who had just been released were being pulled into embraces. Ali waited at the edge of each reunion. "Have you seen my son?" he asked. One after another, people shook their heads.


Two Literal Crypto Bros Built a Real Estate Empire. Then the Homes Started to Fall Apart

WIRED

In 2019, two Canadian brothers blew into Detroit with an irresistible pitch: For $50, almost anyone could become a property owner. When houses decayed and the city intervened, the blame games began. A fire broke out at 10410 Cadieux in March 2025, burning a hole in the roof. The smell hit me first: damp brick, stagnant water, mold, and bleach. I was partway down a flight of wooden stairs that led to the basement of a 1920s duplex in east Detroit, Michigan. Leading the way was Cornell Dorris, a tenant in the building for nearly a decade. Dorris is in his early forties, has two daughters who visit on weekends, and makes a living smoking meat and cooking for events. As my eyes adjusted, I made out rodent droppings and a black puddle that spread across the basement floor. "Anytime it rains, the water comes down," Dorris said. The air was unnaturally heavy, and I felt a nagging urge to leave. Dorris doesn't have a typical landlord. Almost four years ago, his building was acquired by a startup called RealToken, or RealT.



Non-Local Recurrent Network for Image Restoration

Neural Information Processing Systems

Many classic methods have shown non-local self-similarity in natural images to be an effective prior for image restoration. However, it remains unclear and challenging to make use of this intrinsic property via deep networks. In this paper, we propose a non-local recurrent network (NLRN) as the first attempt to incorporate non-local operations into a recurrent neural network (RNN) for image restoration. The main contributions of this work are: (1) Unlike existing methods that measure self-similarity in an isolated manner, the proposed non-local module can be flexibly integrated into existing deep networks for end-to-end training to capture deep feature correlation between each location and its neighborhood.
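A minimal numpy sketch of a generic non-local operation (embedded-Gaussian similarity over all spatial locations), to make the idea concrete; the shapes, weights, and data are invented, and this is the generic non-local block rather than the NLRN recurrent module itself.

```python
import numpy as np

def non_local(x, W_theta, W_phi, W_g):
    """Non-local operation over all locations of a feature map x with
    shape (N, C): each output position aggregates features from every
    other position, weighted by pairwise similarity (embedded-Gaussian
    form). Minimal illustrative sketch."""
    theta, phi, g = x @ W_theta, x @ W_phi, x @ W_g
    logits = theta @ phi.T                       # (N, N) pairwise similarity
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the softmax
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)      # rows sum to 1
    return attn @ g                              # weighted sum over locations

rng = np.random.default_rng(3)
N, C, D = 16, 8, 4            # 16 locations (e.g. a 4x4 patch), 8 channels
x = rng.normal(size=(N, C))
y = non_local(x,
              rng.normal(size=(C, D)),
              rng.normal(size=(C, D)),
              rng.normal(size=(C, D)))
```

This is the sense in which the module measures self-similarity jointly rather than in an isolated manner: every output location depends on correlations with the entire neighborhood at once.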


KONG: Kernels for ordered-neighborhood graphs

Neural Information Processing Systems

We present novel graph kernels for graphs with node and edge labels that have ordered neighborhoods, i.e. when neighbor nodes follow an order. Graphs with ordered neighborhoods are a natural data representation for evolving graphs where edges are created over time, which induces an order.
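A toy sketch of a kernel that is sensitive to neighbor order (invented labels and graphs, not the KONG construction itself): represent each node's ordered neighbor labels as a string and count label $k$-grams, so that reversing the arrival order of edges changes the features and hence the kernel value.

```python
from collections import Counter

def ngram_features(graph, labels, k=2):
    """Bag of label k-grams over each node's ordered neighbor sequence.
    `graph` maps node -> neighbors in their given (e.g. temporal) order;
    the order matters, unlike in unordered neighborhood-aggregation
    kernels. Illustrative sketch of an ordered-neighborhood kernel."""
    feats = Counter()
    for node, nbrs in graph.items():
        s = "".join(labels[n] for n in nbrs)
        for i in range(len(s) - k + 1):
            feats[s[i:i + k]] += 1
    return feats

def kernel(f1, f2):
    """Inner product of the k-gram count vectors."""
    return sum(c * f2[g] for g, c in f1.items())

labels = {0: "a", 1: "b", 2: "c", 3: "a"}
g1 = {0: [1, 2, 3]}   # neighbors of node 0 arrived in order b, c, a
g2 = {0: [1, 2, 3]}   # same order: shares the 2-grams "bc", "ca"
g3 = {0: [3, 2, 1]}   # reversed order: 2-grams "ac", "cb" instead
k12 = kernel(ngram_features(g1, labels), ngram_features(g2, labels))
k13 = kernel(ngram_features(g1, labels), ngram_features(g3, labels))
```

Here g1 and g2 have identical ordered neighborhoods and so a positive kernel value, while g3 contains the same nodes in reversed order and shares no 2-grams with g1.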