 Constraint-Based Reasoning


Constrained Discrete Diffusion

Neural Information Processing Systems

Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. Beyond their rapidly growing ability to generate coherent natural language, these models present a new and important opportunity to enforce sequence-level constraints, a capability that current autoregressive models cannot natively provide. This paper capitalizes on this opportunity by introducing $\textit{Constrained Discrete Diffusion}$ (CDD), a novel integration of differentiable constraint optimization within the diffusion process to ensure adherence to constraints, logic rules, or safety requirements for generated sequences. Unlike conventional text generators that often rely on post-hoc filtering or model retraining for controllable generation, CDD directly imposes constraints into the discrete diffusion sampling process, resulting in a training-free and effective approach. Experiments in toxicity-controlled text generation, property-constrained molecule design, and instruction-constrained text completion demonstrate that CDD achieves $\textit{zero constraint violations}$ in a diverse array of tasks while preserving fluency, novelty, and coherence, and outperforming autoregressive and existing discrete diffusion approaches.


Data-Dependent Regret Bounds for Constrained MABs

Neural Information Processing Systems

This paper initiates the study of data-dependent regret bounds in constrained MAB settings. These are bounds that depend on the sequence of losses that characterize the problem instance. Thus, in principle they can be much smaller than classical $\widetilde{\mathcal{O}}(\sqrt{T})$ regret bounds, while being equivalent to them in the worst case. Despite this, data-dependent regret bounds have been completely overlooked in constrained MABs. The goal of this paper is to answer the question: Can data-dependent regret bounds be derived in the presence of constraints? We provide an affirmative answer in constrained MABs with adversarial losses and stochastic constraints. Specifically, our main focus is on the most challenging and natural settings with hard constraints, where the learner must ensure that the constraints are always satisfied with high probability. We design an algorithm with a regret bound consisting of two data-dependent terms.


Controlling False Discovery in Arbitrarily Structured Hypothesis Spaces via Reproducing Kernels

arXiv.org Machine Learning

Large-scale hypothesis testing is central to modern science, where controlling the False Discovery Rate (FDR) has become the standard approach to managing false positives across many simultaneous tests. Hypotheses rarely exist in isolation; they often exhibit structure through proximity, connectivity, or hierarchy. This structure represents both a challenge and an opportunity: while classical methods treat these dependencies as obstacles requiring conservative correction, leveraging them can substantially increase discovery power. Here, we reframe structured FDR control as a regularized learning problem. By optimizing within a suitable Reproducing Kernel Hilbert Space (RKHS), we introduce a framework that unifies continuous domains, graphs, and hierarchies under a single algorithm through kernel choice alone. This formulation enables smooth solutions in place of the piecewise-constant fits of prior methods, principled likelihood-based hyperparameter selection rather than heuristic tuning, and inference at unobserved locations which in turn supports sample-efficient experimental design. Building on this estimator, we provide two decision rules which we prove to control the FDR. We validate our method on two sources: spatial locations derived from high-dimensional real-world datasets, and a differential gene expression task utilizing protein-protein interaction graphs.
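For context on the baseline that the abstract above contrasts with, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure. It is not the RKHS-based method the abstract describes. It ignores any structure among hypotheses and controls the FDR at level `alpha` under independence of the p-values. The function name and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Classical BH step-up rule (structure-agnostic baseline, not the
    kernel method from the abstract): sort the p-values, find the largest
    rank k with p_(k) <= alpha * k / m, and reject the k smallest.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff = 0  # number of hypotheses to reject
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank  # keep the largest rank that meets its threshold

    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    print(benjamini_hochberg([0.02, 0.03, 0.04], alpha=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets a later threshold, which is what happens to `0.02` in the usage example.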


Tight Generalization Bounds for Noiseless Inverse Optimization

arXiv.org Machine Learning

Inverse optimization (IO) seeks to infer the parameters of a decision-maker's objective from observed context--action data. We study noiseless IO, where demonstrations are generated by a ground-truth objective. We provide a high-probability ${O}(\frac{d}{T})$ generalization bound for the induced action set, where $d$ is the number of unknown parameters and $T$ is the size of the training dataset. We strengthen these guarantees under additional conditions that ensure uniqueness of the chosen action, bringing our IO guarantees in line with best-arm identification results in the bandit literature. We further show that the ${O}(\frac{d}{T})$ rate is tight over all consistent estimators considered here, and extend the result to both instantaneous and cumulative regret. Notably, the resulting regret lower bound matches the corresponding upper bounds in the adversarial setting, indicating that the stochastic IO setting is effectively adversarial for the class of estimators studied here. Finally, we propose a parameter-free algorithm with lower per-iteration complexity than generic solvers. Experiments validate the predicted rates and illustrate the tightness of our bounds.


Private estimation algorithms for stochastic block models and mixture models

Neural Information Processing Systems

We introduce general tools for designing efficient private estimation algorithms, in the high-dimensional settings, whose statistical guarantees almost match those of the best known non-private algorithms. To illustrate our techniques, we consider two problems: recovery of stochastic block models and learning mixtures of spherical Gaussians. For the former, we present the first efficient $(\varepsilon,\delta)$-differentially private algorithms for both weak recovery and exact recovery. Previously known algorithms achieving comparable guarantees required quasi-polynomial time. We complement these results with an information-theoretic lower bound that highlights how the guarantees of our algorithms are almost tight. For the latter, we design an $(\varepsilon,\delta)$-differentially private algorithm that recovers the centers of the $k$-mixture when the minimum separation is at least $O(k^{1/t}\sqrt{t})$. For all choices of $t$, this algorithm requires sample complexity $n \geq k^{O(1)} d^{O(t)}$ and time complexity $(nd)^{O(t)}$. Prior work required either an additional additive $\Omega(\sqrt{\log n})$ term in the minimum separation or an explicit upper bound on the Euclidean norm of the centers.
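For readers unfamiliar with the privacy notion used above, the standard definition of approximate differential privacy can be stated as follows. The symbols $\mathcal{M}$, $X$, $X'$, and $S$ are notation introduced here for illustration. What counts as a pair of neighboring inputs (for example, graphs differing in one edge, or datasets differing in one sample) depends on the problem and is not specified in the abstract.

```latex
% Standard definition of (\varepsilon,\delta)-differential privacy.
% \mathcal{M}: randomized algorithm; X, X': neighboring inputs;
% S: any measurable set of outputs. Notation is assumed, not from the abstract.
\[
  \Pr\bigl[\mathcal{M}(X) \in S\bigr]
  \;\le\;
  e^{\varepsilon}\,\Pr\bigl[\mathcal{M}(X') \in S\bigr] + \delta
  \qquad \text{for all neighboring } X, X' \text{ and all measurable } S .
\]
```

Smaller $\varepsilon$ and $\delta$ give a stronger guarantee; setting $\delta = 0$ recovers pure $\varepsilon$-differential privacy.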