Goto

Collaborating Authors

 mathrm


On Agnostic PAC Learning in the Small Error Regime

Neural Information Processing Systems

Binary classification in the classic PAC model exhibits a curious phenomenon: Empirical Risk Minimization (ERM) learners are suboptimal in the realizable case yet optimal in the agnostic case. Roughly speaking, this owes itself to the fact that non-realizable distributions $\\mathcal{D}$ are more difficult to learn than realizable distributions -- even when one discounts a learner's error by $\\mathrm{err}(h^\\ast_\\mathcal{D})$, i.e., the error of the best hypothesis in $\\mathcal{H}$. Thus, optimal agnostic learners are permitted to incur excess error on (easier-to-learn) distributions $\\mathcal{D}$ for which $\\tau = \\mathrm{err}(h^\\ast_\\mathcal{D})$ is small.


MIND: Material Interface Generation from UDFs for Non-Manifold Surface Reconstruction

Neural Information Processing Systems

Unsigned distance fields (UDFs) are widely used in 3D deep learning due to their ability to represent shapes with arbitrary topology. While prior work has largely focused on learning UDFs from point clouds or multi-view images, extracting meshes from UDFs remains challenging, as the learned fields rarely attain exact zero distances. A common workaround is to reconstruct signed distance fields (SDFs) locally from UDFs to enable surface extraction via Marching Cubes. However, this often introduces topological artifacts such as holes or spurious components. Moreover, local SDFs are inherently incapable of representing non-manifold geometry, leading to complete failure in such cases.


Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference

Neural Information Processing Systems

We prove that, given a mean-field location-scale variational family, black-box variational inference (BBVI) with the reparametrization gradient converges at a rate that is nearly independent of explicit dimension dependence. Specifically, for a $d$-dimensional strongly log-concave and log-smooth target, the number of iterations for BBVI with a sub-Gaussian family to obtain a solution $\epsilon$-close to the global optimum has a dimension dependence of $\mathrm{O}(\log d)$. This is a significant improvement over the $\mathrm{O}(d)$ dependence of full-rank location-scale families. For heavy-tailed families, we prove a weaker $\mathrm{O}(d^{2/k})$ dependence, where $k$ is the number of finite moments of the family. Additionally, if the Hessian of the target log-density is constant, the complexity is free of any explicit dimension dependence. We also prove that our bound on the gradient variance, which is key to our result, cannot be improved using only spectral bounds on the Hessian of the target log-density.



INC: An Indirect Neural Corrector for Auto-Regressive Hybrid PDE Solvers

Neural Information Processing Systems

When simulating partial differential equations, hybrid solvers combine coarse numerical solvers with learned correctors. They promise accelerated simulations while adhering to physical constraints. However, as shown in our theoretical framework, directly applying learned corrections to solver outputs leads to significant autoregressive errors, which originate from amplified perturbations that accumulate during long-term rollouts, especially in chaotic regimes. To overcome this, we propose the Indirect Neural Corrector ($\mathrm{INC}$), which integrates learned corrections into the governing equations rather than applying direct state updates. Our key insight is that $\mathrm{INC}$ reduces the error amplification on the order of $\Delta t^{-1} + L$, where $\Delta t$ is the timestep and $L$ the Lipschitz constant. At the same time, our framework poses no architectural requirements and integrates seamlessly with arbitrary neural networks and solvers. We test $\mathrm{INC}$ in extensive benchmarks, covering numerous differentiable solvers, neural backbones, and test cases ranging from a 1D chaotic system to 3D turbulence. INC improves the long-term trajectory performance ($R^2$) by up to 158.7\%, stabilizes blowups under aggressive coarsening, and for complex 3D turbulence cases yields speed-ups of several orders of magnitude. INC thus enables stable, efficient PDE emulation with formal error reduction, paving the way for faster scientific and engineering simulations with reliable physics guarantees.


Provable Gradient Editing of Deep Neural Networks

Neural Information Processing Systems

In explainable AI, DNN gradients are used to interpret the prediction; in safety-critical control systems, gradients could encode safety constraints; in scientific-computing applications, gradients could encode physical invariants. While recent work on provable editing of DNNs has focused on input-output constraints, the problem of enforcing hard constraints on DNN gradients remains unaddressed. We present ProGrad, the first efficient approach for editing the parameters of a DNN to provably enforce hard constraints on the DNN gradients.


Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm

Neural Information Processing Systems

This paper investigates infinite-horizon average reward Constrained Markov Decision Processes (CMDPs) under general parametrized policies with smooth and bounded policy gradients. We propose a Primal-Dual Natural Actor-Critic algorithm that adeptly manages constraints while ensuring a high convergence rate. In particular, our algorithm achieves global convergence and constraint violation rates of $\tilde{\mathcal{O}}(1/\sqrt{T})$ over a horizon of length $T$ when the mixing time, $\tau_{\mathrm{mix}}$, is known to the learner.


Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning

Neural Information Processing Systems

Motivated by practical applications where stable long-term performance is critical--such as robotics, operations research, and healthcare--we study the problem of distributionally robust (DR) average-reward reinforcement learning. We propose two algorithms that achieve near-optimal sample complexity. The first reduces the problem to a DR discounted Markov decision process (MDP), while the second, Anchored DR Average-Reward MDP, introduces an anchoring state to stabilize the controlled transition kernels within the uncertainty set. Assuming the nominal MDP is uniformly ergodic, we prove that both algorithms attain a sample complexity of $\widetilde{O}\left(|\mathbf{S}||\mathbf{A}| t_{\mathrm{mix}}^2\varepsilon^{-2}\right)$ for estimating the optimal policy as well as the robust average reward under KL and $f_k$-divergence-based uncertainty sets, provided the uncertainty radius is sufficiently small. Here, $\varepsilon$ is the target accuracy, $|\mathbf{S}|$ and $|\mathbf{A}|$ denote the sizes of the state and action spaces, and $t_{\mathrm{mix}}$ is the mixing time of the nominal MDP. This represents the first finite-sample convergence guarantee for DR average-reward reinforcement learning.


Tradeoffs between Mistakes and ERM Oracle Calls in Online and Transductive Online Learning

Neural Information Processing Systems

We study online and transductive online learning in settings where the learner can interact with the concept class only via Empirical Risk Minimization (ERM) or weak consistency oracles on arbitrary subsets of the instance domain. This contrasts with standard online models, where the learner has full knowledge of the concept class. The ERM oracle returns a hypothesis that minimizes the loss on a given subset, while the weak consistency oracle returns only a binary signal indicating whether the subset is realizable by a concept in the class. The learner's performance is measured by the number of mistakes and oracle calls. In the standard online setting with ERM access, we establish tight lower bounds in both the realizable and agnostic cases: $\Omega(2^{d_\mathrm{LD}})$ mistakes and $\Omega(\sqrt{T 2^{d_\mathrm{LD}}})$ regret, respectively, where $T$ is the number of timesteps and $d_\mathrm{LD}$ is the Littlestone dimension of the class. We further show how existing results for online learning with ERM access translate to the setting with a weak consistency oracle, at the cost of increasing the number of oracle calls by $O(T)$. We then consider the transductive online model, where the instance sequence is known in advance but labels are revealed sequentially. For general Littlestone classes, we show that the optimal mistake bound in the realizable case and in the agnostic case can be achieved using $O(T^{d_\mathrm{VC}+1})$ weak consistency oracle calls, where $d_\mathrm{VC}$ is the VC dimension of the class. On the negative side, we show that $\Omega(T)$ weak consistency queries are necessary for transductive online learnability, and that $\Omega(T)$ ERM queries are necessary to avoid exponential dependence on the Littlestone dimension.


Balancing Positive and Negative Classification Error Rates in Positive-Unlabeled Learning

Neural Information Processing Systems

Positive and Unlabeled (PU) learning is a special case of binary classification with weak supervision, where only positive labeled and unlabeled data are available. Previous studies suggest several specific risk estimators of PU learning such as non-negative PU (nnPU), which are unbiased and consistent with the expected risk of supervised binary classification. In nnPU, the negative-class empirical risk is estimated by positive labeled and unlabeled data with a non-negativity constraint. However, its negative-class empirical risk estimator approaches 0, so the negative class is over-played, resulting in imbalanced error rates between positive and negative classes. To solve this problem, we suppose that the expected risks of the positive-class and negative-class should be close. Accordingly, we constrain that the negative-class empirical risk estimator is lower bounded by the positive-class empirical risk, instead of 0; and also incorporate an explicit equality constraint between them.