stability estimate
Rates of Estimation of Optimal Transport Maps using Plug-in Estimators via Barycentric Projections
Optimal transport maps between two probability distributions $\mu$ and $\nu$ on $\R^d$ have found extensive applications in both machine learning and statistics. In practice, these maps need to be estimated from data sampled according to $\mu$ and $\nu$. Plug-in estimators are perhaps the most popular approach to estimating transport maps in computational optimal transport. In this paper, we provide a comprehensive analysis of the rates of convergence for general plug-in estimators defined via barycentric projections. Our main contribution is a new stability estimate for barycentric projections which proceeds under minimal smoothness assumptions and can be used to analyze general plug-in estimators. We illustrate the usefulness of this stability estimate by first providing rates of convergence for the natural discrete-discrete and semi-discrete estimators of optimal transport maps.
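For readers unfamiliar with the term, a barycentric projection turns a coupling between source and target samples into a map by sending each source point to the coupling-weighted average of the target points. The following is a minimal sketch of that standard construction, not the paper's estimator: the function name `barycentric_projection` and the toy data are hypothetical, and the coupling is supplied by hand rather than computed by an optimal transport solver.

```python
def barycentric_projection(coupling, targets):
    """Send source point i to the coupling-weighted mean of the target points.

    coupling[i][j] is the mass moved from source point i to target point j.
    targets[j] is the j-th target point, given as a list of coordinates.
    """
    dim = len(targets[0])
    projected = []
    for row in coupling:
        mass = sum(row)  # total mass the coupling assigns to this source point
        if mass <= 0.0:
            raise ValueError("each source point must carry positive mass")
        # Conditional mean of the target given this source point.
        mean = [sum(w * y[k] for w, y in zip(row, targets)) / mass
                for k in range(dim)]
        projected.append(mean)
    return projected


# Hypothetical toy coupling: 2 source points, 3 target points in the plane.
coupling = [[0.25, 0.25, 0.0],
            [0.0, 0.0, 0.5]]
targets = [[0.0, 0.0], [2.0, 0.0], [1.0, 4.0]]
projected = barycentric_projection(coupling, targets)
```

In a plug-in estimator the coupling would instead come from solving an optimal transport problem between the empirical measures; that solver is outside the scope of this sketch.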
Hessian stability and convergence rates for entropic and Sinkhorn potentials via semiconcavity
Greco, Giacomo, Tamanini, Luca
In this paper we determine quantitative stability bounds for the Hessian of entropic potentials, i.e., the dual solution to the entropic optimal transport problem. To the best of the authors' knowledge, this is the first work addressing this second-order quantitative stability estimate in general unbounded settings. Our proof strategy relies on semiconcavity properties of entropic potentials and on the representation of entropic transport plans as laws of forward and backward diffusion processes, known as Schrödinger bridges. Moreover, our approach allows us to deduce a stochastic proof of quantitative entropic stability estimates and integrated gradient estimates as well. Finally, as a direct consequence of these stability bounds, we deduce exponential convergence rates for the gradient and Hessian of Sinkhorn iterates along Sinkhorn's algorithm, a problem that was still open in unbounded settings. Our rates have a polynomial dependence on the regularization parameter.
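As background for the abstract above, Sinkhorn's algorithm can be sketched in a few lines for discrete marginals. This is a textbook version on a finite cost matrix, not the unbounded setting the paper studies: the function name `sinkhorn`, the iteration count, and the toy inputs are assumptions made for illustration. The entropic potentials are recovered from the scaling vectors as `eps * log(u)` and `eps * log(v)`.

```python
import math


def sinkhorn(cost, a, b, eps, n_iter=200):
    """Textbook Sinkhorn iterations for discrete entropic optimal transport.

    cost[i][j] is the cost of moving mass from source i to target j,
    a and b are the source and target marginals, eps is the regularization.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / eps).
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale to match the row and column marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    # Entropic (Sinkhorn) potentials associated with the scalings.
    f = [eps * math.log(ui) for ui in u]
    g = [eps * math.log(vj) for vj in v]
    return plan, f, g


# Hypothetical toy problem with two source and two target atoms.
cost = [[0.0, 1.0], [1.0, 0.0]]
a = [0.3, 0.7]
b = [0.6, 0.4]
plan, f, g = sinkhorn(cost, a, b, eps=0.5)
```

The column marginals match `b` after every iteration by construction, while the row marginals approach `a` only as the iterations converge.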
Improving Stability Estimates in Adversarial Explainable AI through Alternate Search Methods
Burger, Christopher, Walter, Charles
Advances in the effectiveness of machine learning models have come at the cost of enormous complexity, resulting in a poor understanding of how they function. Local surrogate methods have been used to approximate the workings of these complex models, but recent work has revealed their vulnerability to adversarial attacks where the explanation produced is appreciably different while the meaning and structure of the complex model's output remains similar. This prior work has focused on the existence of these weaknesses but not on their magnitude. Here we explore using an alternate search method with the goal of finding minimum viable perturbations, the fewest perturbations necessary to achieve a fixed similarity value between the explanations of the original and altered text. Intuitively, a method that requires fewer perturbations to expose a given level of instability is inferior to one which requires more. This nuance allows for superior comparisons of the stability of explainability methods.
Physics-Informed Deep Inverse Operator Networks for Solving PDE Inverse Problems
Inverse problems involving partial differential equations (PDEs) can be seen as discovering a mapping from measurement data to unknown quantities, often framed within an operator learning approach. However, existing methods typically rely on large amounts of labeled training data, which is impractical for most real-world applications. Moreover, these supervised models may fail to capture the underlying physical principles accurately. To address these limitations, we propose a novel architecture called Physics-Informed Deep Inverse Operator Networks (PI-DIONs), which can learn the solution operator of PDE-based inverse problems without labeled training data. We extend the stability estimates established in the inverse problem literature to the operator learning framework, thereby providing a robust theoretical foundation for our method. These estimates guarantee that the proposed model, trained on a finite sample and grid, generalizes effectively across the entire domain and function space. Extensive experiments are conducted to demonstrate that PI-DIONs can effectively and accurately learn the solution operators of the inverse problems without the need for labeled data. Deep learning has revolutionized numerous fields, from natural language processing to computer vision, due to its ability to model complex patterns in large datasets (LeCun et al., 2015).
Understanding Influence Functions and Datamodels via Harmonic Analysis
Saunshi, Nikunj, Gupta, Arushi, Braverman, Mark, Arora, Sanjeev
It is often of great interest to quantify how the presence or absence of a particular training data point affects the trained model's performance on test data points. Influence functions are a classical idea for this [Jaeckel, 1972, Hampel, 1974, Cook, 1977] that has recently been adapted to modern deep models and large datasets [Koh and Liang, 2017]. Influence functions have been applied to explain predictions and produce confidence intervals [Schulam and Saria, 2019], investigate model bias [Brunet et al., 2019, Wang et al., 2019], estimate Shapley values [Jia et al., 2019, Ghorbani and Zou, 2019], improve human trust [Zhou et al., 2019], and craft data poisoning attacks [Koh et al., 2019]. Influence actually has different formalizations. The classic calculus-based estimate (henceforth referred to as continuous influence) involves conceptualizing training loss as a weighted sum over training datapoints, where the weighting of a particular datapoint z can be varied infinitesimally.
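The continuous influence described above can be written out as follows. This is the standard upweighting formulation in the style of Koh and Liang [2017], given as a sketch: the symbols $\ell$, $\hat{\theta}$, $\epsilon$, $H_{\hat{\theta}}$ and $z_{\mathrm{test}}$ are notation assumed here rather than taken from the abstract, and the closed form assumes a twice-differentiable, strictly convex training loss.

```latex
% Training loss with datapoint z upweighted by an infinitesimal epsilon.
\hat{\theta}_{\epsilon, z}
  = \arg\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \ell(z_i, \theta)
    + \epsilon \, \ell(z, \theta)

% Continuous influence of z on the loss at a test point z_test,
% where H is the Hessian of the unperturbed training loss at \hat{\theta}.
\mathcal{I}(z, z_{\mathrm{test}})
  = \left. \frac{d}{d\epsilon} \,
      \ell\bigl(z_{\mathrm{test}}, \hat{\theta}_{\epsilon, z}\bigr)
    \right|_{\epsilon = 0}
  = - \nabla_{\theta} \ell(z_{\mathrm{test}}, \hat{\theta})^{\top}
      H_{\hat{\theta}}^{-1} \,
      \nabla_{\theta} \ell(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} \ell(z_i, \hat{\theta})
```

Setting $\epsilon = -1/n$ corresponds to removing $z$ from the training set, so $-\frac{1}{n} \mathcal{I}(z, z_{\mathrm{test}})$ is a first-order approximation of the leave-one-out change in test loss.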