semigroup
Privacy Amplification by Mixing and Diffusion Mechanisms
Borja Balle, Gilles Barthe, Marco Gaboardi, Joseph Geumlek
A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of uniform mixing properties of the Markov process defined by said operator. Next we provide amplification bounds in terms of coupling arguments which can be applied in cases where uniform mixing is not available. Finally, we introduce a new family of mechanisms based on diffusion processes which are closed under post-processing, and analyze their privacy via a novel heat flow argument. On the applied side, we generalize the analysis of "privacy amplification by iteration" in Noisy SGD and show it admits an exponential improvement in the strongly convex case, and study a mechanism based on the Ornstein-Uhlenbeck diffusion process which contains the Gaussian mechanism with optimal post-processing on bounded inputs as a special case.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (3 more...)
Thompson sampling: Precise arm-pull dynamics and adaptive inference
Adaptive sampling schemes are well known to create complex dependence that may invalidate conventional inference methods. A recent line of work shows that this need not be the case for UCB-type algorithms in multi-armed bandits. A central emerging theme is a `stability' property with asymptotically deterministic arm-pull counts in these algorithms, making inference as easy as in the i.i.d. setting. In this paper, we study the precise arm-pull dynamics in another canonical class of Thompson-sampling type algorithms. We show that the phenomenology is qualitatively different: the arm-pull count is asymptotically deterministic if and only if the arm is suboptimal or is the unique optimal arm; otherwise it converges in distribution to the unique invariant law of an SDE. This dichotomy uncovers a unifying principle behind many existing (in)stability results: an arm is stable if and only if its interaction with statistical noise is asymptotically negligible. As an application, we show that normalized arm means obey the same dichotomy, with Gaussian limits for stable arms and a semi-universal, non-Gaussian limit for unstable arms. This not only enables the construction of confidence intervals for the unknown mean rewards despite non-normality, but also reveals the potential of developing tractable inference procedures beyond the stable regime. The proofs rely on two new approaches. For suboptimal arms, we develop an `inverse process' approach that characterizes the inverse of the arm-pull count process via a Stieltjes integral. For optimal arms, we adopt a reparametrization of the arm-pull and noise processes that reduces the singularity in the natural SDE to proving the uniqueness of the invariant law of another SDE. We prove the latter by a set of analytic tools, including the parabolic Hörmander condition and the Stroock-Varadhan support theorem.
- North America > United States > California > Alameda County > Berkeley (0.27)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (5 more...)
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.45)
A Hybrid Framework for Healing Semigroups with Machine Learning
Sirikonda, Sarayu, van de Kreeke, Jasper
In this paper, we propose a hybrid framework that heals corrupted finite semigroups, combining deterministic repair strategies with Machine Learning using a Random Forest Classifier. Corruption in these tables breaks associativity and invalidates the algebraic structure. Deterministic methods work for small cardinality n and low corruption but degrade rapidly. Our experiments, carried out on Mace4-generated data sets, demonstrate that our hybrid framework achieves higher healing rates than deterministic-only and ML-only baselines. At a corruption percentage of p=15%, our framework healed 95% of semigroups up to cardinality n=6 and 60% at n=10.
Prediction error certification for PINNs: Theory, computation, and application to Stokes flow
Hillebrecht, Birgit, Unger, Benjamin
Rigorous error estimation is a fundamental topic in numerical analysis. With the increasing use of physics-informed neural networks (PINNs) for solving partial differential equations, several approaches have been developed to quantify the associated prediction error. In this work, we build upon a semigroup-based framework previously introduced by the authors for estimating the PINN error. While this estimator has so far been limited to academic examples - due to the need to compute quantities related to input-to-state stability - we extend its applicability to a significantly broader class of problems. This is accomplished by modifying the error bound and proposing numerical strategies to approximate the required stability parameters. The extended framework enables the certification of PINN predictions in more realistic scenarios, as demonstrated by a numerical study of Stokes flow around a cylinder.
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- (2 more...)
Generalization Through Growth: Hidden Dynamics Controls Depth Dependence
Sonoda, Sho, Hashimoto, Yuka, Ishikawa, Isao, Ikeda, Masahiro
Recent theory has reduced the depth dependence of generalization bounds from exponential to polynomial and even depth-independent rates, yet these results remain tied to specific architectures and Euclidean inputs. We present a unified framework for arbitrary \blue{pseudo-metric} spaces in which a depth-\(k\) network is the composition of continuous hidden maps \(f:\mathcal{X}\to \mathcal{X}\) and an output map \(h:\mathcal{X}\to \mathbb{R}\). The resulting bound $O(\sqrt{(α+ \log β(k))/n})$ isolates the sole depth contribution in \(β(k)\), the word-ball growth of the semigroup generated by the hidden layers. By Gromov's theorem polynomial (resp. exponential) growth corresponds to virtually nilpotent (resp. expanding) dynamics, revealing a geometric dichotomy behind existing $O(\sqrt{k})$ (sublinear depth) and $\tilde{O}(1)$ (depth-independent) rates. We further provide covering-number estimates showing that expanding dynamics yield an exponential parameter saving via compositional expressivity. Our results decouple specification from implementation, offering architecture-agnostic and dynamical-systems-aware guarantees applicable to modern deep-learning paradigms such as test-time inference and diffusion models.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- (2 more...)
A Malliavin-Gamma calculus approach to Score Based Diffusion Generative models for random fields
We adopt a Gamma and Malliavin Calculi point of view in order to generalize Score-based diffusion Generative Models (SGMs) to an infinite-dimensional abstract Hilbertian setting. Particularly, we define the forward noising process using Dirichlet forms associated to the Cameron-Martin space of Gaussian measures and Wiener chaoses; whereas by relying on an abstract time-reversal formula, we show that the score function is a Malliavin derivative and it corresponds to a conditional expectation. This allows us to generalize SGMs to the infinite-dimensional setting. Moreover, we extend existing finite-dimensional entropic convergence bounds to this Hilbertian setting by highlighting the role played by the Cameron-Martin norm in the Fisher information of the data distribution. Lastly, we specify our discussion for spherical random fields, considering as source of noise a Whittle-Matérn random spherical field.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- (3 more...)
Understanding the Generalization Error of Markov algorithms through Poissonization
Dupuis, Benjamin, Haddouche, Maxime, Deligiannidis, George, Simsekli, Umut
Using continuous-time stochastic differential equation (SDE) proxies to stochastic optimization algorithms has proven fruitful for understanding their generalization abilities. A significant part of these approaches are based on the so-called ``entropy flows'', which greatly simplify the generalization analysis. Unfortunately, such well-structured entropy flows cannot be obtained for most discrete-time algorithms, and the existing SDE approaches remain limited to specific noise and algorithmic structures. We aim to alleviate this issue by introducing a generic framework for analyzing the generalization error of Markov algorithms through `Poissonization', a continuous-time approximation of discrete-time processes with formal approximation guarantees. Through this approach, we first develop a novel entropy flow, which directly leads to PAC-Bayesian generalization bounds. We then draw novel links to modified versions of the celebrated logarithmic Sobolev inequalities (LSI), identify cases where such LSIs are satisfied, and obtain improved bounds. Beyond its generality, our framework allows exploiting specific properties of learning algorithms. In particular, we incorporate the noise structure of different algorithm types - namely, those with additional noise injections (noisy) and those without (non-noisy) - through various technical tools. This illustrates the capacity of our methods to achieve known (yet, Poissonized) and new generalization bounds.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- (2 more...)
Noncommutative Model Selection and the Data-Driven Estimation of Real Cohomology Groups
Guzmán-Tristán, Araceli, Rieser, Antonio, Velázquez-Richards, Eduardo
We propose three completely data-driven methods for estimating the real cohomology groups $H^k (X ; \mathbb{R})$ of a compact metric-measure space $(X, d_X, \mu_X)$ embedded in a metric-measure space $(Y,d_Y,\mu_Y)$, given a finite set of points $S$ sampled from a uniform distrbution $\mu_X$ on $X$, possibly corrupted with noise from $Y$. We present the results of several computational experiments in the case that $X$ is embedded in $\mathbb{R}^n$, where two of the three algorithms performed well.
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > Mexico > Guanajuato (0.04)
- North America > United States > South Carolina > Richland County > Columbia (0.04)
- (7 more...)
Unbalanced Diffusion Schr\"odinger Bridge
Pariset, Matteo, Hsieh, Ya-Ping, Bunne, Charlotte, Krause, Andreas, De Bortoli, Valentin
Schr\"odinger bridges (SBs) provide an elegant framework for modeling the temporal evolution of populations in physical, chemical, or biological systems. Such natural processes are commonly subject to changes in population size over time due to the emergence of new species or birth and death events. However, existing neural parameterizations of SBs such as diffusion Schr\"odinger bridges (DSBs) are restricted to settings in which the endpoints of the stochastic process are both probability measures and assume conservation of mass constraints. To address this limitation, we introduce unbalanced DSBs which model the temporal evolution of marginals with arbitrary finite mass. This is achieved by deriving the time reversal of stochastic differential equations with killing and birth terms. We present two novel algorithmic schemes that comprise a scalable objective function for training unbalanced DSBs and provide a theoretical analysis alongside challenging applications on predicting heterogeneous molecular single-cell responses to various cancer drugs and simulating the emergence and spread of new viral variants.
- North America > United States (0.27)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > India (0.04)
Expressiveness Remarks for Denoising Diffusion Models and Samplers
Vargas, Francisco, Reu, Teodora, Kerekes, Anna
Denoising diffusion models are a class of generative models which have recently achieved state-of-the-art results across many domains. Gradual noise is added to the data using a diffusion process, which transforms the data distribution into a Gaussian. Samples from the generative model are then obtained by simulating an approximation of the time reversal of this diffusion initialized by Gaussian samples. Recent research has explored adapting diffusion models for sampling and inference tasks. In this paper, we leverage known connections to stochastic control akin to the F\"ollmer drift to extend established neural network approximation results for the F\"ollmer drift to denoising diffusion models and samplers.