Computation-Utility-Privacy Tradeoffs in Bayesian Estimation
Chen, Sitan; Ding, Jingqiu; Majid, Mahbod; McKelvie, Walter
Bayesian methods lie at the heart of modern data science and provide a powerful scaffolding for estimation in data-constrained settings and principled quantification and propagation of uncertainty. Yet in many real-world use cases where these methods are deployed, there is a natural need to preserve the privacy of the individuals whose data is being scrutinized. While a number of works have attempted to approach the problem of differentially private Bayesian estimation through either reasoning about the inherent privacy of the posterior distribution or privatizing off-the-shelf Bayesian methods, these works generally do not come with rigorous utility guarantees beyond low-dimensional settings. In fact, even for the prototypical tasks of Gaussian mean estimation and linear regression, it was unknown how close one could get to the Bayes-optimal error with a private algorithm, even in the simplest case where the unknown parameter comes from a Gaussian prior. In this work, we give the first efficient algorithms for both of these problems that achieve mean-squared error $(1+o(1))\mathrm{OPT}$ and additionally show that both tasks exhibit an intriguing computational-statistical gap. For Bayesian mean estimation, we prove that the excess risk achieved by our method is optimal among all efficient algorithms within the low-degree framework, yet is provably worse than what is achievable by an exponential-time algorithm. For linear regression, we prove a qualitatively similar lower bound. Our algorithms draw upon the privacy-to-robustness framework of arXiv:2212.05015, but with the curious twist that to achieve private Bayes-optimal estimation, we need to design sum-of-squares-based robust estimators for inherently non-robust objects like the empirical mean and OLS estimator. Along the way we also add to the sum-of-squares toolkit a new kind of constraint based on short-flat decompositions.
- Europe (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
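For orientation on the Bayes-optimal benchmark in the abstract above: in the simplest case it mentions, a Gaussian prior with Gaussian observations, conjugacy gives the posterior in closed form. The identity-covariance normalization below is our assumption for concreteness, and OPT here denotes the non-private Bayes risk:

```latex
% Prior \theta \sim N(0, \sigma_0^2 I_d); data x_1, ..., x_n | \theta \sim N(\theta, I_d).
\theta \mid x_1, \dots, x_n \;\sim\;
\mathcal{N}\!\left( \frac{n\sigma_0^2}{n\sigma_0^2 + 1}\,\bar{x},\;
\frac{\sigma_0^2}{n\sigma_0^2 + 1}\, I_d \right),
\qquad \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i .
% The posterior mean is the Bayes-optimal (non-private) estimator,
% with mean-squared error
\mathrm{OPT}_{\text{non-private}} \;=\; \frac{d\,\sigma_0^2}{n\sigma_0^2 + 1}.
```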
Sparse Deep Learning: A New Framework Immune to Local Traps and Miscalibration
$D_n) \to 1$ as $n \to \infty$, which means that most of the posterior mass falls in a neighbourhood of the true parameter. Remark on the notation: $\nu(\cdot)$ is similar to the $\nu(\cdot)$ defined in Section 2.1 of the main text. The notations we use in this proof are the same as in the proof of Theorem 2.1. Theorem 2.2 implies that a faithful prediction interval can be constructed for the sparse neural network learned by the proposed algorithms. In practice, for a normal regression problem with noise $N(0,\sigma^2)$, to construct the prediction interval for a test point $x_0$, the terms $\sigma^2$ and $\Sigma = \nabla_\gamma \mu(\beta, x_0)^\top H^{-1} \nabla_\gamma \mu(\beta, x_0)$ in Theorem 2.2 need to be estimated from the data.
- Europe > Austria > Vienna (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- (10 more...)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
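The last sentence of the excerpt above can be made concrete with a small sketch: once estimates of $\sigma^2$ and $\Sigma$ are in hand, the interval is the usual Gaussian one. All names here are hypothetical, and the plug-in form $\hat\mu \pm z\sqrt{\hat\sigma^2 + \hat\Sigma}$ is our reading of Theorem 2.2, not a quote of it:

```python
import math

def prediction_interval(mu_hat, sigma2_hat, Sigma_hat, z=1.96):
    """Gaussian prediction interval mu_hat +/- z * sqrt(sigma2_hat + Sigma_hat).

    sigma2_hat: estimated observation-noise variance sigma^2.
    Sigma_hat:  estimated model-uncertainty term grad^T H^{-1} grad at x0.
    z:          standard-normal quantile (1.96 for a ~95% interval).
    """
    half_width = z * math.sqrt(sigma2_hat + Sigma_hat)
    return mu_hat - half_width, mu_hat + half_width

# Hypothetical estimates at a test point x0:
lo, hi = prediction_interval(mu_hat=2.0, sigma2_hat=0.25, Sigma_hat=0.11)
# half-width = 1.96 * sqrt(0.36) = 1.176, so the interval is (0.824, 3.176)
```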
$\int P(x,y)\,dy = 1$ for all $x \in K$. We define the following states, which capture key properties of the quantum walk: $|\psi_x\rangle := \int \dots$
In this section, we first define the quantum walk operators and introduce some spectral properties. Then, the eigenvalues of $W$ are $\left\{\pm 1,\ \lambda_j \pm i\sqrt{1-\lambda_j^2}\right\}$. Let $\{(\lambda_i, f_i)\}$ be the set of eigenvalues and eigenfunctions of $P$, and $|\psi_i\rangle$ the eigenvectors of the corresponding quantum walk operator $W$. Let $\rho_0$ be a probability density that is a warm start for $\rho$ and mixes up to TV-distance $\epsilon$ in $t_\epsilon$ steps of $M$. Furthermore, assume that $\|\rho/\rho_0\| = R$.
Theorem 5 (Quantum walk implementation cost). Let $M_0, M_1$ be two ergodic reversible Markov chains with stationary distributions $\pi_0, \pi_1$, respectively. Suppose $\pi_0$ is $\beta_0$-warm with respect to $M_1$ and mixes up to total variation distance $\epsilon$ in $t_0(\epsilon)$ steps.
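The eigenvalue pairing for the walk operator, each chain eigenvalue $\lambda_j$ lifting to $\lambda_j \pm i\sqrt{1-\lambda_j^2}$, can be sanity-checked numerically: both lifts lie on the unit circle with phase $\pm\arccos\lambda_j$.

```python
import cmath
import math

# For each eigenvalue lam of the reversible chain P (lam in [-1, 1]),
# the quantum walk operator contributes the pair lam +/- i*sqrt(1 - lam^2).
for lam in [-1.0, -0.3, 0.0, 0.5, 0.9, 1.0]:
    for sign in (+1, -1):
        mu = complex(lam, sign * math.sqrt(1.0 - lam * lam))
        assert abs(abs(mu) - 1.0) < 1e-12                       # unit modulus
        assert abs(cmath.phase(mu) - sign * math.acos(lam)) < 1e-12  # phase +/- arccos(lam)
```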
The empirical median for estimating the common mean of heteroscedastic random variables
We study the problem of mean estimation in the heteroscedastic setting. In particular, we consider symmetric random variables that share a common location parameter but have different, unknown scale parameters. Our goal is to estimate the unknown common location parameter. Although elementary, this problem has received little attention, since it is usually assumed that the random variables are independent and identically distributed. In this paper, we study the median estimator and establish upper and lower bounds on its estimation error that are of the same order and that generalize and improve recent results of Devroye et al. and Xia.
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- North America > United States > California (0.04)
- Europe > France (0.04)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Data Science > Data Mining (0.49)
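As a quick illustration of the estimator studied above, here is the empirical median applied to symmetric observations with a shared location but very different scales. This is a minimal sketch; the particular distributions, scales, and sample size are made up:

```python
import random
import statistics

# Symmetric samples with a common location theta but different, "unknown"
# scales -- the heteroscedastic setting of the abstract.
random.seed(0)
theta = 3.0
scales = [0.1, 0.5, 1.0, 5.0, 20.0] * 200      # 1000 observations
samples = [random.gauss(theta, s) for s in scales]

theta_hat = statistics.median(samples)          # the empirical median estimator
assert abs(theta_hat - theta) < 0.5             # close to the common location
```

Note how the well-concentrated observations pull the median toward the common location even though the noisiest observations have a scale 200 times larger.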
Robust Gradient Descent for Phase Retrieval
Buna, Alex; Rebeschini, Patrick
Recent progress in robust statistical learning has mainly tackled convex problems, such as mean estimation or linear regression, with non-convex challenges receiving less attention. Phase retrieval exemplifies such a non-convex problem: it requires recovering a signal from only the magnitudes of its linear measurements, without phase (sign) information. While several non-convex methods, especially those based on the Wirtinger Flow algorithm, have been proposed for noiseless or mildly noisy settings, developing solutions for heavy-tailed noise and adversarial corruption remains an open challenge. In this paper, we investigate an approach that leverages robust gradient descent techniques to improve the Wirtinger Flow algorithm's ability to simultaneously cope with noise with bounded fourth moments and adversarial contamination in both the inputs (covariates) and outputs (responses). We address two scenarios: known zero-mean noise and completely unknown noise. For the latter, we propose a preprocessing step that transforms the problem into a form that no longer fits traditional phase retrieval approaches but can still be solved with a tailored version of the algorithm for the zero-mean noise setting.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report (0.64)
- Workflow (0.49)
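For background on the Wirtinger Flow algorithm the abstract above starts from, here is a minimal noiseless, real-valued sketch: spectral initialization followed by plain gradient descent on the standard fourth-order loss. This is the vanilla method, not the robust variant proposed in the paper, and the problem sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 400
x_star = rng.standard_normal(n)
x_star /= np.linalg.norm(x_star)             # unit-norm planted signal
A = rng.standard_normal((m, n))              # Gaussian measurement vectors
y = (A @ x_star) ** 2                        # magnitude-only observations

# Spectral initialization: leading eigenvector of (1/m) sum_i y_i a_i a_i^T,
# scaled by the norm estimate sqrt(mean(y)).
Y = (A.T * y) @ A / m
eigvals, eigvecs = np.linalg.eigh(Y)         # eigenvalues in ascending order
x = eigvecs[:, -1] * np.sqrt(y.mean())

# Gradient steps on f(x) = (1/4m) sum_i ((a_i^T x)^2 - y_i)^2, whose gradient
# is (1/m) sum_i ((a_i^T x)^2 - y_i) (a_i^T x) a_i.
step = 0.05
for _ in range(1000):
    r = (A @ x) ** 2 - y
    x -= step * (A.T @ (r * (A @ x))) / m

# The signal is only identifiable up to a global sign.
err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
assert err < 1e-2
```

The robust setting in the abstract would replace the averaged gradient above with a robust aggregate, which is exactly where the heavy-tailed and adversarial cases become delicate.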
Recent Advances in Non-convex Smoothness Conditions and Applicability to Deep Linear Neural Networks
Patel, Vivak; Varner, Christian
The presence of non-convexity in smooth optimization problems arising from deep learning has sparked new smoothness conditions in the literature and corresponding convergence analyses. We discuss these smoothness conditions, order them, provide conditions for determining whether they hold, and evaluate their applicability to training a deep linear neural network for binary classification.
- North America > United States > Wisconsin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Hampshire > Hillsborough County > Nashua (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
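One example of such a non-standard smoothness condition, an assumption on our part since the abstract does not list the conditions it orders, is the $(L_0, L_1)$-smoothness condition from the gradient-clipping literature, which relaxes the classical bound:

```latex
% Classical L-smoothness bounds the curvature uniformly:
\|\nabla^2 f(x)\| \le L \quad \text{for all } x,
% whereas (L_0, L_1)-smoothness lets curvature grow with the gradient norm:
\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\|.
% Every L-smooth function is (L, 0)-smooth, so the second condition is
% strictly weaker, and it admits objectives such as deep networks whose
% curvature blows up where the gradient is large.
```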