Bayesian Learning
Robust Bayesian Regression via Hard Thresholding Zheyi Fan
By combining robust regression and prior information, we develop an effective robust regression method that can resist adaptive adversarial attacks. Due to the widespread existence of noise and data corruption, it is necessary to recover the true regression parameters when a certain proportion of the response variables have been corrupted. Methods to overcome this problem often involve robust least-squares regression. However, few methods achieve good performance when dealing with severe adaptive adversarial attacks. Based on the combination of prior information and robust regression via hard thresholding from [1], this paper proposes an algorithm that improves the breakdown point when facing adaptive adversarial attacks. Furthermore, to improve the robustness and reduce the estimation error caused by the inclusion of a prior, the idea of Bayesian reweighting is used to construct a more robust algorithm. We prove the theoretical convergence of the proposed algorithms under mild conditions. Extensive experiments show that, under different dataset attacks, our algorithms achieve state-of-the-art results compared with other benchmark algorithms, demonstrating the robustness of the proposed approach.
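The general idea of hard thresholding combined with a prior can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it alternates between a least-squares fit on an active set and keeping the points with the smallest residuals, and it assumes a simple isotropic Gaussian prior that enters as a ridge-like term. The function name `hard_threshold_regression` and the parameters `n_corrupt`, `prior_mean`, and `prior_precision` are hypothetical.

```python
import numpy as np

def hard_threshold_regression(X, y, n_corrupt, n_iter=50,
                              prior_mean=None, prior_precision=0.0):
    """Alternate between a (MAP) least-squares fit and residual thresholding.

    Assumption: an isotropic Gaussian prior N(prior_mean, I / prior_precision)
    on the weights; prior_precision=0 gives plain least squares.
    """
    n, d = X.shape
    keep = n - n_corrupt                      # number of points treated as clean
    if prior_mean is None:
        prior_mean = np.zeros(d)
    active = np.arange(n)                     # start with every point active
    w = np.zeros(d)
    for _ in range(n_iter):
        Xa, ya = X[active], y[active]
        # Regularized normal equations on the current active set.
        A = Xa.T @ Xa + prior_precision * np.eye(d)
        b = Xa.T @ ya + prior_precision * prior_mean
        w = np.linalg.solve(A, b)
        # Hard thresholding: keep the `keep` points with smallest residuals.
        residuals = np.abs(y - X @ w)
        new_active = np.sort(np.argsort(residuals)[:keep])
        if np.array_equal(new_active, active):
            break                             # active set stopped changing
        active = new_active
    return w, active
```

The sketch needs an upper bound on the number of corrupted responses (`n_corrupt`). A larger `prior_precision` pulls the estimate toward `prior_mean`, which helps when the prior is accurate and adds bias when it is not.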
A Standard Maximum Likelihood Estimation and Links to I
In the standard MLE setting [see, e.g., Murphy, 2012, Ch. 9] we are interested in learning the These two definitions are, however, essentially equivalent. Eq. (15) is a smooth objective that can be optimized with a (stochastic) gradient descent procedure. This section contains the proofs of the results relative to the perturb and map section (Section 3.2) and The proposition now follows from arguments made in Papandreou and Yuille [2011] Its moment generating function has the form E[exp(tX)] = Γ(1 − τt). As mentioned in Johnson and Balakrishnan [p. Parts of the proof are inspired by a post on Stack Exchange, Xi'an [2016]. Theorem 1.
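As a small numerical illustration of the perturb-and-MAP idea in its simplest form, the sketch below assumes that the perturbations are Gumbel noise with location 0 and scale τ, which is consistent with the moment generating function Γ(1 − τt) quoted above. It checks that identity by Monte Carlo and shows the Gumbel-max trick, where the argmax of perturbed log-probabilities is a sample from the categorical distribution. The function names are hypothetical.

```python
import math
import numpy as np

def gumbel_max_sample(logits, n_samples, rng):
    """Draw categorical samples as argmax of logits plus standard Gumbel noise."""
    noise = rng.gumbel(size=(n_samples, logits.shape[0]))
    return np.argmax(logits + noise, axis=1)

def gumbel_mgf_estimate(t, tau, n_samples, rng):
    """Monte Carlo estimate of E[exp(tX)] for X ~ Gumbel(0, tau); needs tau * t < 1."""
    x = rng.gumbel(loc=0.0, scale=tau, size=n_samples)
    return float(np.mean(np.exp(t * x)))

rng = np.random.default_rng(0)
probs = np.array([0.2, 0.3, 0.5])
samples = gumbel_max_sample(np.log(probs), 200_000, rng)
freqs = np.bincount(samples, minlength=3) / samples.size   # close to probs

mgf_mc = gumbel_mgf_estimate(t=0.2, tau=1.0, n_samples=200_000, rng=rng)
mgf_exact = math.gamma(1.0 - 1.0 * 0.2)                     # Gamma(1 - tau * t)
```

The empirical frequencies should match `probs`, and `mgf_mc` should match `mgf_exact`, up to Monte Carlo error.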
Supplementary Material S1 Pseudocode Algorithm 1 gives pseudocode for autofocusing a broad class of model-based optimization (MBO)
"E-step" (Steps 1 and 2 in Algorithm 1) and a weighted maximum likelihood estimation (MLE) "M-step" (Step 3; see [ ( t 1) (t 1) One may use these in a number of different ways. The following observation is due to Chebyshev's inequality. One can use Proposition S2.1 to construct a confidence interval on, for example, the expected squared Note that 1) the bound in Proposition S2.1 is CbAS naturally controls the importance weight variance. Design procedures that leverage a trust region can naturally bound the variance of the importance weights. We used CbAS as follows.