A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning
Aravkin, Aleksandr, Davis, Damek
Abstract

In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed model on the uncontaminated data that remains. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorithm that incorporates prior knowledge through nonsmooth regularization.

Keywords: Stochastic algorithms · Nonsmooth, nonconvex optimization · Trimmed estimators

This work was funded by the Washington Research Foundation Data Science Professorship. This material is based upon work supported by the National Science Foundation under Award No. 1502405.

A. Aravkin, Department of Applied Mathematics, University of Washington, Seattle, WA 98195-4322, USA. Email: saravkin@uw.edu

1 Introduction

Potential outliers in datasets can be identified in several ways. For higher-dimensional data, several tests involving order statistics exist (so-called L-estimators [23]), such as the three-sigma rule for Gaussian data, or trimming strategies for disregarding points that are furthest away from the mean. After potential outliers are removed from a dataset, models are fit on the remaining data. After fitting the model, potential outliers are again identified and removed and another model is fit [33].
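The alternating remove-then-refit procedure described in the introduction can be sketched in a few lines. This is only an illustration of that classical loop, not the stochastic proximal-gradient algorithm the paper introduces. The function name `fit_with_iterative_trimming`, the least-squares model, and the choice to apply the three-sigma rule to residuals are all assumptions made for the example.

```python
import numpy as np


def fit_with_iterative_trimming(X, y, n_sigma=3.0, max_rounds=10):
    """Alternate between least-squares fitting and residual-based trimming.

    Hypothetical helper for illustration: the model (linear least squares)
    and the outlier test (three-sigma rule on residuals) are assumptions.
    """
    keep = np.ones(len(y), dtype=bool)  # start with every point kept
    coef = None
    for _ in range(max_rounds):
        # Fit the model on the points currently treated as clean.
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        residuals = y - X @ coef
        # Estimate center and spread from the kept points only.
        center = residuals[keep].mean()
        scale = residuals[keep].std()
        # Flag every point whose residual lies outside n_sigma spreads.
        new_keep = np.abs(residuals - center) <= n_sigma * scale
        if np.array_equal(new_keep, keep):
            break  # the kept set stopped changing
        keep = new_keep
    return coef, keep
```

Calling `fit_with_iterative_trimming(X, y)` returns the fitted coefficients and a boolean mask of the points retained. Because each round commits to a hard keep-or-drop decision before refitting, the loop treats detection and fitting as separate steps, which is the contrast with the simultaneous formulation stated in the abstract.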
Feb-5-2017
- Country:
- North America > United States > Washington > King County > Seattle (0.54)
- Genre:
- Research Report (1.00)
- Technology: