Performance Limits of Stochastic Sub-Gradient Learning, Part I: Single Agent Case

Ying, Bicheng, Sayed, Ali H.

arXiv.org Machine Learning 

The minimization of non-differentiable convex cost functions is a critical step in the solution of many design problems [3]-[5], including the design of sparse-aware (LASSO) solutions [6], [7], support-vector machine (SVM) learners [8]-[12], or total-variation-based image denoising solutions [13], [14]. Several powerful techniques have been proposed in the literature to deal with the non-differentiability aspect of the problem formulation, including methods that employ sub-gradient iterations [3]-[5], cutting-plane techniques [15], or proximal iterations [16], [17]. This work focuses on the class of sub-gradient methods for the reasons explained in the sequel. The sub-gradient technique is closely related to the traditional gradient-descent method [3], [4] where the actual gradient is replaced by a sub-gradient at points of nondifferentiability. It is one of the simplest methods in current practice but is known to suffer from slow convergence. For instance, it is shown in [5] that, for convex cost functions, the optimal convergence rate that can be delivered by sub-gradient methods in deterministic optimization problems cannot be faster than O(1/ i) under worst case conditions, where i is the iteration index.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found