proximal gradient method


Geometric Descent Method for Convex Composite Minimization

Neural Information Processing Systems

In this paper, we extend the geometric descent method recently proposed by Bubeck, Lee and Singh to tackle nonsmooth and strongly convex composite problems. We prove that our proposed algorithm, dubbed geometric proximal gradient method (GeoPG), converges with a linear rate $(1-1/\sqrt{\kappa})$ and thus achieves the optimal rate among first-order methods, where $\kappa$ is the condition number of the problem. Numerical results on linear regression and logistic regression with elastic net regularization show that GeoPG compares favorably with Nesterov's accelerated proximal gradient method, especially when the problem is ill-conditioned.
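GeoPG itself maintains two balls known to contain the minimizer and computes their minimum enclosing ball at every iteration; that machinery is beyond the scope of an abstract. As a minimal sketch of the proximal gradient building block it accelerates, here is plain proximal gradient on the elastic-net regression problem used in the experiments (Python/numpy; the names soft_threshold, lam1, lam2 are illustrative, and note that this unaccelerated method only attains the slower $(1-1/\kappa)$ rate that GeoPG improves to $(1-1/\sqrt{\kappa})$):

import numpy as np

def soft_threshold(z, t):
    # Elementwise proximal operator of t * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_grad_elastic_net(A, b, lam1, lam2, n_iter=500):
    # Minimize 0.5*||Ax - b||^2 + 0.5*lam2*||x||^2 + lam1*||x||_1.
    # The smooth part is lam2-strongly convex with gradient Lipschitz
    # constant L, so kappa = L / lam2 is the condition number in the
    # rates quoted above.
    L = np.linalg.norm(A, 2) ** 2 + lam2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b) + lam2 * x
        x = soft_threshold(x - grad / L, lam1 / L)  # one prox-gradient step
    return x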





Adaptive Accelerated Gradient Converging Method under Hölderian Error Bound Condition

Mingrui Liu, Tianbao Yang

Neural Information Processing Systems

Recent studies have shown that the proximal gradient (PG) method and the accelerated proximal gradient (APG) method with restarting can enjoy linear convergence under a condition weaker than strong convexity, namely a quadratic growth condition (QGC). However, the faster convergence of the restarted APG method relies on the potentially unknown constant in the QGC to restart APG appropriately, which restricts its applicability.
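As a rough illustration of the restarting scheme at issue, here is a fixed-schedule restarted FISTA sketch (Python/numpy; the helper signatures grad_f(y) and prox_h(v, step) are assumptions, and the hand-picked restart_every is precisely the quantity, tied to the unknown QGC constant, that the paper's adaptive method is designed to eliminate):

import numpy as np

def fista_restarted(grad_f, prox_h, L, x0, restart_every=100, n_restarts=10):
    # Accelerated proximal gradient (FISTA), restarted on a fixed schedule.
    x = x0.copy()
    for _ in range(n_restarts):
        y, t = x.copy(), 1.0  # restart: discard all accumulated momentum
        for _ in range(restart_every):
            x_next = prox_h(y - grad_f(y) / L, 1.0 / L)
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum step
            x, t = x_next, t_next
    return x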


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Please also discuss how this can be extended to the analysis of ADMM. This paper extends Tseng [20], Tseng and Yun ("A coordinate gradient descent method for nonsmooth separable minimization"), and Zhang et al. [22], which established the same result for the lasso and group lasso using the error-bound condition, to the trace norm. This is a non-trivial extension, but the contribution seems purely technical. The presentation of the proofs is mostly clear.
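For context, the computational primitive that changes when moving from the (group) lasso to the trace norm is the proximal mapping, which for the trace norm is soft-thresholding of the singular values. A minimal numpy sketch, not taken from the paper:

import numpy as np

def prox_trace_norm(Z, t):
    # Proximal operator of t * ||.||_* (trace/nuclear norm): shrink the
    # singular values of Z toward zero while keeping the singular vectors.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt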


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

The paper introduces a novel convex region-specific linear model called the partition-wise linear model. It assigns linear models to partitions of the input space, and linear combinations of these partition-specific models define the region-specific linear models. This construction allows the authors to build convex objective functions. They optimize both the regions and the predictors using sparsity-inducing structured penalties.
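A hypothetical toy rendering of the prediction rule described above (the names partition_wise_predict, indicators, and weights are illustrative, not from the paper): each partition contributes its linear model only where its activeness indicator fires, and the region-specific prediction is the sum of the active models.

import numpy as np

def partition_wise_predict(x, indicators, weights):
    # indicators[p](x) in {0.0, 1.0} says whether partition p is active at x;
    # weights[p] is that partition's linear model.
    return sum(g(x) * (w @ x) for g, w in zip(indicators, weights))

# Example: two axis-aligned half-space partitions of a 2-D input.
indicators = [lambda x: float(x[0] > 0), lambda x: float(x[1] > 0)]
weights = [np.array([1.0, 0.0]), np.array([0.0, -1.0])]
print(partition_wise_predict(np.array([0.5, -2.0]), indicators, weights))  # 0.5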


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

This paper investigates fast convergence properties of the proximal gradient method and the proximal Newton method under the assumption of Constant Nullspace Strong Convexity (CNSC). The problem of interest is to minimize the sum of two convex functions f(x) + h(x), where f is twice differentiable (smooth) and h can be nonsmooth but admits a simple proximal mapping. Under the CNSC assumption on f, and assuming h has the form of a decomposable norm, the paper shows global geometric convergence of the proximal gradient method and local quadratic convergence of the proximal Newton method. The writing of this paper is very clear.
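To make the setting concrete, here is a minimal sketch (an assumption of this summary, not code from the paper) of one proximal gradient step on f(x) + h(x), using the group-lasso penalty as a representative decomposable norm:

import numpy as np

def prox_group_lasso(z, t, groups):
    # Blockwise soft-thresholding: the proximal mapping of the group-lasso
    # penalty t * sum_g ||z_g||_2, one example of a decomposable norm.
    out = z.copy()
    for g in groups:
        nrm = np.linalg.norm(z[g])
        out[g] = 0.0 if nrm <= t else (1.0 - t / nrm) * z[g]
    return out

def prox_grad_step(x, grad_f, L, t, groups):
    # One proximal gradient step: a gradient step on the smooth f followed
    # by the cheap proximal mapping of the nonsmooth h.
    return prox_group_lasso(x - grad_f(x) / L, t / L, groups)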