AITopics | Gunn, Steve R.

A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization

Bi, Jia, Gunn, Steve R.

arXiv.org Artificial Intelligence, Feb-19-2021

In this paper, we propose a new technique, variance controlled stochastic gradient (VCSG), to improve the performance of the stochastic variance reduced gradient (SVRG) algorithm. To avoid over-reducing the variance of the gradient in SVRG, VCSG introduces a hyper-parameter $\lambda$ that controls how much of SVRG's variance reduction is applied. Theory shows that the optimization method can converge when using an unbiased gradient estimator, but in practice biased gradient estimation can allow more efficient convergence to the vicinity of a solution, since an unbiased approach is computationally more expensive. $\lambda$ also balances the trade-off between unbiased and biased estimation. In addition, to minimize the number of full gradient calculations in SVRG, a variance-bounded batch is introduced to reduce the number of gradient calculations required in each iteration. For smooth non-convex functions, the proposed algorithm converges to an approximate first-order stationary point (i.e. $\mathbb{E}\|\nabla{f}(x)\|^{2}\leq\epsilon$) within $\mathcal{O}(\min\{1/\epsilon^{3/2},n^{1/4}/\epsilon\})$ stochastic gradient evaluations, which improves on the leading gradient complexity of the stochastic gradient-based method SCS ($\mathcal{O}(\min\{1/\epsilon^{5/3},n^{2/3}/\epsilon\})$). It is shown theoretically and experimentally that VCSG can be deployed to improve convergence.
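The role of $\lambda$ described in this abstract can be illustrated with a small sketch. The blending rule below, $v = (1-\lambda)\nabla f_i(x) - \lambda(\nabla f_i(\tilde{x}) - \nabla f(\tilde{x}))$, is an assumption chosen for illustration and is not confirmed by the abstract as the paper's exact estimator; the least-squares problem and all function names are likewise hypothetical. Under this assumed rule the estimator's mean is $(1-\lambda)\nabla f(x)$, so $\lambda=0$ gives plain unbiased SGD and larger $\lambda$ trades bias for more of the SVRG-style correction. The variance-bounded batch is not modelled here.

```python
import numpy as np

def make_least_squares(n=50, d=5, seed=0):
    """Toy problem (hypothetical): f(x) = (1/2n) * sum_i (a_i^T x - b_i)^2."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(n, d)), rng.normal(size=n)

def grad_i(A, b, x, i):
    """Gradient of the i-th component 0.5 * (a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(A, b, x):
    """Full gradient, i.e. the average of all component gradients."""
    return A.T @ (A @ x - b) / len(b)

def blended_estimate(A, b, x, snapshot, mu, i, lam):
    """Assumed lambda-weighted SVRG-style estimator (illustrative only).

    mu is the full gradient at the snapshot point.
    """
    return (1.0 - lam) * grad_i(A, b, x, i) - lam * (grad_i(A, b, snapshot, i) - mu)

def run(A, b, lam=0.1, step=0.01, epochs=30, inner=50, seed=1):
    """SVRG-style outer/inner loop using the blended estimator."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(epochs):
        snapshot = x.copy()               # reference point for this epoch
        mu = full_grad(A, b, snapshot)    # one full gradient per epoch
        for _ in range(inner):
            i = rng.integers(len(b))      # sample one component uniformly
            x = x - step * blended_estimate(A, b, x, snapshot, mu, i, lam)
    return x
```

Averaging `blended_estimate` over every index `i` recovers `(1 - lam) * full_grad(A, b, x)`, which makes the bias introduced by the assumed rule explicit.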

artificial intelligence, machine learning, optimization, (18 more...)

arXiv.org Artificial Intelligence

2102.09893

Country:

Europe (0.93)
North America > United States > New York (0.14)
North America > Canada > British Columbia (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)


A Stochastic Gradient Method with Biased Estimation for Faster Nonconvex Optimization

Bi, Jia, Gunn, Steve R.

arXiv.org Machine Learning, May-13-2019

A number of approaches have been proposed for optimizing nonconvex objectives (e.g. deep learning models), such as batch gradient descent, stochastic gradient descent and stochastic variance reduced gradient descent. Theory shows that these methods can converge when using an unbiased gradient estimator. In practice, however, biased gradient estimation can allow more efficient convergence to the vicinity of a solution, since an unbiased approach is computationally more expensive. Fast convergence therefore involves two trade-offs: between stochastic and batch updates, and between biased and unbiased estimation. This paper proposes an integrated approach that controls the nature of the stochastic element in the optimizer and uses a hyper-parameter to balance the estimator between biased and unbiased. It is shown theoretically and experimentally that this hyper-parameter can be configured to provide an effective balance that improves the convergence rate.

deep learning, learning rate, neural network, (20 more...)

arXiv.org Machine Learning

1905.05185

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)


Towards Pareto Descent Directions in Sampling Experts for Multiple Tasks in an On-Line Learning Paradigm

Ghosh, Shaona (University of Southampton,UK) | Lovell, Chris (University of Southampton) | Gunn, Steve R. (University of Southampton)

AAAI Conferences, Mar-21-2013

In many real-life design problems, there is a requirement to simultaneously balance multiple conflicting tasks or objectives in a system, where minimizing one objective causes another to increase in value, resulting in trade-offs between the objectives. For example, in embedded multi-core mobile devices and very large scale data centers, there is a continuous problem of simultaneously balancing the interfering goals of maximal power savings and minimal performance delay, with varying trade-off values for the different application workloads executing on them. Typically, the optimal trade-offs for the executing workloads lie on an optimal Pareto front that is difficult to determine. The nature of the problem requires learning over the lifetime of the mobile device or server, with continuous evaluation and prediction of the trade-off settings that balance the interfering objectives optimally. Towards this, we propose an on-line learning method in which the weights of experts for addressing the objectives are updated based on a convex combination of their relative performance in addressing all objectives simultaneously. An additional importance vector, which assigns relative importance to each objective at every round, is sampled from a convex cone pointed at the origin. Our preliminary results show that the convex combination of the importance vector and the gradient of the potential functions of the learner's regret with respect to each objective ensures that in the next round the drift (instantaneous regret vector) is the Pareto descent direction that enables better convergence to the optimal Pareto front.
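The weight update outlined in this abstract can be sketched in a simplified form. The code below is an assumption-laden illustration, not the authors' method: it swaps in a standard exponential-weights (Hedge-style) update in place of the potential-function gradient the abstract mentions, draws the importance vector from the nonnegative orthant (one example of a convex cone pointed at the origin), and normalizes it so that it forms a convex combination. The function names, the learning rate `eta`, and the loss matrix are all hypothetical.

```python
import numpy as np

def sample_importance(rng, num_objectives):
    """Draw a nonnegative importance vector and normalize it to sum to 1.

    Assumption: the cone is the nonnegative orthant.
    """
    raw = rng.random(num_objectives) + 1e-12  # small offset avoids an all-zero draw
    return raw / raw.sum()

def update_weights(weights, losses, alpha, eta=0.5):
    """One round of a Hedge-style update on scalarized losses.

    weights: shape (K,), current distribution over K experts
    losses:  shape (K, M), loss of each expert on each of M objectives
    alpha:   shape (M,), convex importance weights over the objectives
    """
    scalar_loss = losses @ alpha                  # convex combination per expert
    new = weights * np.exp(-eta * scalar_loss)    # penalize experts with high loss
    return new / new.sum()                        # renormalize to a distribution

def run_rounds(loss_rounds, eta=0.5, seed=0):
    """Apply the update over a sequence of (K, M) loss matrices."""
    rng = np.random.default_rng(seed)
    num_experts, num_objectives = loss_rounds[0].shape
    weights = np.full(num_experts, 1.0 / num_experts)
    for losses in loss_rounds:
        alpha = sample_importance(rng, num_objectives)
        weights = update_weights(weights, losses, alpha, eta)
    return weights
```

In this sketch, an expert that does better on every objective gains weight regardless of which importance vector is sampled; experts that trade one objective against another are favoured or penalized depending on the draw.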

objective, teaching methods, teaching method, (9 more...)

AAAI Conferences

2013 AAAI Spring Symposium Series

Genre: Instructional Material > Online (0.64)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.73)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.64)


