[1606.04474] Learning to learn by gradient descent by gradient descent • /r/MachineLearning

Jun-15-2016, 03:10:13 GMT–@machinelearnbot

One thing I'm not sure about is how fair their comparison is. They fix the global learning rate for the "hand-designed" algorithms and choose it by grid search. However, we know that in most problems we can start with a larger learning rate and decay it over time once the loss plateaus. The issue with not considering this is that the best global learning rate for the whole run would probably be one that is very slow at first but eventually outperforms the faster ones. Nevertheless, this is interesting work, although I'm still quite skeptical that such optimizers will generalize well to large models.
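To make the concern concrete, here is a minimal Python sketch of the two baselines being contrasted: a single fixed learning rate picked from a grid, versus a rate that starts large and is decayed. Everything in it is an assumption made for illustration and is not taken from the paper: the toy loss f(w) = 0.5 * w^2, the Gaussian gradient noise, the grid values, and the helper names `run_sgd`, `fixed`, `step_decay`, and `average_over_seeds`. The decay runs on a fixed timer, a simplified stand-in for "decay once the loss plateaus".

```python
import random


def run_sgd(lr_schedule, steps=2000, w0=5.0, noise_std=1.0, seed=0):
    """Run SGD on the toy loss f(w) = 0.5 * w**2 with noisy gradients.

    lr_schedule maps the step index to a learning rate.
    Returns the mean loss over the final 100 steps.
    """
    rng = random.Random(seed)
    w = w0
    tail = []
    for t in range(steps):
        grad = w + rng.gauss(0.0, noise_std)  # true gradient plus noise
        w -= lr_schedule(t) * grad
        if t >= steps - 100:
            tail.append(0.5 * w * w)
    return sum(tail) / len(tail)


def fixed(lr):
    """Constant learning rate, as in a grid-searched baseline."""
    return lambda t: lr


def step_decay(lr0, every=200, factor=0.5):
    """Multiply the rate by `factor` every `every` steps.

    Hypothetical stand-in for plateau-based decay: it uses a timer
    rather than watching the loss.
    """
    return lambda t: lr0 * factor ** (t // every)


def average_over_seeds(schedule, seeds=range(20)):
    """Average the final loss over several seeds to reduce noise."""
    return sum(run_sgd(schedule, seed=s) for s in seeds) / len(seeds)


if __name__ == "__main__":
    for lr in (0.5, 0.1, 0.02, 0.004):  # illustrative grid, not the paper's
        print(f"fixed lr={lr}: {average_over_seeds(fixed(lr)):.5f}")
    print(f"step decay from 0.5: {average_over_seeds(step_decay(0.5)):.5f}")
```

In this toy setting a large fixed rate gets near the minimum quickly but then stalls at a loss level set by the gradient noise, while a small fixed rate ends lower but approaches slowly. Whether a decayed schedule beats the best fixed rate from the grid depends on the step budget and the starting point, so a comparison against fixed-rate baselines alone may not be the strongest test of a learned optimizer.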

artificial intelligence, gradient descent, machine learning
