r/MachineLearning - [D] Gradient Descent on (deterministic) Mean Absolute Error (L1 loss)

Jan-21-2019, 14:28:35 GMT–#artificialintelligence

Gradient-based optimization of absolute errors is tricky, since the gradient "never" goes to zero: its magnitude stays the same no matter how close you get to the minimum. In theory, adaptive methods should be able to damp the resulting oscillations so that the iterates converge to the minimum. However, I found that none of the 'standard' methods were able to do this "out of the box". Learning rate decay could alleviate the problem, but it needs manual tuning, which I would rather avoid. Does anyone know of a method that can do this?
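To make the issue concrete, here is a minimal sketch in plain Python. It fits a single constant `theta` to a toy dataset under full-batch (deterministic) MAE, whose minimizer is the median. The dataset, the step size of 0.7, and the 1/sqrt(t) decay schedule are all assumptions picked for illustration, not anything from an actual experiment. The decay schedule is exactly the sort of hand-tuned fix the question is trying to avoid, so treat this as a picture of the problem rather than an answer to it.

```python
import statistics

def mae_subgradient(theta, data):
    # Subgradient of mean(|theta - x_i|) w.r.t. theta: the mean of sign(theta - x_i).
    # Its magnitude does not shrink as theta approaches the minimizer.
    return sum((theta > x) - (theta < x) for x in data) / len(data)

def descend(data, lr, steps, decay=False, theta=0.0):
    # Full-batch (sub)gradient descent; optionally scale the step by 1/sqrt(t).
    for t in range(1, steps + 1):
        step = lr / t ** 0.5 if decay else lr
        theta -= step * mae_subgradient(theta, data)
    return theta

# Toy data (assumed for illustration); the MAE minimizer is the median.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median = statistics.median(data)

theta_const = descend(data, lr=0.7, steps=1000)              # fixed step size
theta_decay = descend(data, lr=0.7, steps=1000, decay=True)  # decaying step size

print(f"median      = {median}")
print(f"constant lr = {theta_const:.4f} (error {abs(theta_const - median):.4f})")
print(f"decayed lr  = {theta_decay:.4f} (error {abs(theta_decay - median):.4f})")
```

With the fixed step, the iterate ends up hopping back and forth across the median by a constant amount, because the subgradient near the minimum is always ±0.2 for this dataset and never gets smaller. With the decaying step the hops shrink over time, but only because the schedule forces them to.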

