ReLU activated Multi-Layer Neural Networks trained with Mixed Integer Linear Programs

Goebbels, Steffen

arXiv.org Machine Learning 

Neural networks typically learn by adjusting weights via nonlinear optimization in a training phase; often, variants of gradient descent are used. These techniques require some degree of differentiability. Therefore, non-smooth but piecewise linear activation functions like ReLU or the Heaviside function raise the question of whether techniques from linear and mixed integer linear programming are also suited for network training. Learning to near optimality can be performed with Linear Programs (LP) of exponential size for certain network architectures, see [2].
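As an illustration of the idea, a single ReLU unit can be trained with a mixed integer linear program using the standard big-M encoding of the ReLU function (one binary variable per sample indicating whether the unit is active). This is a minimal sketch, not the formulation of the paper: the toy data, the big-M constant `M`, and the absolute-error objective are assumptions chosen for illustration, and `scipy.optimize.milp` stands in for a generic MILP solver.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy data: targets generated by ReLU(x), i.e. the exact fit is w=1, b=0.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 2.0])
n = len(x)
M = 10.0  # big-M constant, assumed to bound all pre-activations

# Variable layout: [w, b, o_0..o_{n-1}, z_0..z_{n-1}, e_0..e_{n-1}]
# o_i = ReLU output, z_i = binary activity indicator, e_i = absolute error.
nv = 2 + 3 * n
W, B = 0, 1
def O(i): return 2 + i
def Z(i): return 2 + n + i
def E(i): return 2 + 2 * n + i

A, lb, ub = [], [], []
def add(coeffs, lo, hi):
    row = np.zeros(nv)
    for j, c in coeffs:
        row[j] = c
    A.append(row); lb.append(lo); ub.append(hi)

for i in range(n):
    # o_i >= w*x_i + b  (output dominates the pre-activation)
    add([(O(i), 1.0), (W, -x[i]), (B, -1.0)], 0.0, np.inf)
    # o_i <= w*x_i + b + M*(1 - z_i)  (tight when the unit is active, z_i = 1)
    add([(O(i), 1.0), (W, -x[i]), (B, -1.0), (Z(i), M)], -np.inf, M)
    # o_i <= M*z_i  (forces o_i = 0 when the unit is inactive, z_i = 0)
    add([(O(i), 1.0), (Z(i), -M)], -np.inf, 0.0)
    # e_i >= |o_i - y_i| via two linear constraints
    add([(E(i), 1.0), (O(i), -1.0)], -y[i], np.inf)
    add([(E(i), 1.0), (O(i), 1.0)], y[i], np.inf)

c = np.zeros(nv)
for i in range(n):
    c[E(i)] = 1.0  # minimize the total absolute training error

lo = np.concatenate([[-M, -M], np.zeros(n), np.zeros(n), np.zeros(n)])
hi = np.concatenate([[M, M], np.full(n, M), np.ones(n), np.full(n, M)])
integrality = np.zeros(nv)
for i in range(n):
    integrality[Z(i)] = 1  # z_i is binary

res = milp(c, constraints=LinearConstraint(np.array(A), lb, ub),
           integrality=integrality, bounds=Bounds(lo, hi))
print(res.fun, res.x[W], res.x[B])
```

On this data the solver recovers the generating parameters (w = 1, b = 0) with zero training error. The construction extends to multi-layer networks at the cost of one binary variable per ReLU per sample, which is what makes exact MILP training expensive.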
