An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks

Feb-20-2020–arXiv.org Machine Learning

Deep networks have achieved remarkable performance in several application domains such as computer vision, natural language processing and genomics (Krizhevsky et al. 2012, Pennington et al. 2014, Duvenaud et al. 2015). A deep network can be framed as a chain of compositions of modules, where each module is typically the composition of a nonlinear function and an affine transformation. The last module in the chain is usually task-specific and can be expressed either in analytical form, as in supervised classification, or as the solution of an optimization problem, as in dimension reduction or clustering. Training a deep network is often framed as a non-convex optimization problem, a view that dismisses the structure of the objective even though that structure is central to the software implementation. Indeed, the optimization algorithms used to train deep networks proceed by calling first-order (or second-order) oracles that rely on dynamic programming, such as gradient back-propagation (Werbos 1994, Rumelhart et al. 1986, LeCun 1988).
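
The chain structure and the back-propagation oracle described above can be illustrated with a minimal sketch. It is not taken from the paper: the tanh nonlinearity, the squared-error loss standing in for the task-specific last module, the toy parameter values, and the function names `affine`, `forward`, `loss` and `backward` are all assumptions chosen for illustration.

```python
import math

def affine(W, b, x):
    # Affine transformation W x + b, with W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(params, x):
    # Chain of modules: each applies a nonlinearity (tanh, assumed here)
    # to an affine map of the previous output. Intermediate outputs are
    # stored so the backward pass can reuse them.
    states = [x]
    for W, b in params:
        states.append([math.tanh(z) for z in affine(W, b, states[-1])])
    return states

def loss(params, x, y):
    # Squared error on the final output (an assumed task-specific objective).
    out = forward(params, x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))

def backward(params, x, y):
    # First-order oracle via gradient back-propagation: a single backward
    # sweep over the chain that reuses the stored forward states.
    states = forward(params, x)
    delta = [o - t for o, t in zip(states[-1], y)]  # d loss / d final output
    grads = []
    layers = zip(reversed(params), reversed(states[:-1]), reversed(states[1:]))
    for (W, b), inp, out in layers:
        dz = [d * (1.0 - o * o) for d, o in zip(delta, out)]  # through tanh
        grad_W = [[dzi * xj for xj in inp] for dzi in dz]
        grads.append((grad_W, dz))  # dz is also the gradient w.r.t. b
        # Propagate to the module's input: delta = W^T dz.
        delta = [sum(W[i][j] * dz[i] for i in range(len(dz)))
                 for j in range(len(inp))]
    return grads[::-1]

# Toy two-module chain (values are arbitrary).
params = [([[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1]),
          ([[1.0, -1.0]], [0.2])]
x, y = [1.0, 2.0], [0.5]
grads = backward(params, x, y)
```

Each entry of `grads` pairs the gradient with respect to a module's weight matrix with the gradient with respect to its bias. The backward sweep costs about as much as one forward pass because every intermediate quantity is computed once and reused, which is the dynamic-programming aspect the abstract refers to.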


