Scalable Classifiers with ADMM and Transpose Reduction
Taylor, Gavin (United States Naval Academy) | Xu, Zheng (University of Maryland) | Goldstein, Tom (University of Maryland)
As datasets for machine learning grow larger, parallelization strategies become more and more important. Recent approaches to distributed model fitting rely heavily either on consensus ADMM, where each node solves small sub-problems using only local data, or on stochastic gradient methods that don't scale well to large numbers of cores in a cluster setting. For this reason, GPU clusters have become common prerequisites to large-scale machine learning. This paper describes an unconventional training method that uses alternating direction methods and Bregman iteration to train a variety of machine learning models on CPUs while avoiding the drawbacks of consensus methods and without gradient descent steps. Using transpose reduction strategies, the proposed method reduces the optimization problems to a sequence of minimization sub-steps that can each be solved globally in closed form. The method provides strong scaling in the distributed setting, yielding linear speedups even when split over thousands of cores.
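The closed-form sub-steps rest on a transpose-reduction observation: when the data matrix D is split row-wise across nodes, the Gram matrix D^T D and the product D^T b decompose into sums of small per-node terms, so only these compact quantities need to be communicated before solving the sub-problem exactly. The sketch below illustrates this for a regularized least-squares sub-step; the function names, the single-process "reduce", and the ridge parameter are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the transpose-reduction idea for a distributed
# least-squares sub-step (assumed form; stands in for an MPI reduce).
import numpy as np

def local_statistics(D_i, b_i):
    """Per-node work: form D_i^T D_i and D_i^T b_i from local data only."""
    return D_i.T @ D_i, D_i.T @ b_i

def solve_reduced(stats, lam=1e-3):
    """Closed-form solve of the aggregated (regularized) normal equations."""
    G = sum(g for g, _ in stats)   # sum of local Gram matrices = D^T D
    r = sum(v for _, v in stats)   # sum of local products      = D^T b
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), r)

# Toy usage: data distributed across four "nodes" as row blocks of a tall matrix.
rng = np.random.default_rng(0)
blocks = [(rng.standard_normal((500, 20)), rng.standard_normal(500)) for _ in range(4)]
stats = [local_statistics(D_i, b_i) for D_i, b_i in blocks]   # map step (local work)
x = solve_reduced(stats)                                      # reduce + closed-form solve
```

Because each node communicates only a d-by-d matrix and a length-d vector rather than its raw data, the communication cost is independent of the number of local training examples, which is what makes strong scaling across many CPU cores plausible.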
Feb-4-2017
- Country:
- North America > United States > Maryland (0.28)
- Genre:
- Research Report (0.48)
- Industry:
- Government > Military (0.46)
- Technology: