
Collaborating Authors

 Taylor, Gavin


Visualizing the Loss Landscape of Neural Nets

arXiv.org Machine Learning

Neural network training relies on our ability to find "good" minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that are easier to train, and that well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better. However, the reasons for these differences, and their effect on the underlying loss landscape, are not well understood. In this paper, we explore the structure of neural loss functions, and the effect of loss landscapes on generalization, using a range of visualization methods. First, we introduce a simple "filter normalization" method that helps us visualize loss function curvature and make meaningful side-by-side comparisons between loss functions. Then, using a variety of visualizations, we explore how network architecture affects the loss landscape, and how training parameters affect the shape of minimizers.
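The filter normalization idea described above can be sketched in a few lines: each filter of a random direction is rescaled to have the same norm as the corresponding filter of the trained weights, so that loss slices along that direction are comparable across networks. This is an illustrative NumPy sketch under assumed shapes; the function name is not from the paper's code.

```python
import numpy as np

def filter_normalize(direction, weights):
    """Rescale each filter of a random direction so its norm matches the
    norm of the corresponding weight filter (the "filter normalization"
    idea). Both arrays have shape (num_filters, ...) for one layer."""
    d = direction.reshape(direction.shape[0], -1)
    w = weights.reshape(weights.shape[0], -1)
    # d_i <- d_i * ||w_i|| / ||d_i|| for every filter i
    scale = np.linalg.norm(w, axis=1) / (np.linalg.norm(d, axis=1) + 1e-10)
    return (d * scale[:, None]).reshape(direction.shape)

# A 1-D loss slice would then be loss(alpha) = L(theta + alpha * delta_hat),
# where delta_hat = filter_normalize(delta, theta) for a random delta.
```

After normalization, per-filter norms of the direction match the weights exactly, which is what makes curvature comparisons between minimizers meaningful.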


Variance Reduction for Distributed Stochastic Gradient Descent

arXiv.org Machine Learning

Variance reduction (VR) methods boost the performance of stochastic gradient descent (SGD) by enabling the use of larger, constant stepsizes and preserving linear convergence rates. However, current variance reduced SGD methods require either high memory usage or an exact gradient computation (using the entire dataset) at the end of each epoch. This limits the use of VR methods in practical distributed settings. In this paper, we propose a variance reduction method, called VR-lite, that does not require full gradient computations or extra storage. We explore distributed synchronous and asynchronous variants that are scalable and remain stable with low communication frequency. We empirically compare both the sequential and distributed algorithms to state-of-the-art stochastic optimization methods, and find that our proposed algorithms perform favorably to other stochastic methods.
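For contrast, the full-gradient computation that VR-lite is designed to avoid is the defining step of classic variance reduction methods such as SVRG. The sketch below is the standard SVRG loop, not VR-lite itself; the names and signatures are illustrative.

```python
import numpy as np

def svrg(grad_i, x0, n, step, epochs, inner):
    """Classic SVRG. Each epoch computes a full gradient over all n samples
    at a snapshot point -- exactly the O(n) cost the abstract says limits
    VR methods in distributed settings.

    grad_i(x, i): gradient of the i-th sample's loss at x (returns array).
    """
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        full_grad = sum(grad_i(snapshot, i) for i in range(n)) / n  # O(n) pass
        for _ in range(inner):
            i = np.random.randint(n)
            # variance-reduced gradient: unbiased, with shrinking variance
            g = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x -= step * g
    return x
```

The correction term `grad_i(x, i) - grad_i(snapshot, i) + full_grad` is what permits the large, constant stepsize mentioned in the abstract.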


Scalable Classifiers with ADMM and Transpose Reduction

AAAI Conferences

As datasets for machine learning grow larger, parallelization strategies become more and more important. Recent approaches to distributed model fitting rely heavily either on consensus ADMM, where each node solves small sub-problems using only local data, or on stochastic gradient methods that don't scale well to large numbers of cores in a cluster setting. For this reason, GPU clusters have become common prerequisites to large-scale machine learning. This paper describes an unconventional training method that uses alternating direction methods and Bregman iteration to train a variety of machine learning models on CPUs while avoiding the drawbacks of consensus methods and without gradient descent steps. Using transpose reduction strategies, the proposed method reduces the optimization problems to a sequence of minimization sub-steps that can each be solved globally in closed form. The method provides strong scaling in the distributed setting, yielding linear speedups even when split over thousands of cores.
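A minimal example of ADMM with globally closed-form sub-steps, in the spirit of the abstract, is the textbook lasso splitting: the x-update is a linear solve and the z-update is a soft threshold. This is the generic splitting, not the paper's transpose-reduction method.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=300):
    """ADMM for minimize 0.5*||Ax - b||^2 + lam*||x||_1.
    Every sub-step is solved globally in closed form, echoing the
    strategy in the abstract (textbook splitting, for illustration)."""
    m, n = A.shape
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    AtA = A.T @ A + rho * np.eye(n)   # cached for the closed-form x-step
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(AtA, Atb + rho * (z - u))                     # x-step: linear solve
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)   # z-step: soft threshold
        u += x - z                                                        # dual update
    return z
```

Because both sub-steps are closed-form, no gradient descent steps or stepsize tuning are needed, which is the property the paper exploits at scale.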


Introduction to the Symposium on AI and the Mitigation of Human Error

AAAI Conferences

However, foundational problems remain in the continuing development of AI for team autonomy, especially with objective measures able to optimize team function, performance, and composition. AI approaches often attempt to address autonomy by modeling aspects of human decision-making or behavior, either mindfully or inadvertently, by individuals or teams of humans. One worry about this bright future is that jobs may be lost; from Mims (2015), "Something potentially momentous is happening inside AI startups, and it's a practice that many of their established..."



Reports on the 2015 AAAI Spring Symposium Series

AI Magazine

The AAAI 2015 Spring Symposium Series was held Monday through Wednesday, March 23-25, at Stanford University near Palo Alto, California. The titles of the seven symposia were Ambient Intelligence for Health and Cognitive Enhancement; Applied Computational Game Theory; Foundations of Autonomy and Its (Cyber) Threats: From Individuals to Interdependence; Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches; Logical Formalizations of Commonsense Reasoning; Socio-Technical Behavior Mining: From Data to Decisions; Structured Data for Humanitarian Technologies: Perfect Fit or Overkill?; and Turn-Taking and Coordination in Human-Machine Interaction. The highlights of each symposium are presented in this report.


Value Function Approximation in Noisy Environments Using Locally Smoothed Regularized Approximate Linear Programs

arXiv.org Machine Learning

Recently, Petrik et al. demonstrated that L1-Regularized Approximate Linear Programming (RALP) could produce value functions and policies which compared favorably to established linear value function approximation techniques like LSPI. RALP's success primarily stems from the ability to solve the feature selection and value function approximation steps simultaneously. RALP's performance guarantees become looser if sampled next states are used. For very noisy domains, RALP requires an accurate model rather than samples, which can be unrealistic in some practical scenarios. In this paper, we demonstrate this weakness, and then introduce Locally Smoothed L1-Regularized Approximate Linear Programming (LS-RALP). We demonstrate that LS-RALP mitigates inaccuracies stemming from noise even without an accurate model. We show that, given some smoothness assumptions, as the number of samples increases, error from noise approaches zero, and provide experimental examples of LS-RALP's success on common reinforcement learning benchmark problems.
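The local smoothing idea can be illustrated with a simple kernel smoother applied to noisy sampled values before they enter the linear program. The Gaussian kernel, bandwidth, and function name here are assumptions for illustration, not the paper's exact estimator.

```python
import numpy as np

def smooth_samples(states, targets, bandwidth=0.5):
    """Nadaraya-Watson kernel smoothing of noisy sampled targets: each
    sample's value is replaced by a locally weighted average of nearby
    samples, the kind of local averaging that suppresses sampling noise
    under smoothness assumptions (illustrative, 1-D state space)."""
    states = np.asarray(states, float)
    targets = np.asarray(targets, float)
    out = np.empty_like(targets)
    for i, s in enumerate(states):
        w = np.exp(-0.5 * ((states - s) / bandwidth) ** 2)  # Gaussian kernel
        out[i] = w @ targets / w.sum()
    return out
```

As the number of samples grows, such local averages concentrate around the true value function, matching the abstract's claim that error from noise vanishes in the limit.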


Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes

arXiv.org Artificial Intelligence

Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using $L_1$ regularization in approximate linear programming. Because the proposed method can automatically select the appropriate richness of features, its performance does not degrade with an increasing number of features. These results rely on new and stronger sampling bounds for regularized approximate linear programs. We also propose a computationally efficient homotopy method. The empirical evaluation of the approach shows that the proposed method performs well on simple MDPs and standard benchmark problems.
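The automatic feature selection behavior can be seen in a toy setting: for an orthonormal design, the L1-regularized least-squares solution is an explicit soft threshold, so the solution path over the regularization weight can be traced directly, and features drop out as regularization grows. This toy is illustrative only; it is not the paper's ALP formulation or homotopy method.

```python
import numpy as np

def l1_path_orthonormal(Atb, lams):
    """For an orthonormal design A, the minimizer of
    0.5*||Ax - b||^2 + lam*||x||_1 is soft-threshold(A^T b, lam),
    so the regularization path is available in closed form. Larger lam
    zeroes out more coefficients -- the feature-selection effect."""
    Atb = np.asarray(Atb, float)
    path = []
    for lam in lams:
        coef = np.sign(Atb) * np.maximum(np.abs(Atb) - lam, 0.0)
        path.append(coef)
    return np.array(path)
```

Sweeping lam from small to large prunes weak features first, which is why performance need not degrade as the provided feature set grows richer.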