Dropout Regularization in Hierarchical Mixture of Experts
Dropout is an effective method for preventing overfitting and has become the go-to regularizer for multi-layer neural networks in recent years. The hierarchical mixture of experts is a hierarchically gated model that defines a soft decision tree: leaves correspond to experts, decision nodes correspond to gating models that softly choose between their children, and the model as a whole defines a soft hierarchical partitioning of the input space. In this work, we propose a variant of dropout for hierarchical mixtures of experts that is faithful to the tree hierarchy defined by the model, as opposed to the flat, unitwise independent application of dropout used with multi-layer perceptrons. We show on synthetic regression data and on the MNIST and CIFAR-10 datasets that our proposed dropout mechanism prevents overfitting on trees with many levels, improving generalization and providing smoother fits.
Dec-25-2018
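To make the idea concrete, the following is a minimal sketch (not the paper's actual method) of a soft decision tree in which dropout is applied per subtree rather than per unit: at each internal node, with some probability an entire child subtree is dropped and all responsibility is routed to the surviving child. All names, the linear form of the experts and gates, and the specific drop rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class Node:
    """One node of a soft decision tree (hierarchical mixture of experts)."""

    def __init__(self, depth, in_dim):
        self.leaf = depth == 0
        if self.leaf:
            # Leaf expert: a linear model (an illustrative choice of expert).
            self.w = rng.normal(scale=0.1, size=in_dim)
        else:
            # Gating model: sigmoid over a linear score of the input.
            self.v = rng.normal(scale=0.1, size=in_dim)
            self.left = Node(depth - 1, in_dim)
            self.right = Node(depth - 1, in_dim)

    def forward(self, x, drop_p=0.0, training=False):
        if self.leaf:
            return float(self.w @ x)
        g = 1.0 / (1.0 + np.exp(-self.v @ x))  # soft choice between children
        if training and drop_p > 0.0 and rng.random() < drop_p:
            # Hierarchy-aware dropout (sketch): drop one whole subtree and
            # route all responsibility to the surviving child, instead of
            # zeroing units independently as in a flat MLP.
            child = self.left if rng.random() < 0.5 else self.right
            return child.forward(x, drop_p, training)
        return (g * self.left.forward(x, drop_p, training)
                + (1.0 - g) * self.right.forward(x, drop_p, training))

# A depth-3 tree over 4-dimensional inputs.
tree = Node(depth=3, in_dim=4)
x = rng.normal(size=4)
y_train = tree.forward(x, drop_p=0.3, training=True)   # stochastic pass
y_eval = tree.forward(x, training=False)               # deterministic pass
```

At evaluation time no subtree is dropped, so the full soft mixture is used and repeated calls on the same input give the same output.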