Goto

Collaborating Authors

 dropmax


DropMax: Adaptive Variational Softmax

Neural Information Processing Systems

We propose DropMax, a stochastic version of softmax classifier which at each iteration drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has an effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, the learning of dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout probabilities shows that our model indeed selects confusing classes more often when it performs classification.



DropMax: Adaptive Variational Softmax

Hae Beom Lee, Juho Lee, Saehoon Kim, Eunho Yang, Sung Ju Hwang

Neural Information Processing Systems

We propose DropMax, a stochastic version of softmax classifier which at each iteration drops non-target classes according to dropout probabilities adaptively decided for each instance.


DropMax: Adaptive Variational Softmax

Neural Information Processing Systems

We propose DropMax, a stochastic version of softmax classifier which at each iteration drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has an effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, the learning of dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout probabilities shows that our model indeed selects confusing classes more often when it performs classification.


Reviews: DropMax: Adaptive Variational Softmax

Neural Information Processing Systems

This paper proposes doing dropout in the output softmax layer during supervised training of neural net classifiers. The dropout probabilities are adapted per example. The probabilities are computed as a function of the penultimate layer of the classifier. So that layer is used to compute both: the logits, and the gating for those logits. This model combines ideas from adaptive dropout (Ba and Frey NIPS'13) and variational dropout (Kingma et al). The key problem being solved is how to do inference to get the optimal dropout probabilities.


DropMax: Adaptive Variational Softmax

Lee, Hae Beom, Lee, Juho, Kim, Saehoon, Yang, Eunho, Hwang, Sung Ju

Neural Information Processing Systems

We propose DropMax, a stochastic version of softmax classifier which at each iteration drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has an effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, the learning of dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines.


DropMax: Adaptive Variational Softmax

Lee, Hae Beom, Lee, Juho, Kim, Saehoon, Yang, Eunho, Hwang, Sung Ju

Neural Information Processing Systems

We propose DropMax, a stochastic version of softmax classifier which at each iteration drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has an effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, the learning of dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout probabilities shows that our model indeed selects confusing classes more often when it performs classification.


DropMax: Adaptive Variational Softmax

Lee, Hae Beom, Lee, Juho, Kim, Saehoon, Yang, Eunho, Hwang, Sung Ju

Neural Information Processing Systems

We propose DropMax, a stochastic version of softmax classifier which at each iteration drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has an effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, the learning of dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout probabilities shows that our model indeed selects confusing classes more often when it performs classification.