Joint Inference for Neural Network Depth and Dropout Regularization
Kishan K C (1), Rui Li (1), Mahdi Gilany
(1) Rochester Institute of Technology
Neural Information Processing Systems
Dropout regularization methods prune a neural network's pre-determined backbone structure to avoid overfitting. However, a deep model still tends to be poorly calibrated, with high confidence on incorrect predictions. We propose a unified Bayesian model selection method that jointly infers the most plausible network depth warranted by the data and performs dropout regularization at the same time. In particular, to infer network depth, we define a beta process over the number of hidden layers, which allows the depth to go to infinity. Layer-wise activation probabilities induced by the beta process modulate neuron activation via binary vectors of a conjugate Bernoulli process. Experiments across domains show that, by adapting network depth and dropout regularization to data, our method achieves superior performance compared to state-of-the-art methods, with well-calibrated uncertainty estimates. In continual learning, our method enables neural networks to dynamically evolve their depths to accommodate incrementally available data beyond their initial structures, and alleviates catastrophic forgetting.
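To make the beta-Bernoulli mechanism described above more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a stick-breaking construction for the layer-wise activation probabilities (pi_l is the running product of Beta(alpha, beta) draws) and a finite truncation level max_layers in place of the infinite process. The function names, the hyperparameters alpha and beta, and the truncation are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_layer_masks(num_neurons, max_layers, alpha=2.0, beta=1.0, seed=0):
    """Draw layer-wise activation probabilities and binary neuron masks.

    Assumption: a truncated stick-breaking construction stands in for the
    beta process; the abstract does not specify the construction.
    """
    rng = np.random.default_rng(seed)
    # nu_l ~ Beta(alpha, beta); pi_l = prod_{j <= l} nu_j decays with depth.
    nu = rng.beta(alpha, beta, size=max_layers)
    pi = np.cumprod(nu)
    # z[m, l] ~ Bernoulli(pi_l): whether neuron m in layer l is active.
    z = (rng.random((num_neurons, max_layers)) < pi).astype(np.int64)
    return pi, z


def effective_depth(z):
    """Index (1-based) of the deepest layer with at least one active neuron."""
    active = z.any(axis=0)
    if not active.any():
        return 0
    return int(np.nonzero(active)[0].max() + 1)


pi, z = sample_layer_masks(num_neurons=8, max_layers=20)
depth = effective_depth(z)
print("activation probabilities:", np.round(pi, 3))
print("effective depth:", depth)
```

Because the activation probabilities shrink multiplicatively with depth, deeper layers are increasingly likely to have all neurons switched off, so a single set of binary masks acts both as a dropout pattern within layers and as a soft cutoff on depth. In this sketch the truncation level only bounds the computation; it is not meant to be a modeling choice from the paper.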
May-30-2025, 01:39:56 GMT
- Country:
- Asia > Middle East
- Jordan (0.04)
- North America > United States (0.14)
- Genre:
- Research Report (0.88)