Regularization Learning Networks: Deep Learning for Tabular Datasets

Ira Shavitt, Eran Segal

Neural Information Processing Systems (NeurIPS 2018)

Despite their impressive performance, Deep Neural Networks (DNNs) typically underperform Gradient Boosting Trees (GBTs) on many tabular-dataset learning tasks. We propose that applying a different regularization coefficient to each weight might boost the performance of DNNs by allowing them to make more use of the more relevant inputs. However, this will lead to an intractable number of hyperparameters. Here, we introduce Regularization Learning Networks (RLNs), which overcome this challenge by introducing an efficient hyperparameter tuning scheme which minimizes a new Counterfactual Loss. Our results show that RLNs significantly improve DNNs on tabular datasets, and achieve comparable results to GBTs, with the best performance achieved with an ensemble that combines GBTs and RLNs. RLNs produce extremely sparse networks, eliminating up to 99.8% of the network edges and 82% of the input features, thus providing more interpretable models and revealing the importance that the network assigns to different inputs.
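
As a rough illustration of the two ideas above (one regularization coefficient per weight, tuned by minimizing the loss that a regularized update step produces on a fresh batch), the following is a minimal sketch in PyTorch. It is a simplified rendering, not the paper's exact scheme: the model is linear, the training and held-out batches are fixed, and names such as log_lam and cf_loss are illustrative.

    import torch

    torch.manual_seed(0)

    # Toy tabular data: 10 features, only the first 3 are informative.
    X_train, X_held = torch.randn(128, 10), torch.randn(128, 10)
    w_true = torch.zeros(10, 1)
    w_true[:3] = 1.0
    y_train = X_train @ w_true + 0.1 * torch.randn(128, 1)
    y_held = X_held @ w_true + 0.1 * torch.randn(128, 1)

    # One L1 coefficient per weight, kept positive by exponentiating a log value.
    w = torch.zeros(10, 1, requires_grad=True)
    log_lam = torch.full((10, 1), -3.0, requires_grad=True)
    lr_w, lr_lam = 0.1, 0.05

    for step in range(200):
        # Loss on the training batch plus a per-weight L1 penalty.
        train_loss = (((X_train @ w - y_train) ** 2).mean()
                      + (log_lam.exp() * w.abs()).sum())

        # Differentiable weight update; create_graph keeps the dependence
        # of the updated weights on log_lam.
        (g_w,) = torch.autograd.grad(train_loss, w, create_graph=True)
        w_next = w - lr_w * g_w

        # "Counterfactual" loss: how the updated weights do on a fresh batch,
        # viewed as a function of the regularization coefficients.
        cf_loss = ((X_held @ w_next - y_held) ** 2).mean()
        (g_lam,) = torch.autograd.grad(cf_loss, log_lam)

        with torch.no_grad():
            log_lam -= lr_lam * g_lam     # tune the coefficients
            w.copy_(w_next.detach())      # commit the weight update

In the paper the same pattern is applied to every weight of a full DNN, with the held-out batch being the next training batch, so that weights on uninformative inputs receive large coefficients and are driven toward zero, producing the sparsity reported above.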