Hey guys, I have written my first blog post on how to plot decision boundaries for classification models. Plotting decision boundaries can help immensely with model selection and hyperparameter tuning, since it can reveal overfitting or underfitting. I noticed that many online tutorials on overfitting rely on 2D datasets (i.e. datasets with only two input features). Such boundaries are easy to plot on a 2D plane, but what if your data contains six input features? One can use dimensionality reduction to plot the actual points, but what about the decision boundaries themselves?
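For the plain 2D case, a minimal sketch of the usual approach is to evaluate the classifier on a dense grid and colour each grid cell by its predicted class. The toy nearest-centroid classifier, the function names, and the synthetic data below are all assumptions made for illustration; none of them come from the blog post.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Toy classifier (illustrative only): one centroid per class."""
    labels = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def nearest_centroid_predict(model, X):
    labels, centroids = model
    # Squared distance from every point to every centroid, shape (n, k)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[d2.argmin(axis=1)]

def decision_grid(predict, X, steps=50, pad=0.5):
    """Evaluate `predict` on a regular grid covering the 2D data."""
    x_min, y_min = X.min(axis=0) - pad
    x_max, y_max = X.max(axis=0) + pad
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, steps),
                         np.linspace(y_min, y_max, steps))
    grid = np.column_stack([xx.ravel(), yy.ravel()])
    zz = predict(grid).reshape(xx.shape)
    return xx, yy, zz

# Synthetic two-class data: two well-separated Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(+2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

model = nearest_centroid_fit(X, y)
xx, yy, zz = decision_grid(lambda G: nearest_centroid_predict(model, G), X)
```

The resulting `xx`, `yy`, `zz` arrays can be handed to a filled-contour routine such as matplotlib's `contourf(xx, yy, zz)` and overlaid with a scatter plot of `X`. This only covers two input features; it does not answer the six-feature question raised above.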

Yi, Kexin, Doshi-Velez, Finale

We propose a new framework for Hamiltonian Monte Carlo (HMC) on truncated probability distributions with smooth underlying density functions. Traditional HMC requires computing the gradient of the potential function associated with the target distribution, and therefore does not perform at full strength on truncated distributions, which lack continuity and differentiability. In our framework, we introduce a sharp sigmoid factor in the density function to approximate the probability drop at the truncation boundary. The target potential function is approximated by a new potential that extends smoothly to the entire sample space. HMC is then performed on the approximate potential. Our method is easy to implement, applies to a wide range of problems, and achieves computational efficiency comparable to baseline methods on various sampling tasks. RBHMC also gives rise to a new approach for Bayesian inference on constrained spaces.
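To make the sigmoid idea concrete, here is a rough sketch for a one-dimensional standard normal truncated to x ≥ 0. It assumes the smoothed density is the untruncated density multiplied by sigmoid(k·x), so the smoothed potential is x²/2 − log sigmoid(k·x). The sharpness parameter `k`, the function names, and this exact functional form are illustrative assumptions and may differ from the construction in the paper.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def smoothed_potential(x, k=50.0):
    # U(x) = x^2/2 - log sigmoid(k x) = x^2/2 + softplus(-k x)
    # (assumed form: Gaussian potential plus a sharp sigmoid penalty)
    return 0.5 * x * x + softplus(-k * x)

def smoothed_grad(x, k=50.0):
    # d/dx softplus(-k x) = -k * sigmoid(-k x)
    return x - k * sigmoid(-k * x)
```

Well inside the allowed region the extra term is negligible, so the potential matches the Gaussian one; outside it, the potential grows roughly linearly with slope `k`, which pushes trajectories back. Because the gradient is defined everywhere, a standard leapfrog integrator can use `smoothed_grad` without special handling at the boundary.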

The analysis is borrowed from the regression setting and aims to decompose the prediction error of a given classifier into bias and variance (B&V) terms in order to evaluate their effects on performance. It can therefore help answer questions such as "How can we compare the accuracy of two different types of classifiers?" and "What is it that makes stronger classifiers perform well? Is it the reduction in bias they bring about, or in variance, or both?". Beyond being theoretically interesting, the answers to these questions are also meant to inform better classifier design strategies that improve prediction performance. After the initial decomposition of the prediction error into the standard B&V terms in the regression setting by [1], different studies have attempted to carry this analysis over to the classification setting while preserving the meanings of the terms and the additive property of the decomposition.
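For reference, the standard squared-loss decomposition in the regression setting can be written as follows. The notation is an assumption made here for illustration: $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the predictor fitted on a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

The three terms simply add up, which is the additive property that the classification-setting studies try to preserve.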

Guénais, Théo, Vamvourellis, Dimitris, Yacoby, Yaniv, Doshi-Velez, Finale, Pan, Weiwei

Traditional training of deep classifiers yields overconfident models that are not reliable under dataset shift. We propose a Bayesian framework to obtain reliable uncertainty estimates for deep classifiers. Our approach consists of a plug-in "generator" used to augment the data with an additional class of points that lie on the boundary of the training data, followed by Bayesian inference on top of features that are trained to distinguish these "out-of-distribution" points.

Petersen, Philipp, Voigtlaender, Felix

We study the problem of learning classification functions from noiseless training samples, under the assumption that the decision boundary is of a certain regularity. We establish universal lower bounds for this estimation problem, for general classes of continuous decision boundaries. For the class of locally Barron-regular decision boundaries, we find that the optimal estimation rates are essentially independent of the underlying dimension and can be realized by empirical risk minimization methods over a suitable class of deep neural networks. These results are based on novel estimates of the $L^1$ and $L^\infty$ entropies of the class of Barron-regular functions.