Understanding the Under-Coverage Bias in Uncertainty Estimation

Neural Information Processing Systems 

Estimating the data uncertainty in regression tasks is often done by learning a quantile function or a prediction interval of the true label conditioned on the input. It is frequently observed that quantile regression---a vanilla algorithm for learning quantiles with asymptotic guarantees---tends to *under-cover*, i.e., to achieve less than the desired coverage level, in practice. While various fixes have been proposed, a more fundamental understanding of why this under-coverage bias happens in the first place remains elusive.

In this paper, we present a rigorous theoretical study of the coverage of uncertainty estimation algorithms in learning quantiles. We prove that quantile regression suffers from an inherent under-coverage bias, in a vanilla setting where we learn a realizable linear quantile function and there is more data than parameters. More quantitatively, for $\alpha > 0.5$ and small $d/n$, the $\alpha$-quantile learned by quantile regression roughly achieves coverage $\alpha - (\alpha - 1/2)\cdot d/n$ regardless of the noise distribution, where $d$ is the input dimension and $n$ is the number of training data.
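For concreteness, at $\alpha = 0.9$, $d = 20$, and $n = 400$, the formula predicts coverage of roughly $0.9 - 0.4 \cdot 20/400 = 0.88$ rather than the nominal $0.9$. Below is a minimal simulation sketch (not from the paper) that checks this numerically: it fits a linear $\alpha$-quantile by pinball-loss quantile regression on synthetic data and compares the test coverage to the $\alpha - (\alpha - 1/2)\cdot d/n$ prediction. The Gaussian noise, the specific values of $\alpha$, $d$, and $n$, and the use of scikit-learn's `QuantileRegressor` are all illustrative assumptions.

```python
# Minimal sketch: empirical coverage of linear quantile regression vs. the
# alpha - (alpha - 1/2) * d/n prediction. All constants below are illustrative.
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(0)
alpha, d, n, n_test, n_trials = 0.9, 20, 400, 50_000, 20

coverages = []
for _ in range(n_trials):
    beta = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = X @ beta + rng.normal(size=n)  # realizable: true alpha-quantile is linear in x

    # Pinball-loss quantile regression; alpha=0.0 here is the (disabled) L1
    # regularization strength, distinct from the quantile level.
    model = QuantileRegressor(quantile=alpha, alpha=0.0, solver="highs")
    model.fit(X, y)

    # Coverage = fraction of fresh labels falling at or below the learned quantile.
    X_test = rng.normal(size=(n_test, d))
    y_test = X_test @ beta + rng.normal(size=n_test)
    coverages.append(np.mean(y_test <= model.predict(X_test)))

print(f"mean empirical coverage: {np.mean(coverages):.4f}")
print(f"theory alpha - (alpha - 1/2)*d/n: {alpha - (alpha - 0.5) * d / n:.4f}")
```

Note that the linear model here is correctly specified, so any gap below the nominal $0.9$ reflects the inherent finite-sample bias of quantile regression rather than model misspecification.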