Asymptotic normality and confidence intervals for derivatives of 2-layers neural network in the random features model
Neural Information Processing Systems
This paper studies two-layer neural networks (NNs) in which the first layer has random weights and the second layer is trained with ridge regularization. This model has been the focus of numerous recent works showing that, despite its simplicity, it captures some of the empirically observed behaviors of NNs in the overparametrized regime, such as the double-descent curve, where the generalization error decreases as the number of weights grows to $\infty$. This paper establishes asymptotic distribution results for this two-layer NN model in the regime where the ratios $\frac{p}{n}$ and $\frac{d}{n}$ have finite limits, where $n$ is the sample size, $p$ the ambient dimension, and $d$ the width of the first layer. We show that a weighted average of the derivatives of the trained NN at the observed data points is asymptotically normal, in a setting with Lipschitz activation functions and a linear regression response with Gaussian features under possibly non-linear perturbations. We then leverage this asymptotic normality result to construct confidence intervals (CIs) for single components of the unknown regression vector.
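To make the setup concrete, here is a minimal sketch, not the paper's estimator or its variance formula, of the random features model described in the abstract: Gaussian features with a linear response, a random frozen first layer, a ridge-trained second layer, and the gradients of the trained network at the observed data. All names and numerical choices (n, p, d, the tanh activation, the weight scaling, the penalty lam, the uniform averaging weights) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 500, 100, 200            # sample size, ambient dimension, first-layer width

# Linear regression response with Gaussian features (noise level is illustrative).
beta = rng.normal(size=p) / np.sqrt(p)    # unknown regression vector
X = rng.normal(size=(n, p))
y = X @ beta + 0.5 * rng.normal(size=n)

# Two-layer NN in the random features model: the first layer W is random and
# frozen; only the second layer a_hat is trained, via ridge regression.
W = rng.normal(size=(p, d)) / np.sqrt(p)  # assumed scaling, not from the paper
F = np.tanh(X @ W)                        # hidden features, tanh is Lipschitz
lam = 1.0                                 # ridge penalty (illustrative)
a_hat = np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ y)

# Gradient of the trained network f(x) = tanh(x @ W) @ a_hat at each observed
# data point: grad f(x_i) = W @ (tanh'(x_i @ W) * a_hat).
def act_prime(z):
    return 1.0 - np.tanh(z) ** 2

grads = (act_prime(X @ W) * a_hat) @ W.T  # shape (n, p)

# A weighted average of the derivatives at the observed data; uniform weights
# are a placeholder for the paper's weighting.
avg_grad = grads.mean(axis=0)             # shape (p,)

# Generic normal-based CI template: given an asymptotically normal estimate
# theta_hat of a single component of beta with standard error se (both derived
# in the paper; placeholders here), a 95% CI would be
#   (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
```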