Gradient Descent
Appendix to: Training Uncertainty-Aware Classifiers with Conformalized Deep Learning. Bat-Sheva Einbinder, Yaniv Romano, Matteo Sesia, Yanfei Zhou. A1 Additional methodological details
Authors listed in alphabetical order. Figure A1: Schematic of the proposed uncertainty-aware deep classification learning algorithm. This procedure is summarized in Algorithm A1, which is a more technical version of Algorithm 1. This section explains the implementation of the hybrid benchmark method applied in Section 4. This benchmark is based on a loss function designed to incentivize the trained model to produce the smallest possible conformal prediction sets with the desired coverage (e.g., 90% if …). To facilitate the exposition of our analysis, we begin by introducing some helpful notations. The first part of the proof is standard and proceeds as follows. A3.1 Details about experiments with synthetic data. The conditional data-generating distribution of Y given X is given by: P[Y | X] = … Our method (resp., the hybrid method) is applied using … The hybrid loss model is trained via stochastic gradient descent for 4000 epochs with learning rate 0.01, decreased by a factor of 10 halfway through training.
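The training schedule quoted above (4000 epochs, learning rate 0.01, reduced by a factor of 10 halfway through) can be written as a simple step-decay rule. The sketch below is illustrative only: the function name `step_decay_lr` and the choice to switch exactly at epoch `total_epochs // 2` are assumptions, not details taken from the appendix.

```python
def step_decay_lr(epoch, total_epochs=4000, base_lr=0.01, factor=10.0):
    """Learning rate for a 0-indexed epoch under a single halfway step decay.

    Assumption: the drop happens at epoch total_epochs // 2; the source only
    says "halfway through training".
    """
    if epoch < total_epochs // 2:
        return base_lr
    return base_lr / factor


# Learning rate at the start, just before the drop, and just after it.
schedule = [step_decay_lr(e) for e in (0, 1999, 2000, 3999)]
```

In a training loop, the returned value would be assigned to the optimizer's learning rate at the start of each epoch.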
Convergence and Alignment of Gradient Descent with Random Backpropagation Weights. Ganlin Song, Ruitu Xu, John Lafferty. Department of Statistics and Data Science
Stochastic gradient descent with backpropagation is the workhorse of artificial neural networks. It has long been recognized that backpropagation fails to be a biologically plausible algorithm. Fundamentally, it is a non-local procedure: updating one neuron's synaptic weights requires knowledge of the synaptic weights or receptive fields of downstream neurons. This limits the use of artificial neural networks as a tool for understanding the biological principles of information processing in the brain. Lillicrap et al. (2016) propose a more biologically plausible "feedback alignment" algorithm that uses random and fixed backpropagation weights, and show promising simulations. In this paper we study the mathematical properties of the feedback alignment procedure by analyzing convergence and alignment for two-layer networks under squared error loss. In the overparameterized setting, we prove that the error converges to zero exponentially fast, and also that regularization is necessary in order for the parameters to become aligned with the random backpropagation weights. Simulations are given that are consistent with this analysis and suggest further generalizations. These results contribute to our understanding of how biologically plausible algorithms might carry out weight learning in a manner different from Hebbian learning, with performance that is comparable with the full non-local backpropagation algorithm.
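To make the feedback alignment idea concrete, here is a minimal NumPy sketch for a two-layer network under squared error loss. It is not the authors' code. The ReLU activation, the 1/sqrt(p) output scaling, the Gaussian initialization, the step size, and all names (`train_feedback_alignment`, `W`, `beta`, `b`) are assumptions chosen for illustration. The only change from ordinary backpropagation is in the first-layer update, where a fixed random vector `b` stands in for the second-layer weights `beta`.

```python
import numpy as np


def train_feedback_alignment(X, y, p=1000, lr=0.5, steps=300, seed=0):
    """Two-layer network f(x) = beta^T relu(W x) / sqrt(p), trained with
    feedback alignment (illustrative sketch, assumptions noted above)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((p, d))   # first-layer weights
    beta = rng.standard_normal(p)     # second-layer weights
    b = rng.standard_normal(p)        # fixed random feedback weights
    losses = []
    for _ in range(steps):
        H = X @ W.T                   # (n, p) pre-activations
        A = np.maximum(H, 0.0)        # ReLU activations (assumed activation)
        err = A @ beta / np.sqrt(p) - y
        losses.append(0.5 * float(err @ err))
        # Second layer: the usual gradient of the squared error loss.
        grad_beta = A.T @ err / np.sqrt(p)
        # First layer: backpropagation would use beta here; feedback
        # alignment sends the error back through the fixed random b.
        delta = err[:, None] * (H > 0) * b[None, :]   # (n, p)
        grad_W = delta.T @ X / np.sqrt(p)             # (p, d)
        beta = beta - lr * grad_beta
        W = W - lr * grad_W
    return W, beta, b, losses


# Small overparameterized example: 5 unit-norm inputs, 1000 hidden units.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 50))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.standard_normal(5)
W, beta, b, losses = train_feedback_alignment(X, y)
print(losses[0], losses[-1])
```

In this toy setting the training loss should fall sharply even though the first layer never sees `beta`. The sketch uses no regularization, so it makes no claim about `beta` becoming aligned with `b`.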