Backpropagation
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Switzerland > Bern > Bern (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
An In-depth Study of Stochastic Backpropagation
In particular, we discuss the following: Section 8.1 derives the gradient calculation for attention layers. Section 8.4 investigates insights on the gradient keep-ratios and gradient keep masks. Section 8.6 compares model similarity with and without applying SBP. In Section 3.2, we provide the gradient calculation of linear layers (or PW-Conv) and general convolutional layers for the backward phase of SBP; following Eq. (18), it is likewise an approximated version of the original gradient. The MLP sub-block is equivalent to two PW-Conv or linear layers.
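The gradient keep-ratio and keep-mask idea above can be illustrated for a single linear layer. The sketch below is hypothetical (the function name `linear_backward_sbp` and the row-wise masking scheme are assumptions, not the paper's exact formulation): a random keep mask drops a fraction of the output gradient rows and rescales the survivors, so the weight gradient is an unbiased approximation computed from only a `keep_ratio` fraction of the batch.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_backward_sbp(x, w, grad_out, keep_ratio=0.5):
    """Backward pass of a linear layer with a stochastic gradient keep mask.

    Hypothetical sketch: rows of the output gradient are randomly dropped
    and the survivors rescaled by 1/keep_ratio, so the weight gradient is
    an unbiased estimate computed from a subset of the activations.
    """
    n = grad_out.shape[0]
    mask = rng.random(n) < keep_ratio      # gradient keep mask
    scale = 1.0 / max(keep_ratio, 1e-8)    # rescale kept rows (unbiased)
    g = grad_out[mask] * scale
    grad_w = x[mask].T @ g                 # approximate dL/dW
    grad_x = grad_out @ w.T                # input gradient, unmasked here
    return grad_w, grad_x

x = rng.standard_normal((8, 4))
w = rng.standard_normal((4, 3))
grad_out = rng.standard_normal((8, 3))
grad_w, grad_x = linear_backward_sbp(x, w, grad_out)
print(grad_w.shape, grad_x.shape)
```

With `keep_ratio=1.0` the mask keeps every row and the result reduces to the exact backward pass, which is a convenient sanity check.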
Convergence and Alignment of Gradient Descent with Random Backpropagation Weights Ganlin Song Ruitu Xu John Lafferty Department of Statistics and Data Science
Stochastic gradient descent with backpropagation is the workhorse of artificial neural networks. It has long been recognized that backpropagation fails to be a biologically plausible algorithm. Fundamentally, it is a non-local procedure: updating one neuron's synaptic weights requires knowledge of the synaptic weights or receptive fields of downstream neurons. This limits the use of artificial neural networks as a tool for understanding the biological principles of information processing in the brain. Lillicrap et al. (2016) propose a more biologically plausible "feedback alignment" algorithm that uses random and fixed backpropagation weights, and show promising simulations. In this paper we study the mathematical properties of the feedback alignment procedure by analyzing convergence and alignment for two-layer networks under squared error loss. In the overparameterized setting, we prove that the error converges to zero exponentially fast, and also that regularization is necessary in order for the parameters to become aligned with the random backpropagation weights. Simulations are given that are consistent with this analysis and suggest further generalizations. These results contribute to our understanding of how biologically plausible algorithms might carry out weight learning in a manner different from Hebbian learning, with performance comparable to the full non-local backpropagation algorithm.
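The feedback alignment procedure described in the abstract can be sketched for a two-layer network under squared error. This is a minimal illustration, not the paper's experimental setup: the sizes, learning rate, and toy regression target are all assumptions. The only change from ordinary backpropagation is that the error is propagated through a fixed random matrix `B` instead of the transpose of the forward weights `W2`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-layer regression problem (sizes and data are illustrative).
n_in, n_hidden, n_out, n = 5, 20, 1, 50
X = rng.standard_normal((n, n_in))
y = np.sin(X @ rng.standard_normal((n_in, 1)))

W1 = rng.standard_normal((n_in, n_hidden)) * 0.5
W2 = rng.standard_normal((n_hidden, n_out)) * 0.5
B = rng.standard_normal((n_out, n_hidden)) * 0.5  # fixed random feedback weights

mse0 = float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))  # initial error

lr = 0.01
for _ in range(2000):
    h = np.tanh(X @ W1)
    out = h @ W2
    e = out - y                      # squared-error residual
    # True backprop would propagate e through W2.T; feedback alignment
    # replaces W2.T with the fixed random matrix B.
    dh = (e @ B) * (1 - h ** 2)      # tanh derivative
    W2 -= lr * h.T @ e / n
    W1 -= lr * X.T @ dh / n

mse = float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))
print(mse0, mse)
```

The training error decreases even though the feedback path never sees `W2`, which is the alignment phenomenon the paper analyzes.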
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.42)
Parallel Backpropagation for Shared-Feature Visualization
High-level visual brain regions contain subareas in which neurons appear to respond more strongly to examples of a particular semantic category, like faces or bodies, rather than objects. However, recent work has shown that while this finding holds on average, some out-of-category stimuli also activate neurons in these regions.
- North America > United States (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.42)
Bridging Discrete and Backpropagation: Straight-Through and Beyond Liyuan Liu Chengyu Dong Xiaodong Liu Bin Yu Jianfeng Gao Microsoft Research
Backpropagation, the cornerstone of deep learning, is limited to computing gradients for continuous variables. This limitation poses challenges for problems involving discrete latent variables. To address this issue, we propose a novel approach to approximate the gradient of parameters involved in generating discrete latent variables. First, we examine the widely used Straight-Through (ST) heuristic and demonstrate that it works as a first-order approximation of the gradient. Guided by our findings, we propose ReinMax, which achieves second-order accuracy by integrating Heun's method, a second-order numerical method for solving ODEs. ReinMax does not require Hessian or other second-order derivatives, thus having negligible computation overheads. Extensive experimental results on various tasks demonstrate the superiority of ReinMax over the state of the art.
- Research Report > New Finding (0.48)
- Research Report > Promising Solution (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.61)
Supplementary Material GAIT-prop: A biologically plausible learning rule derived from backpropagation of error
The GAIT-prop and ITP targets are implemented as a weak perturbation of the forward pass. The table below presents the relevant parameters (e.g. the learning rate of the Adam optimiser); parameters shown in bold were chosen and used for all results presented in the main paper. The results report peak and final (end-of-training) accuracy on the training set, organised as 'peak / final'. We find that target propagation often does best when early stopping is implemented to 'catch' this peak, unlike the other two algorithms, whose accuracy is asymptotic. In the main paper, we showed that GAIT-propagation produces networks with final training/test accuracies indistinguishable from those produced by backpropagation of error. Performance of deep multi-layer perceptrons trained by BP and GAIT-prop.
- North America > Canada (0.05)
- Europe > Netherlands > Gelderland > Nijmegen (0.05)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.62)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)