AITopics | Backpropagation


Backpropagation


H∞ Optimality Criteria for LMS and Backpropagation

Neural Information Processing Systems | Apr-6-2023, 18:53:10 GMT

We have recently shown that the widely known LMS algorithm is an H∞ optimal estimator. The H∞ criterion has been introduced, initially in the control theory literature, as a means to ensure robust performance in the face of model uncertainties and lack of statistical information on the exogenous signals. We extend here our analysis to the nonlinear setting often encountered in neural networks, and show that the backpropagation algorithm is locally H∞ optimal. This fact provides a theoretical justification of the widely observed excellent robustness properties of the LMS and backpropagation algorithms. We further discuss some implications of these results.
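
For readers unfamiliar with LMS, the update the abstract refers to can be sketched as follows. This is a minimal illustration of the standard LMS (Widrow-Hoff) rule, not code from the paper; the function name `lms_fit`, the step size, and the toy data are assumptions chosen for the example, and the sketch does not demonstrate the H∞ result itself.

```python
def lms_fit(samples, step_size=0.05, epochs=50):
    """Run the LMS update over (x, d) pairs, where x is a list of floats."""
    n = len(samples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))   # linear prediction
            e = d - y                                  # instantaneous error
            # LMS step: move the weights along the input, scaled by the error
            w = [wi + step_size * e * xi for wi, xi in zip(w, x)]
    return w


# Noiseless toy data generated from the hypothetical target weights [2.0, -1.0]
samples = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([1.0, -1.0], 3.0),
]
weights = lms_fit(samples, step_size=0.1, epochs=200)
```

On this noiseless data the weights approach the generating values; with noisy data the iterates would instead fluctuate around them.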

backpropagation algorithm, H∞ optimality criteria, lms and backpropagation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.96)


Backpropagation Convergence Via Deterministic Nonmonotone Perturbed Minimization

Neural Information Processing Systems | Apr-6-2023, 18:51:27 GMT

Under certain natural assumptions, such as the series of learning rates diverging while the series of their squares converging, it is established that every accumulation point of the online BP iterates is a stationary point of the BP error function. The results presented cover serial and parallel online BP, modified BP with a momentum term, and BP with weight decay.
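
The learning-rate assumption mentioned above can be written out explicitly. The symbol \eta_k for the learning rate at iteration k is notation assumed here for illustration, and the example schedule is a standard one rather than one taken from the paper.

```latex
\sum_{k=1}^{\infty} \eta_k = \infty,
\qquad
\sum_{k=1}^{\infty} \eta_k^{2} < \infty
% Example: \eta_k = 1/k satisfies both conditions, since the harmonic
% series diverges while \sum_{k \ge 1} 1/k^{2} = \pi^{2}/6 is finite.
```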

deterministic nonmonotone perturbed minimization

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.49)


Backpropagation without Multiplication

Neural Information Processing Systems | Apr-6-2023, 18:47:46 GMT

The back propagation algorithm has been modified to work without any multiplications and to tolerate computations [...] Numbers are represented in floating [...] In this way, all the computations can be executed with shift and add operations. An estimate of a circuit implementation shows that a large network can be placed on a single chip, reaching more than 1 billion weight updates [...] A speedup is also obtained on any machine where a multiplication is slower than a shift operation.
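
The general idea of replacing a multiplication with a shift can be illustrated as follows. This is a hedged sketch of power-of-two quantization, not the paper's number format or circuit; the helper names `nearest_power_of_two_exponent` and `shift_multiply` and the rounding rule are assumptions made for the example.

```python
import math


def nearest_power_of_two_exponent(value):
    """Return (sign, exponent) so that sign * 2**exponent approximates value.

    Rounding is done on log2(|value|); this rule is an assumption of the sketch.
    """
    if value == 0:
        return 0, 0
    sign = 1 if value > 0 else -1
    exponent = round(math.log2(abs(value)))
    return sign, exponent


def shift_multiply(weight_fixed, sign, exponent):
    """Multiply an integer fixed-point weight by sign * 2**exponent with shifts only."""
    if sign == 0:
        return 0
    if exponent >= 0:
        shifted = weight_fixed << exponent
    else:
        shifted = weight_fixed >> -exponent
    return shifted if sign > 0 else -shifted


# 0.3 is approximated by 2**-2 = 0.25, so 96 * 0.3 becomes 96 >> 2
sign, exponent = nearest_power_of_two_exponent(0.3)
product = shift_multiply(96, sign, exponent)
```

Once one operand is restricted to a signed power of two, the product needs only a shift and a possible negation, which is the property the abstract relies on.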

backpropagation, bit mantissa, multiplication, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.40)


A Lagrangian Formulation For Optical Backpropagation Training In Kerr-Type Optical Networks

Neural Information Processing Systems | Apr-6-2023, 18:43:52 GMT

A training method based on a form of continuous spatially distributed optical error back-propagation is presented for an all optical network composed of nondiscrete neurons and weighted interconnections. The all optical network is feed-forward and is composed of thin layers of a Kerr-type self focusing/defocusing nonlinear optical material. The training method is derived from a Lagrangian formulation of the constrained minimization of the network error at the output. This leads to a formulation that describes training as a calculation of the distributed error of the optical signal at the output which is then reflected back through the device to assign a spatially distributed error to the internal layers. This error is then used to modify the internal weighting values.

kerr-type optical network, optical backpropagation training, refraction, (10 more...)

Neural Information Processing Systems

Industry: Telecommunications > Networks (0.87)

Technology:

Information Technology > Communications > Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.40)


Learning Many Related Tasks at the Same Time with Backpropagation

Neural Information Processing Systems | Apr-6-2023, 18:33:24 GMT

Hinton [6] proposed that generalization in artificial neural nets should improve if nets learn to represent the domain's underlying regularities. Abu-Mostafa's hints work [1] shows that the outputs of a backprop net can be used as inputs through which domain-specific information can be given to the net. We extend these ideas by showing that a backprop net learning many related tasks at the same time can use these tasks as inductive bias for each other and thus learn better. We identify five mechanisms by which multitask backprop improves generalization and give empirical evidence that multitask backprop generalizes better in real domains.

backpropagation, related task

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.40)


SPERT-II: A Vector Microprocessor System and its Application to Large Problems in Backpropagation Training

Neural Information Processing Systems | Apr-6-2023, 18:21:21 GMT

We report on our development of a high-performance system for neural network and other signal processing applications. We have designed and implemented a vector microprocessor and packaged it as an attached processor for a conventional workstation. The SPERT-II system demonstrates significant speedups over extensively hand-optimized code running on the workstations.

application, backpropagation training, vector microprocessor system, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.50)


Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping

Neural Information Processing Systems | Apr-6-2023, 16:54:15 GMT

The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity generalize well when trained with backprop and early stopping. Experiments suggest two reasons for this: 1) Overfitting can vary significantly in different regions of the model. Excess capacity allows better fit to regions of high non-linearity, and backprop often avoids overfitting the regions of low non-linearity. Big nets pass through stages similar to those learned by smaller nets.
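
Early stopping, as referred to above, can be sketched as a loop that tracks the best validation loss and halts once it stops improving. This is a generic illustration, not the paper's experimental setup; the function name, the `patience` parameter, and the toy validation curve are assumptions made for the example.

```python
def train_with_early_stopping(train_step, validation_loss, max_epochs=100, patience=5):
    """Train until validation loss fails to improve for `patience` epochs in a row.

    train_step(epoch) performs one epoch of training (hypothetical callback).
    validation_loss(epoch) returns the held-out loss after that epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_step(epoch)
        loss = validation_loss(epoch)
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stopped improving
    return best_epoch, best_loss


# Toy validation curve that falls and then rises (made-up numbers)
curve = [1.0, 0.6, 0.4, 0.35, 0.37, 0.41, 0.5, 0.6, 0.7, 0.8]
best_epoch, best_loss = train_with_early_stopping(
    lambda epoch: None,          # no real training in this sketch
    lambda epoch: curve[epoch],
    max_epochs=len(curve),
    patience=3,
)
```

In practice the weights from `best_epoch` would be saved and restored; the sketch only reports which epoch that was.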

backpropagation, conjugate gradient, overfitting, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.40)


Predictive Coding as a Neuromorphic Alternative to Backpropagation: A Critical Evaluation

Zahid, Umais, Guo, Qinghai, Fountas, Zafeirios

arXiv.org Artificial Intelligence | Apr-5-2023

Backpropagation has rapidly become the workhorse credit assignment algorithm for modern deep learning methods. Recently, modified forms of predictive coding (PC), an algorithm with origins in computational neuroscience, have been shown to result in approximately or exactly equal parameter updates to those under backpropagation. Due to this connection, it has been suggested that PC can act as an alternative to backpropagation with desirable properties that may facilitate implementation in neuromorphic systems. Here, we explore these claims using the different contemporary PC variants proposed in the literature. We obtain time complexity bounds for these PC variants which we show are lower-bounded by backpropagation. We also present key properties of these variants that have implications for neurobiological plausibility and their interpretations, particularly from the perspective of standard PC as a variational Bayes algorithm for latent probabilistic models. Our findings shed new light on the connection between the two learning frameworks and suggest that, in its current forms, PC may have more limited potential as a direct replacement of backpropagation than previously envisioned.

artificial intelligence, backpropagation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2304.02658

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)


Back Propagation. Backpropagation is a popular algorithm…

#artificialintelligence | Mar-21-2023, 09:41:09 GMT

Backpropagation is a popular algorithm for training neural networks. In the article's code, X is the input data, y is the corresponding output data, hidden_layer_size is the number of neurons in the hidden layer, learning_rate is the learning rate, and num_iterations is the number of training iterations. The sigmoid() function takes an input value x and returns the sigmoid activation. Its derivative function takes an input value x and returns the derivative of the sigmoid with respect to x.
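
The pieces named in this excerpt can be tied together in a minimal sketch. The article's own code is not reproduced in this listing, so everything below is an assumption: a one-hidden-layer network in pure Python with a squared-error loss, a hypothetical `train` function, a `seed` argument added for repeatability, and toy OR data.

```python
import math
import random


def sigmoid(x):
    """Sigmoid activation: maps x to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid with respect to x."""
    s = sigmoid(x)
    return s * (1.0 - s)


def train(X, y, hidden_layer_size, learning_rate, num_iterations, seed=0):
    """Train a one-hidden-layer net online; return the mean loss per iteration."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(hidden_layer_size)]
    b1 = [0.0] * hidden_layer_size
    W2 = [rng.uniform(-1, 1) for _ in range(hidden_layer_size)]
    b2 = 0.0
    losses = []
    for _ in range(num_iterations):
        total = 0.0
        for x, target in zip(X, y):
            # Forward pass
            z1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
            a1 = [sigmoid(z) for z in z1]
            z2 = sum(w * a for w, a in zip(W2, a1)) + b2
            out = sigmoid(z2)
            total += 0.5 * (out - target) ** 2
            # Backward pass: propagate the error from the output to the hidden layer
            delta2 = (out - target) * sigmoid_derivative(z2)
            delta1 = [delta2 * w * sigmoid_derivative(z) for w, z in zip(W2, z1)]
            # Gradient descent updates
            W2 = [w - learning_rate * delta2 * a for w, a in zip(W2, a1)]
            b2 -= learning_rate * delta2
            W1 = [[w - learning_rate * d * xi for w, xi in zip(row, x)]
                  for row, d in zip(W1, delta1)]
            b1 = [b - learning_rate * d for b, d in zip(b1, delta1)]
        losses.append(total / len(X))
    return losses


# Toy data: logical OR of two inputs
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 1.0, 1.0, 1.0]
losses = train(X, y, hidden_layer_size=3, learning_rate=0.5, num_iterations=500)
```

The hidden-layer deltas are computed before the output weights are updated, so each step uses gradients from a single consistent set of weights.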

backpropagation, neural network, sigmoid function, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.73)


Backpropagation through Combinatorial Algorithms: Identity with Projection Works

Sahoo, Subham Sekhar, Paulus, Anselm, Vlastelica, Marin, Musil, Vít, Kuleshov, Volodymyr, Martius, Georg

arXiv.org Artificial Intelligence | Mar-17-2023

Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined, therefore a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach to exploit the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass and further provide a theoretical justification. Our experiments demonstrate that such a straightforward hyper-parameter-free approach is able to compete with previous more complex methods on numerous experiments such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we substitute the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.
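
The "negative identity on the backward pass" idea can be illustrated with a toy example. This is a hedged sketch only: the solver (a one-hot argmin over costs), the squared-error loss, the learning rate, and all names are assumptions made here, and the projection and regularization steps described in the abstract are omitted.

```python
def solver_forward(costs):
    """Toy discrete solver: one-hot indicator of the minimum-cost item."""
    best = min(range(len(costs)), key=lambda i: costs[i])
    return [1.0 if i == best else 0.0 for i in range(len(costs))]


def solver_backward(grad_output):
    """Negative-identity surrogate: return the incoming gradient with flipped sign.

    The true derivative of solver_forward is zero almost everywhere, so this
    replacement is what lets a learning signal reach the costs.
    """
    return [-g for g in grad_output]


costs = [0.9, 0.2, 0.5]      # the solver initially selects item 1
target = [1.0, 0.0, 0.0]     # we would like it to select item 0
learning_rate = 0.1

for _ in range(10):
    y = solver_forward(costs)
    # Gradient of 0.5 * ||y - target||^2 with respect to the solver output
    grad_y = [yi - ti for yi, ti in zip(y, target)]
    grad_costs = solver_backward(grad_y)
    costs = [c - learning_rate * g for c, g in zip(costs, grad_costs)]

final_selection = solver_forward(costs)
```

The sign flip reflects that raising an item's cost makes a minimizing solver less likely to pick it, so the gradient with respect to the costs points opposite to the gradient with respect to the solution.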

artificial intelligence, combinatorial algorithm, machine learning, (2 more...)

arXiv.org Artificial Intelligence

2205.15213

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.60)
