AITopics | Backpropagation

Backpropagation

Training Deep Gaussian Processes using Stochastic Expectation Propagation and Probabilistic Backpropagation

Bui, Thang D., Hernández-Lobato, José Miguel, Li, Yingzhen, Hernández-Lobato, Daniel, Turner, Richard E.

arXiv.org Machine Learning Nov-11-2015

Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations of Gaussian processes (GPs) and are formally equivalent to neural networks with multiple, infinitely wide hidden layers. DGPs are probabilistic and non-parametric and as such are arguably more flexible, have a greater capacity to generalise, and provide better calibrated uncertainty estimates than alternative deep models. The focus of this paper is scalable approximate Bayesian learning of these networks. The paper develops a novel and efficient extension of probabilistic backpropagation, a state-of-the-art method for training Bayesian neural networks, that can be used to train DGPs. The new method leverages a recently proposed method for scaling Expectation Propagation, called stochastic Expectation Propagation. The method is able to automatically discover useful input warping, expansion or compression, and it is therefore a flexible form of Bayesian kernel design. We demonstrate the success of the new method for supervised learning on several real-world datasets, showing that it typically outperforms GP regression and is never much worse.

artificial intelligence, gaussian process, machine learning, (13 more...)

arXiv.org Machine Learning

1511.03405

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > Massachusetts (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.62)

Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks

Hernández-Lobato, José Miguel, Adams, Ryan P.

arXiv.org Machine Learning Jul-15-2015

Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets show that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.
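As a rough illustration of the "forward propagation of probabilities" step mentioned in this abstract, the sketch below pushes a mean and a variance through a single linear layer whose weights are independent Gaussians. It assumes all inputs and weights are mutually independent and relies only on the standard moment identities for products of independent random variables. The function name `linear_moments` is hypothetical, and the paper's full method (nonlinearities, scaling, and the backward gradient pass) is not reproduced here.

```python
def linear_moments(m_in, v_in, w_mean, w_var):
    """Mean and variance of a_j = sum_i w_ji * z_i for each output unit j.

    m_in, v_in: per-input means and variances (lists of floats).
    w_mean, w_var: one row per output unit, holding weight means and variances.
    Assumption: all inputs and weights are mutually independent.
    """
    means, variances = [], []
    for row_mean, row_var in zip(w_mean, w_var):
        # E[w * z] = E[w] * E[z] for independent w and z.
        mean = sum(M * m for M, m in zip(row_mean, m_in))
        # Var[w * z] = V*v + V*m^2 + M^2*v for independent w and z.
        var = sum(
            V * v + V * m * m + M * M * v
            for M, V, m, v in zip(row_mean, row_var, m_in, v_in)
        )
        means.append(mean)
        variances.append(var)
    return means, variances
```

With a deterministic input (zero input variance), the output variance comes entirely from weight uncertainty, which is the quantity a Bayesian network would carry forward to the next layer.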

approximation, bayesian inference, neural network, (18 more...)

arXiv.org Machine Learning

1502.05336

Country:

North America > United States > Massachusetts (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights

Soudry, Daniel, Hubara, Itay, Meir, Ron

Neural Information Processing Systems Dec-31-2014

Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a “mean-field” factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. (2) The weights can be restricted to have discrete values. This is especially useful for implementing trained MNNs in precision limited hardware chips, thus improving their speed and energy efficiency by several orders of magnitude. We test the EBP algorithm numerically in eight binary text classification tasks. In all tasks, EBP outperforms: (1) standard BP with the optimal constant learning rate (2) previously reported state of the art. Interestingly, EBP-trained MNNs with binary weights usually perform better than MNNs with continuous (real) weights - if we average the MNN output using the inferred posterior.

artificial intelligence, machine learning, mnn, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England (0.46)
North America (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.81)

Missing Value Imputation With Unsupervised Backpropagation

Gashler, Michael S., Smith, Michael R., Morris, Richard, Martinez, Tony

arXiv.org Machine Learning Dec-18-2013

Unfortunately, real-world datasets often include only samples of observed values mixed with many missing or unknown elements. Missing values may occur due to human impatience, human error during data entry, data loss, faulty sensory equipment, changes in data collection methods, inability to decipher handwriting, privacy issues, legal requirements, and a variety of other practical factors. Thus, improvements to methods for imputing missing values can have far-reaching impact on improving the effectiveness of existing learning algorithms for operating on real-world data. We present a method for imputation called Unsupervised Backpropagation (UBP), which trains a multilayer perceptron (MLP) to fit to the manifold represented by the known features in a dataset. We demonstrate this algorithm with the task of imputing missing values, and we show that it is significantly more effective than other methods for imputation. Backpropagation has long been a popular method for training neural networks (Rumelhart et al., 1986; Werbos, 1990).

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1312.5394

Country:

North America > United States > Arkansas > Washington County > Fayetteville (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Utah > Utah County > Provo (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Genre: Research Report (1.00)

Industry:

Law (0.54)
Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.84)

Training a Feed-forward Neural Network with Artificial Bee Colony Based Backpropagation Method

Nandy, Sudarshan, Sarkar, Partha Pratim, Das, Achintya

arXiv.org Artificial Intelligence Sep-12-2012

The back-propagation algorithm is one of the most widely used techniques for optimizing feed-forward neural network training. Nature-inspired meta-heuristic algorithms also provide derivative-free solutions to complex optimization problems. The artificial bee colony algorithm is a nature-inspired meta-heuristic that mimics the foraging (food-source searching) behaviour of bees in a colony, and it has been applied in several applications to improve optimization outcomes. The method proposed in this paper is a back-propagation neural network training method based on an improved artificial bee colony algorithm, aimed at a fast and improved convergence rate for the hybrid learning method. The results are compared with the genetic algorithm based back-propagation method, another hybridized procedure of its kind. The analysis is performed on standard data sets and reflects the efficiency of the proposed method in terms of convergence speed and rate.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.5121/ijcsit.2012.4404

1209.2548

Country:

Asia > India > West Bengal > Kolkata (0.04)
North America > United States > Montana (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Food & Agriculture > Agriculture (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.41)

Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping

Caruana, Rich, Lawrence, Steve, Giles, C. Lee

Neural Information Processing Systems Dec-31-2001

The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity generalize well when trained with backprop and early stopping. Experiments suggest two reasons for this: 1) Overfitting can vary significantly in different regions of the model. Excess capacity allows better fit to regions of high non-linearity, and backprop often avoids overfitting the regions of low non-linearity.
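Early stopping, as referred to in this abstract, can be sketched as picking the epoch with the lowest validation loss and halting once no improvement has been seen for a while. The snippet below is a minimal illustration under assumptions: the `patience` rule and the function name `early_stop` are hypothetical choices, not details taken from the paper, and the validation losses are supplied as a plain list rather than computed from a real network.

```python
import math


def early_stop(val_losses, patience=3):
    """Return (best_epoch, best_loss) under a simple patience rule.

    val_losses: validation loss recorded after each training epoch.
    patience: number of epochs without improvement tolerated before halting
              (an illustrative assumption, not a value from the paper).
    """
    best_epoch, best_loss = 0, math.inf
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best model; in practice its weights would be checkpointed here.
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training.
            break
    return best_epoch, best_loss
```

In a real training loop the weights saved at `best_epoch` would be restored after stopping, so later overfitting epochs do not affect the final model.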

artificial intelligence, generalization, neural network, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Energy > Oil & Gas (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.41)

SPERT-II: A Vector Microprocessor System and its Application to Large Problems in Backpropagation Training

Wawrzynek, John, Asanovic, Krste, Kingsbury, Brian, Beck, James, Johnson, David, Morgan, Nelson

Neural Information Processing Systems Dec-31-1996

We report on our development of a high-performance system for neural network and other signal processing applications. We have designed and implemented a vector microprocessor and packaged it as an attached processor for a conventional workstation.

instruction, operation, workstation, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.42)

Tempering Backpropagation Networks: Not All Weights are Created Equal

Schraudolph, Nicol N., Sejnowski, Terrence J.

Neural Information Processing Systems Dec-31-1996

Backpropagation learning algorithms typically collapse the network's structure into a single vector of weight parameters to be optimized. We suggest that their performance may be improved by utilizing the structural information instead of discarding it, and introduce a framework for "tempering" each weight accordingly. In the tempering model, activation and error signals are treated as approximately independent random variables. The characteristic scale of weight changes is then matched to that of the residuals, allowing structural properties such as a node's fan-in and fan-out to affect the local learning rate and backpropagated error. The model also permits calculation of an upper bound on the global learning rate for batch updates, which in turn leads to different update rules for bias vs. non-bias weights. This approach yields hitherto unparalleled performance on the family relations benchmark, a deep multi-layer network: for both batch learning with momentum and the delta-bar-delta algorithm, convergence at the optimal learning rate is sped up by more than an order of magnitude.
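To make the idea of a structure-dependent local learning rate concrete, the sketch below scales each node's step size by its fan-in. The specific rule used here (global rate divided by fan-in) is an illustrative assumption and is not the formula from the paper, which the abstract does not state; the function name `fan_in_scaled_update` is likewise hypothetical.

```python
def fan_in_scaled_update(weights, grads, global_rate):
    """Gradient step with a per-node learning rate that depends on fan-in.

    weights, grads: one row per node; row length equals that node's fan-in.
    global_rate: shared base learning rate.
    Assumption: local rate = global_rate / fan_in (illustrative only).
    """
    updated = []
    for w_row, g_row in zip(weights, grads):
        fan_in = max(1, len(w_row))
        local_rate = global_rate / fan_in
        updated.append([w - local_rate * g for w, g in zip(w_row, g_row)])
    return updated
```

Under this rule a node with many incoming weights takes smaller per-weight steps than a node with few, so its summed input changes on a comparable scale.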

global learning rate, learning rate, tempering backpropagation network, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Colorado > Denver County > Denver (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(3 more...)

Industry: Education (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.65)
