A Theory of Local Learning, the Learning Channel, and the Optimality of Backpropagation
Baldi, Pierre, Sadowski, Peter
In a physical neural system, where storage and processing are intimately intertwined, the rules for adjusting the synaptic weights can only depend on variables that are available locally, such as the activity of the pre- and post-synaptic neurons, resulting in local learning rules. A systematic framework for studying the space of local learning rules is obtained by first specifying the nature of the local variables, and then the functional form that ties them together into each learning rule. Such a framework also enables the systematic discovery of new learning rules and exploration of relationships between learning rules and group symmetries. We study polynomial local learning rules stratified by their degree and analyze their behavior and capabilities in both linear and non-linear units and networks. Stacking local learning rules in deep feedforward networks leads to deep local learning. While deep local learning can learn interesting representations, it cannot learn complex input-output functions, even when targets are available for the top layer. Learning complex input-output functions requires local deep learning where target information is communicated to the deep layers through a backward learning channel. The nature of the communicated information about the targets and the structure of the learning channel partition the space of learning algorithms. We estimate the learning channel capacity associated with several algorithms and show that backpropagation outperforms them by simultaneously maximizing the information rate and minimizing the computational cost, even in recurrent networks. The theory clarifies the concept of Hebbian learning, establishes the power and limitations of local learning rules, introduces the learning channel which enables a formal analysis of the optimality of backpropagation, and explains the sparsity of the space of learning rules discovered so far.
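As an informal illustration (not code from the paper), the sketch below applies the simplest degree-two polynomial local rule, the classical Hebbian update Δw ∝ (post-synaptic activity) × (pre-synaptic activity), to a single layer of linear units; all names and sizes are placeholders.

    import numpy as np

    def hebbian_update(W, pre, post, lr=0.01):
        # Degree-two polynomial local rule: each weight change depends only on
        # its own pre-synaptic activity pre[j] and post-synaptic activity post[i].
        return W + lr * np.outer(post, pre)

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(3, 5))        # 5 inputs feeding 3 linear units
    for _ in range(100):
        x = rng.normal(size=5)                    # pre-synaptic activities
        y = W @ x                                 # post-synaptic activities
        W = hebbian_update(W, x, y)               # purely local weight update

Using target values in place of y at the top layer would give the kind of supervised but still purely local rule that the abstract contrasts with local deep learning, where target information must travel back through a learning channel.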
Learning Activation Functions to Improve Deep Neural Networks
Agostinelli, Forest, Hoffman, Matthew, Sadowski, Peter, Baldi, Pierre
Artificial neural networks typically have a fixed, non-linear activation function at each neuron. We have designed a novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent. With this adaptive activation function, we are able to improve upon deep neural network architectures composed of static rectified linear units, achieving state-of-the-art performance on CIFAR-10 (7.51% error), CIFAR-100 (30.83% error), and a benchmark from high-energy physics involving Higgs boson decay modes.
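A minimal sketch (assuming PyTorch; parameter names and initialization are illustrative, not taken from the paper) of one way to write such a per-neuron learnable piecewise linear activation: a rectified linear term plus a sum of learnable hinge functions, trained by gradient descent along with the rest of the network.

    import torch
    import torch.nn as nn

    class AdaptivePiecewiseLinear(nn.Module):
        # h(x) = max(0, x) + sum_s a_s * max(0, -x + b_s), with a_s and b_s
        # learned separately for every neuron.
        def __init__(self, num_neurons, num_hinges=2):
            super().__init__()
            self.a = nn.Parameter(0.01 * torch.randn(num_hinges, num_neurons))
            self.b = nn.Parameter(torch.zeros(num_hinges, num_neurons))

        def forward(self, x):                     # x: (batch, num_neurons)
            out = torch.relu(x)
            for s in range(self.a.shape[0]):
                out = out + self.a[s] * torch.relu(-x + self.b[s])
            return out

    # Drop-in replacement for a fixed ReLU layer:
    act = AdaptivePiecewiseLinear(num_neurons=128)
    y = act(torch.randn(32, 128))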
Searching for Higgs Boson Decay Modes with Deep Learning
Sadowski, Peter J., Whiteson, Daniel, Baldi, Pierre
Particle colliders enable us to probe the fundamental nature of matter by observing exotic particles produced by high-energy collisions. Because the experimental measurements from these collisions are necessarily incomplete and imprecise, machine learning algorithms play a major role in the analysis of experimental data. The high-energy physics community typically relies on standardized machine learning software packages for this analysis, and devotes substantial effort towards improving statistical power by hand crafting high-level features derived from the raw collider measurements. In this paper, we train artificial neural networks to detect the decay of the Higgs boson to tau leptons on a dataset of 82 million simulated collision events. We demonstrate that deep neural network architectures are particularly well-suited for this task with the ability to automatically discover high-level features from the data and increase discovery significance.
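The architecture below is purely illustrative (layer sizes, feature count, and activation choices are placeholders, not the configuration used in the paper): a deep feedforward classifier mapping low-level collider measurements for one event to a signal-versus-background probability.

    import torch.nn as nn

    model = nn.Sequential(                        # hypothetical sizes
        nn.Linear(25, 300), nn.Tanh(),
        nn.Linear(300, 300), nn.Tanh(),
        nn.Linear(300, 300), nn.Tanh(),
        nn.Linear(300, 1), nn.Sigmoid(),          # P(signal | event features)
    )
    loss_fn = nn.BCELoss()                        # trained on simulated collision events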
Understanding Dropout
Baldi, Pierre, Sadowski, Peter J.
Dropout is a relatively new algorithm for training neural networks which relies on stochastically "dropping out" neurons during training in order to avoid the co-adaptation of feature detectors. We introduce a general formalism for studying dropout on either units or connections, with arbitrary probability values, and use it to analyze the averaging and regularizing properties of dropout in both linear and non-linear networks. For deep neural networks, the averaging properties of dropout are characterized by three recursive equations, including the approximation of expectations by normalized weighted geometric means. We provide estimates and bounds for these approximations and corroborate the results with simulations. We also show in simple cases how dropout performs stochastic gradient descent on a regularized error function.
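The following numerical check (illustrative, not the paper's derivation) compares the true dropout ensemble average of a single logistic unit with its normalized weighted geometric mean approximation, which for a logistic unit amounts to applying the sigmoid to the expected input, i.e. scaling the inputs by the retention probability.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    w, x, p = rng.normal(size=10), rng.normal(size=10), 0.5   # p = retention prob.

    masks = rng.random((100_000, 10)) < p         # random dropout configurations
    ensemble_mean = sigmoid((masks * x) @ w).mean()

    nwgm = sigmoid(p * (x @ w))                   # NWGM approximation of the mean
    print(ensemble_mean, nwgm)                    # the two values are typically close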
Prediction of Protein Topologies Using Generalized IOHMMs and RNNs
Pollastri, Gianluca, Baldi, Pierre, Vullo, Alessandro, Frasconi, Paolo
We develop and test new machine learning methods for the prediction of topological representations of protein structures in the form of coarse- or fine-grained contact or distance maps that are translation and rotation invariant. The methods are based on generalized input-output hidden Markov models (GIOHMMs) and generalized recursive neural networks (GRNNs). The methods are used to predict topology directly in the fine-grained case and, in the coarse-grained case, indirectly by first learning how to score candidate graphs and then using the scoring function to search the space of possible configurations. Computer simulations show that the predictors achieve state-of-the-art performance.
Universal Approximation and Learning of Trajectories Using Oscillators
Baldi, Pierre, Hornik, Kurt
Natural and artificial neural circuits must be capable of traversing specific state space trajectories. A natural approach to this problem is to learn the relevant trajectories from examples. Unfortunately, gradient descent learning of complex trajectories in amorphous networks is unsuccessful. We suggest a possible approach where trajectories are realized by combining simple oscillators, in various modular ways. We contrast two regimes of fast and slow oscillations. In all cases, we show that banks of oscillators with bounded frequencies have universal approximation properties. Open questions are also discussed briefly.
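As a toy illustration of the approximation idea (not the paper's construction), the snippet below realizes a discontinuous target trajectory with a small bank of fixed, bounded-frequency oscillators by fitting their amplitudes with least squares.

    import numpy as np

    t = np.linspace(0.0, 2 * np.pi, 500)
    target = np.sign(np.sin(3 * t))               # arbitrary target trajectory

    freqs = np.arange(1, 16)                      # bounded set of oscillator frequencies
    basis = np.column_stack(
        [np.sin(f * t) for f in freqs] + [np.cos(f * t) for f in freqs]
    )
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    approx = basis @ coeffs                       # trajectory produced by the oscillator bank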
Inferring Ground Truth from Subjective Labelling of Venus Images
Smyth, Padhraic, Fayyad, Usama M., Burl, Michael C., Perona, Pietro, Baldi, Pierre
In practical situations, experts may visually examine the images and provide a subjective noisy estimate of the truth. Calibrating the reliability and bias of expert labellers is a nontrivial problem. In this paper we discuss some of our recent work on this topic in the context of detecting small volcanoes in Magellan SAR images of Venus. Empirical results (using the Expectation-Maximization procedure) suggest that accounting for subjective noise can be quite significant in terms of quantifying both human and algorithm detection performance.
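A minimal sketch of the general idea, EM-style aggregation of several subjective binary labels with per-labeller accuracies (a simplified stand-in, not the model used in the paper):

    import numpy as np

    def em_ground_truth(labels, n_iter=50):
        # labels: (n_items, n_labellers) matrix of 0/1 subjective labels.
        # Returns posterior P(true label = 1) per item and an accuracy per labeller,
        # assuming a uniform prior and symmetric labeller error rates.
        n_items, _ = labels.shape
        q = labels.mean(axis=1)                   # initial soft ground truth
        for _ in range(n_iter):
            acc = (q @ labels + (1 - q) @ (1 - labels)) / n_items   # M-step
            acc = np.clip(acc, 1e-3, 1 - 1e-3)
            log_odds = (labels * np.log(acc / (1 - acc))            # E-step
                        + (1 - labels) * np.log((1 - acc) / acc)).sum(axis=1)
            q = 1.0 / (1.0 + np.exp(-log_odds))
        return q, acc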