

Comparing Biases for Minimal Network Construction with Back-Propagation

Neural Information Processing Systems

This approach can be used to dynamically select the number of hidden units. The method Rumelhart suggests involves adding penalty terms to the usual error function. In this paper we introduce Rumelhart's minimal networks idea and compare two possible biases on the weight search space. These biases are compared on both simple counting problems and a speech recognition problem.
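As a rough illustration of what "adding penalty terms to the usual error function" means, the sketch below contrasts two hypothetical biases on the weight space: plain quadratic weight decay, and a saturating penalty that mostly punishes small weights. The specific functional forms and the `lam` hyperparameter are illustrative assumptions, not the exact biases compared in the paper.

```python
import numpy as np

def quadratic_penalty(w, lam=0.01):
    # Standard weight decay: cost grows without bound in |w|,
    # so it shrinks every weight, large or small.
    return lam * np.sum(w ** 2), 2 * lam * w  # (cost, gradient)

def saturating_penalty(w, lam=0.01):
    # A saturating bias, w^2 / (1 + w^2): all large weights cost
    # roughly the same, so small weights are driven toward zero
    # while large, useful weights are left mostly alone.
    cost = lam * np.sum(w ** 2 / (1.0 + w ** 2))
    grad = lam * 2 * w / (1.0 + w ** 2) ** 2
    return cost, grad

# During training, the chosen penalty gradient is added to the
# back-propagated error gradient before each weight update:
#   w -= eta * (dE_dw + penalty_grad)
```

Because the penalty enters the update only through its gradient, swapping one bias for another changes which weights decay toward zero, and hence which hidden units can be pruned away.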


AI: The pattern is not in the data, it's in the machine

#artificialintelligence

A neural network transforms input, the circles on the left, to output, on the right. How that happens is a transformation of weights, center, which we often confuse for patterns in the data itself. It's a commonplace of artificial intelligence to say that machine learning, which depends on vast amounts of data, functions by finding patterns in data. The phrase, "finding patterns in data," in fact, has been a staple phrase of things such as data mining and knowledge discovery for years now, and it has been assumed that machine learning, and its deep learning variant especially, are just continuing the tradition of finding such patterns. AI programs do, indeed, result in patterns, but, just as "The fault, dear Brutus, lies not in our stars but in ourselves," the fact of those patterns is not something in the data, it is what the AI program makes of the data.




Backpropagation

#artificialintelligence

In machine learning, backpropagation (backprop,[1] BP) is a widely used algorithm for training feedforward neural networks. Generalizations of backpropagation exist for other artificial neural networks (ANNs), and for functions generally. These classes of algorithms are all referred to generically as "backpropagation".[2] In fitting a neural network, backpropagation computes the gradient of the loss function with respect to the weights of the network for a single input–output example, and does so efficiently, unlike a naive direct computation of the gradient with respect to each weight individually. This efficiency makes it feasible to use gradient methods for training multilayer networks, updating weights to minimize loss; gradient descent, or variants such as stochastic gradient descent, are commonly used. The backpropagation algorithm works by computing the gradient of the loss function with respect to each weight by the chain rule, computing the gradient one layer at a time, iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule; this is an example of dynamic programming.[3]
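A minimal sketch (not from the article) of the backward pass applying the chain rule one layer at a time, with the result checked against a numerical gradient. The two-layer network, sigmoid activation, and squared loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer network: x -> sigmoid(W1 x) -> W2 h, squared loss.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    W1, W2 = params
    h = sigmoid(W1 @ x)   # hidden activations
    y = W2 @ h            # linear output
    return h, y

def backprop(params, x, t):
    """Gradient of L = 0.5 * ||y - t||^2 w.r.t. W1 and W2, layer by layer."""
    W1, W2 = params
    h, y = forward(params, x)
    # Backward pass: each layer reuses the previous layer's delta
    # instead of recomputing chain-rule products from scratch
    # (this reuse is the dynamic-programming step).
    delta_out = y - t                              # dL/dy
    dW2 = np.outer(delta_out, h)
    delta_hid = (W2.T @ delta_out) * h * (1 - h)   # dL/d(pre-activation)
    dW1 = np.outer(delta_hid, x)
    return dW1, dW2

# Sanity check: compare one analytic entry with a finite difference.
W1 = rng.normal(size=(3, 2)); W2 = rng.normal(size=(1, 3))
x = np.array([0.5, -1.0]); t = np.array([0.25])
dW1, dW2 = backprop((W1, W2), x, t)

loss = lambda p: 0.5 * np.sum((forward(p, x)[1] - t) ** 2)
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
numeric = (loss((W1p, W2)) - loss((W1, W2))) / eps
```

The finite-difference value should agree with `dW1[0, 0]` to several decimal places, which is exactly the efficiency claim above: one backward sweep yields every weight's gradient, where naive differencing would need one extra forward pass per weight.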


From classic AI techniques to Deep Reinforcement Learning

@machinelearnbot

Building machines that can learn from examples, from experience, or even from other machines at a human level is the main goal of solving AI. That goal, in other words, is to create a machine that passes the Turing test: when a human interacts with it, the human cannot tell whether they are interacting with a human or a machine [Turing, A.M. 1950]. The fundamental algorithms of deep learning were developed in the middle of the 20th century. Since then the field developed as a theoretical branch of stochastic operations research and computer science, but without any breakthrough application. In the last 20 years, however, the synergy between big data sets, especially labeled data, and the growth of computing power from graphics processing units has turned those algorithms into more complex techniques, technologies and reasoning logics able to achieve several goals, such as reducing word error rates in speech recognition, cutting the error rate in an image recognition competition [Krizhevsky et al. 2012], and beating a human champion at Go [Silver et al. 2016].


David Rumelhart Dies at 68; Created Computer Simulations of Perception

AITopics Original Links

David E. Rumelhart, whose computer simulations of perception gave scientists some of the first testable models of neural processing and proved helpful in the development of machine learning and artificial intelligence, died Sunday in Chelsea, Mich. The cause was complications of Pick's disease, an Alzheimer's-like disorder from which he had suffered for more than a decade, his son Karl said. When Dr. Rumelhart, a psychologist, began thinking in the 1960s about how neurons process information, the field was split into two camps that had little common language: biologists, who focused on neurons and brain tissue; and cognitive psychologists, who studied far more abstract processes, like reasoning skills and learning strategies. By starting small -- showing, for instance, that the brain's ability to recognize a single letter was greatly influenced by the letters around it -- Dr. Rumelhart and his colleague Jay McClelland, around 1980, built computer programs that roughly simulated perception. Later, he devised an algorithm that allowed computer programs to learn how to perceive.




Backpropagation Convergence Via Deterministic Nonmonotone Perturbed Minimization

Mangasarian, O. L., Solodov, M. V.

Neural Information Processing Systems

The fundamental backpropagation (BP) algorithm for training artificial neural networks is cast as a deterministic nonmonotone perturbed gradient method. Under certain natural assumptions, such as the series of learning rates diverging while the series of their squares converges, it is established that every accumulation point of the online BP iterates is a stationary point of the BP error function. The results presented cover serial and parallel online BP, modified BP with a momentum term, and BP with weight decay.
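The learning-rate assumption mentioned in the abstract is the classical Robbins–Monro condition; writing the step size at iteration k as η_k (notation assumed here, not taken from the paper), it reads:

```latex
\sum_{k=0}^{\infty} \eta_k = \infty
\qquad \text{and} \qquad
\sum_{k=0}^{\infty} \eta_k^2 < \infty .
```

A standard choice satisfying both conditions is η_k = 1/(k+1): the steps are large enough in total to reach any stationary point, yet shrink fast enough for the perturbations to average out.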

