AITopics | Gradient Descent

We present and compare learning rate schedules for stochastic gradient descent, a general algorithm which includes LMS, online backpropagation and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.

algorithm, exemplar, learning rate schedule, (10 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.38)

Learning by Combining Memorization and Gradient Descent

Platt, John C.

Neural Information Processing Systems, Dec-31-1991

We have created a radial basis function network that allocates a new computational unit whenever an unusual pattern is presented to the network. The network learns by allocating new units and adjusting the parameters of existing units. If the network performs poorly on a presented pattern, then a new unit is allocated which memorizes the response to the presented pattern. If the network performs well on a presented pattern, then the network parameters are updated using standard LMS gradient descent. For predicting the Mackey Glass chaotic time series, our network learns much faster than do those using back-propagation and uses a comparable number of synapses.
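
The allocate-or-adapt rule described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than something taken from the paper: the class name TinyAllocatingRBF, scalar inputs, Gaussian units of fixed width, a single error threshold as the test for "performs poorly", and LMS updates applied to the output weights only.

```python
import math


class TinyAllocatingRBF:
    """Hypothetical 1-D RBF network that either memorizes or adapts."""

    def __init__(self, error_threshold=0.1, width=1.0, lr=0.05):
        self.error_threshold = error_threshold  # assumed "poor performance" cutoff
        self.width = width                      # assumed fixed Gaussian width
        self.lr = lr                            # LMS step size
        self.centers = []
        self.weights = []

    def _activation(self, x, center):
        return math.exp(-((x - center) ** 2) / self.width ** 2)

    def predict(self, x):
        return sum(w * self._activation(x, c)
                   for c, w in zip(self.centers, self.weights))

    def observe(self, x, y):
        err = y - self.predict(x)
        if abs(err) > self.error_threshold:
            # Poor performance: allocate a unit centered on x whose weight
            # equals the residual, so the network now reproduces y at x.
            self.centers.append(x)
            self.weights.append(err)
        else:
            # Good performance: LMS gradient step on the output weights.
            for k, c in enumerate(self.centers):
                self.weights[k] += self.lr * err * self._activation(x, c)


net = TinyAllocatingRBF()
net.observe(0.0, 1.0)    # large error, so a new unit is allocated
net.observe(0.0, 1.05)   # small error, so existing weights are adjusted
print(len(net.centers), net.predict(0.0))
```

The sketch adapts only the output weights and uses one allocation test; it is meant to show the control flow of memorizing versus gradient descent, not to reproduce the network evaluated in the paper.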

compact representation, gradient descent, representation, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.77)

Note on Learning Rate Schedules for Stochastic Optimization

Darken, Christian, Moody, John E.

Neural Information Processing Systems, Dec-31-1991

We present and compare learning rate schedules for stochastic gradient descent, a general algorithm which includes LMS, online backpropagation and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.
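
The three kinds of schedule named in this abstract can be compared with a short Python sketch. The abstract does not give formulas, so the forms below are assumptions: the search-then-converge rate is written in one commonly cited form, eta0 / (1 + t / tau), and all function names and parameters (eta0, tau, sgd_mean) are hypothetical.

```python
def constant_rate(eta0, t):
    """Constant schedule: the same step size at every step."""
    return eta0


def running_average_rate(eta0, t):
    """1/t schedule; t is counted from 1."""
    return eta0 / t


def search_then_converge_rate(eta0, t, tau):
    """Assumed form: roughly eta0 while t << tau, roughly eta0 * tau / t after."""
    return eta0 / (1.0 + t / tau)


def sgd_mean(samples, rate):
    """Estimate a mean by SGD on 0.5 * (w - x)**2, one sample per step."""
    w = 0.0
    for t, x in enumerate(samples, start=1):
        w -= rate(t) * (w - x)  # gradient of the per-sample loss is (w - x)
    return w


data = [1.0, 2.0, 3.0, 4.0]
# With eta0 = 1 the 1/t schedule reproduces the running average of the data.
print(sgd_mean(data, lambda t: running_average_rate(1.0, t)))
print(sgd_mean(data, lambda t: search_then_converge_rate(0.5, t, tau=10.0)))
```

With eta0 = 1, the 1/t update is algebraically the incremental sample mean, which is where the "running average" name comes from.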

artificial intelligence, exemplar, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

Learning by Combining Memorization and Gradient Descent

Platt, John C.

Neural Information Processing Systems, Dec-31-1991

We have created a radial basis function network that allocates a new computational unit whenever an unusual pattern is presented to the network. The network learns by allocating new units and adjusting the parameters of existing units. If the network performs poorly on a presented pattern, then a new unit is allocated which memorizes the response to the presented pattern. If the network performs well on a presented pattern, then the network parameters are updated using standard LMS gradient descent. For predicting the Mackey Glass chaotic time series, our network learns much faster than do those using back-propagation and uses a comparable number of synapses.

artificial intelligence, gradient descent, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.77)

Optimization by Mean Field Annealing

Bilbro, Griff, Mann, Reinhold, Miller, Thomas K., Snyder, Wesley E., Bout, David E. van den, White, Mark

Neural Information Processing Systems, Dec-31-1989

Nearly optimal solutions to many combinatorial problems can be found using stochastic simulated annealing. This paper extends the concept of simulated annealing from its original formulation as a Markov process to a new formulation based on mean field theory. Mean field annealing essentially replaces the discrete degrees of freedom in simulated annealing with their average values as computed by the mean field approximation. The net result is that equilibrium at a given temperature is achieved 1-2 orders of magnitude faster than with simulated annealing. A general framework for the mean field annealing algorithm is derived, and its relationship to Hopfield networks is shown. The behavior of MFA is examined both analytically and experimentally for a generic combinatorial optimization problem: graph bipartitioning. This analysis indicates the presence of critical temperatures which could be important in improving the performance of neural networks.
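
The idea of replacing discrete spins with their mean-field averages can be sketched in Python for graph bipartitioning. This is a minimal illustration under stated assumptions, not the paper's algorithm: the energy is assumed to be a cut term plus a balance penalty weighted by alpha, each average spin is updated as tanh(field / T), and the function name, cooling schedule, and all parameter values are hypothetical.

```python
import math
import random


def mean_field_bipartition(adj, alpha=1.0, t_start=5.0, t_end=0.05,
                           cooling=0.9, sweeps=20, seed=0):
    """Return average spins in [-1, 1]; the sign of each gives its side.

    adj is a symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    rng = random.Random(seed)
    n = len(adj)
    # Start near zero so that neither side is favored initially.
    s = [rng.uniform(-0.01, 0.01) for _ in range(n)]
    t = t_start
    while t > t_end:
        for _ in range(sweeps):
            for i in range(n):
                # Neighbors pull node i toward their side of the cut ...
                attract = sum(adj[i][j] * s[j] for j in range(n))
                # ... while the penalty pushes against the overall imbalance.
                imbalance = sum(s) - s[i]
                s[i] = math.tanh((attract - alpha * imbalance) / t)
        t *= cooling  # lower the temperature geometrically
    return s


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

spins = mean_field_bipartition(adj)
print([1 if v >= 0 else -1 for v in spins])
```

At high temperature the spins stay near zero; as the temperature falls they saturate toward -1 or +1, which is the regime where the critical temperatures mentioned in the abstract would matter.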

iteration, mean field, spin average, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > North Carolina > Wake County > Raleigh (0.15)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Optimization by Mean Field Annealing

Bilbro, Griff, Mann, Reinhold, Miller, Thomas K., Snyder, Wesley E., Bout, David E. van den, White, Mark

Neural Information Processing Systems, Dec-31-1989

Nearly optimal solutions to many combinatorial problems can be found using stochastic simulated annealing. This paper extends the concept of simulated annealing from its original formulation as a Markov process to a new formulation based on mean field theory. Mean field annealing essentially replaces the discrete degrees of freedom in simulated annealing with their average values as computed by the mean field approximation. The net result is that equilibrium at a given temperature is achieved 1-2 orders of magnitude faster than with simulated annealing. A general framework for the mean field annealing algorithm is derived, and its relationship to Hopfield networks is shown. The behavior of MFA is examined both analytically and experimentally for a generic combinatorial optimization problem: graph bipartitioning. This analysis indicates the presence of critical temperatures which could be important in improving the performance of neural networks.

artificial intelligence, machine learning, spin average, (18 more...)

Neural Information Processing Systems

Country: North America > United States > North Carolina (0.16)

Technology: