Fast Pruning Using Principal Components
Levin, Asriel U., Leen, Todd K., Moody, John E.
In this procedure one transforms variables to a basis in which the covariance is diagonal and then projects out the low variance directions. While application of PCA to remove input variables is useful in some cases (Leen et al., 1990), there is no guarantee that low variance variables have little effect on error. We propose a saliency measure, based on PCA, that identifies those variables that have the least effect on error. Our proposed Principal Components Pruning algorithm applies this measure to obtain a simple and cheap pruning technique in the context of supervised learning. In the special case of unbiased linear models, one can bound the bias introduced by pruning the principal degrees of freedom in the model.
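The abstract describes pruning in the eigenbasis of the input covariance, ranked by a PCA-based saliency rather than by variance alone. The sketch below is an illustrative approximation for a single linear map, under the assumption that a component's saliency is its eigenvalue-weighted squared weight projection; it is not a transcription of the authors' Principal Components Pruning algorithm, and the function and variable names are ours.

# Hedged sketch of PCA-based pruning for a linear model y = X @ w.
import numpy as np

def pcp_prune(X, w, n_keep):
    """Drop the principal directions of the input whose removal is assumed
    to have the least effect on squared error."""
    # Diagonalize the input covariance; columns of V are principal directions.
    C = np.cov(X, rowvar=False)
    eigvals, V = np.linalg.eigh(C)
    proj = V.T @ w
    # Assumed saliency: error increase from dropping direction i is roughly
    # lambda_i * (v_i . w)^2 for a linear model.
    saliency = eigvals * proj**2
    keep = np.argsort(saliency)[-n_keep:]     # retain high-saliency directions
    return V[:, keep] @ proj[keep]            # weights projected onto kept subspace

# Usage: X is an (n_samples, d) data matrix, w a length-d weight vector.
# w_pruned = pcp_prune(X, w, n_keep=5)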
Weight Space Probability Densities in Stochastic Learning: I. Dynamics and Equilibria
Leen, Todd K., Moody, John E.
The ensemble dynamics of stochastic learning algorithms can be studied using theoretical techniques from statistical physics. We develop the equations of motion for the weight space probability densities for stochastic learning algorithms. We discuss equilibria in the diffusion approximation and provide expressions for special cases of the LMS algorithm. The equilibrium densities are not in general thermal (Gibbs) distributions in the objective function being minimized, but rather depend upon an effective potential that includes diffusion effects. Finally we present an exact analytical expression for the time evolution of the density for a learning algorithm with weight updates proportional to the sign of the gradient.
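In the small-learning-rate diffusion approximation, equations of motion of the kind described here take a Fokker-Planck form. The display below is a standard textbook version written in our own notation (per-example update H, learning rate \mu), given as a sketch rather than a transcription of the paper's equations.

\[
\frac{\partial P(w,t)}{\partial t}
  = -\sum_i \frac{\partial}{\partial w_i}\bigl[A_i(w)\,P(w,t)\bigr]
  + \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial w_i\,\partial w_j}\bigl[B_{ij}(w)\,P(w,t)\bigr],
\]
\[
A_i(w) = \mu\,\bigl\langle H_i(w,x)\bigr\rangle_x, \qquad
B_{ij}(w) = \mu^2\,\bigl\langle H_i(w,x)\,H_j(w,x)\bigr\rangle_x ,
\]

where \(H(w,x)\) is the weight update for input \(x\) and \(\langle\cdot\rangle_x\) averages over the input distribution. The drift term alone would give a thermal (Gibbs) equilibrium; the weight-dependent diffusion coefficients \(B_{ij}(w)\) are what produce the effective potential mentioned in the abstract.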
Note on Learning Rate Schedules for Stochastic Optimization
Darken, Christian, Moody, John E.
We present and compare learning rate schedules for stochastic gradient descent, a general algorithm which includes LMS, online backpropagation and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.
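A minimal sketch of one commonly used search-then-converge form, eta(t) = eta_0 / (1 + t / tau), is given below; the parameter names and default values are illustrative assumptions, not the paper's notation.

# Hedged sketch of a search-then-converge learning rate schedule.
def search_then_converge(t, eta0=0.1, tau=100.0):
    """Roughly constant for t << tau (search phase), then decays like
    eta0 * tau / t for t >> tau (converge phase), unlike the classical
    1/t running-average schedule, which shrinks from the first step."""
    return eta0 / (1.0 + t / tau)

# Example: a plain SGD step using the schedule.
# w -= search_then_converge(step) * grad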
Note on Development of Modularity in Simple Cortical Models
Chernjavsky, Alex, Moody, John E.
We show that localized activity patterns in a layer of cells, collective excitations, can induce the formation of modular structures in the anatomical connections via a Hebbian learning mechanism. The networks are spatially homogeneous before learning, but the spontaneous emergence of localized collective excitations, and subsequently of modularity in the connection patterns, breaks translational symmetry. This spontaneous symmetry breaking is similar to the phenomena that drive pattern formation in reaction-diffusion systems. We have identified requirements on the patterns of lateral connections and on the gains of internal units which are essential for the development of modularity. These essential requirements will most likely remain operative when more complicated (and biologically realistic) models are considered.
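As an illustration of the Hebbian mechanism mentioned in the abstract, the toy sketch below shows a correlation-based weight update; the architecture, constants, and names are assumptions for illustration and are not the paper's cortical model.

# Hedged, minimal sketch of a Hebbian update on feedforward weights.
import numpy as np

def hebbian_step(W, pre, post, lr=0.01, w_max=1.0):
    """Strengthen connections between co-active pre- and post-synaptic units,
    with a simple clip to keep weights bounded."""
    W = W + lr * np.outer(post, pre)
    return np.clip(W, 0.0, w_max)

# If `post` comes from a layer whose lateral connections favor localized
# activity bumps (collective excitations), repeated Hebbian steps tend to
# carve an initially homogeneous W into modular groups of connections.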