Discovering High Order Features with Mean Field Modules
Galland, Conrad C., Hinton, Geoffrey E.
Neural Information Processing Systems
A new form of the deterministic Boltzmann machine (DBM) learning procedure is presented which can efficiently train network modules to discriminate between input vectors according to some criterion. The new technique directly utilizes the free energy of these "mean field modules" to represent the probability that the criterion is met, the free energy being readily manipulated by the learning procedure. Although conventional deterministic Boltzmann learning fails to extract the higher order feature of shift at a network bottleneck, combining the new mean field modules with the mutual information objective function rapidly produces modules that perfectly extract this important higher order feature without direct external supervision.

1 INTRODUCTION

The Boltzmann machine learning procedure (Hinton and Sejnowski, 1986) can be made much more efficient by using a mean field approximation in which stochastic binary units are replaced by deterministic real-valued units (Peterson and Anderson, 1987). Deterministic Boltzmann learning can be used for "multicompletion" tasks in which the subsets of the units that are treated as input or output are varied from trial to trial (Peterson and Hartman, 1988). In this respect it resembles other learning procedures that also involve settling to a stable state (Pineda, 1987). Using the multicompletion paradigm, it should be possible to force a network to explicitly extract important higher order features of an ensemble of training vectors by forcing the network to pass the information required for correct completions through a narrow bottleneck. In back-propagation networks with two or three hidden layers, the use of bottlenecks sometimes allows the learning to explicitly discover important.
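As a minimal illustrative sketch (not the paper's actual implementation), the mean field approximation described above can be expressed as an iterative settling procedure in which each unit's real-valued state is the logistic function of its net input, followed by an evaluation of the resulting mean-field free energy. The network size, weights, and clamping pattern below are hypothetical choices for illustration:

```python
import numpy as np

def mean_field_settle(W, b, clamped, num_iters=50, T=1.0):
    """Settle a symmetric network to a mean-field fixed point.

    Stochastic binary units are replaced by deterministic
    real-valued activities in [0, 1] (Peterson and Anderson, 1987).
    W: symmetric weight matrix with zero diagonal; b: biases.
    clamped: dict {unit index: value} of units held fixed
    (e.g. the subset treated as input on this trial).
    """
    s = np.full(len(b), 0.5)           # start all free units at 0.5
    for i, v in clamped.items():
        s[i] = v
    for _ in range(num_iters):
        net = W @ s + b                # net input to every unit
        s_new = 1.0 / (1.0 + np.exp(-net / T))
        for i, v in clamped.items():   # re-impose the clamped values
            s_new[i] = v
        s = s_new
    return s

def free_energy(W, b, s, T=1.0):
    """Mean-field free energy F = E - T*S of a settled state:
    E = -1/2 s^T W s - b^T s, and S is the sum of the units'
    binary entropies. This is the quantity the paper's learning
    procedure manipulates directly."""
    p = np.clip(s, 1e-10, 1.0 - 1e-10)  # avoid log(0) for clamped units
    entropy = -np.sum(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return -0.5 * s @ W @ s - b @ s - T * entropy

# Hypothetical 3-unit module: clamp unit 0 and let the rest settle.
W = np.array([[0.0, 0.5, 0.2],
              [0.5, 0.0, 0.3],
              [0.2, 0.3, 0.0]])
b = np.array([0.1, -0.2, 0.0])
s = mean_field_settle(W, b, clamped={0: 0.9})
F = free_energy(W, b, s)
```

With small weights like these the synchronous updates contract to a fixed point, at which each free unit satisfies the mean-field consistency condition s_i = sigma(sum_j w_ij s_j + b_i); the free energy of that settled state is what a module would then report.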
Dec-31-1990