When will a Genetic Algorithm Outperform Hill Climbing
Mitchell, Melanie, Holland, John H., Forrest, Stephanie
We analyze a simple hill-climbing algorithm (RMHC) that was previously shown to outperform a genetic algorithm (GA) on a simple "Royal Road" function. We then analyze an "idealized" genetic algorithm (IGA) that is significantly faster than RMHC and that gives a lower bound for GA speed. We identify the features of the IGA that give rise to this speedup, and discuss how these features can be incorporated into a real GA. 1 INTRODUCTION Our goal is to understand the class of problems for which genetic algorithms (GA) are most suited, and in particular, for which they will outperform other search algorithms. Several studies have empirically compared GAs with other search and optimization methods such as simple hill-climbing (e.g., Davis, 1991), simulated annealing (e.g., Ingber & Rosen, 1992), linear, nonlinear, and integer programming techniques, and other traditional optimization techniques (e.g., De Jong, 1975). However, such comparisons typically compare one version of the GA with a second algorithm on a single problem or set of problems, often using performance criteria which may not be appropriate.
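To make the hill-climbing baseline concrete, here is a minimal sketch of random-mutation hill climbing on a Royal Road style function. The abstract does not give the details, so everything specific here is an assumption for illustration: a 64-bit string split into eight contiguous 8-bit blocks, a fitness that adds 8 for every block that is all ones, and an acceptance rule that keeps a single-bit flip whenever fitness does not decrease. The names `royal_road` and `rmhc` are hypothetical.

```python
import random

# Assumed layout (not stated in the abstract): 8 blocks of 8 bits each.
BLOCK = 8
N_BLOCKS = 8
LENGTH = BLOCK * N_BLOCKS


def royal_road(bits):
    """Assumed fitness: each all-ones block contributes BLOCK points."""
    return sum(
        BLOCK
        for start in range(0, LENGTH, BLOCK)
        if all(bits[start:start + BLOCK])
    )


def rmhc(max_evals=200_000, seed=0):
    """Random-mutation hill climbing sketch.

    Flip one randomly chosen bit; keep the flip if fitness is equal or
    better, otherwise undo it. Stops at the optimum or after max_evals
    fitness evaluations.
    """
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(LENGTH)]
    best = royal_road(current)
    evals = 1
    while best < LENGTH and evals < max_evals:
        i = rng.randrange(LENGTH)
        current[i] ^= 1              # propose a single-bit mutation
        fit = royal_road(current)
        evals += 1
        if fit >= best:
            best = fit               # accept ties so the search can drift
        else:
            current[i] ^= 1          # reject: restore the previous bit
    return current, best, evals


if __name__ == "__main__":
    _, fitness, n_evals = rmhc()
    print(f"fitness {fitness} after {n_evals} evaluations")
```

Accepting equal-fitness moves matters under these assumptions: a partially filled block earns nothing, so the search can only complete a block by drifting across a plateau.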
Unsupervised Learning of Mixtures of Multiple Causes in Binary Data
This paper presents a formulation for unsupervised learning of clusters reflecting multiple causal structure in binary data. Unlike the standard mixture model, a multiple cause model accounts for observed data by combining assertions from many hidden causes, each of which can pertain to varying degree to any subset of the observable dimensions. A crucial issue is the mixing-function for combining beliefs from different cluster-centers in order to generate data reconstructions whose errors are minimized both during recognition and learning. We demonstrate a weakness inherent to the popular weighted sum followed by sigmoid squashing, and offer an alternative form of the nonlinearity. Results are presented demonstrating the algorithm's ability to successfully discover coherent multiple causal representations of noisy test data and in images of printed characters. 1 Introduction The objective of unsupervised learning is to identify patterns or features reflecting underlying regularities in data. Single-cause techniques, including the k-means algorithm and the standard mixture-model (Duda and Hart, 1973), represent clusters of data points sharing similar patterns of 1s and 0s under the assumption that each data point belongs to, or was generated by, one and only one cluster-center; output activity is constrained to sum to 1. In contrast, a multiple-cause model permits more than one cluster-center to become fully active in accounting for an observed data vector.
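The role of the mixing function can be illustrated with a small sketch that reconstructs a binary data vector from several simultaneously active causes. It places the weighted sum followed by sigmoid squashing next to a noisy-OR combination. The noisy-OR is used here only as one well-known example of a different nonlinearity; the abstract does not specify the alternative form the paper proposes, and the function names, activities `m`, and weights `c` are all hypothetical.

```python
import math


def sigmoid_mix(m, c):
    """Weighted sum of cause activities followed by sigmoid squashing.

    m[k] is the activity of cause k; c[k][j] is a real-valued weight
    from cause k to observable dimension j.
    """
    n_dims = len(c[0])
    return [
        1.0 / (1.0 + math.exp(-sum(m[k] * c[k][j] for k in range(len(m)))))
        for j in range(n_dims)
    ]


def noisy_or_mix(m, c):
    """Noisy-OR combination (an illustrative assumption, see lead-in).

    m[k] and c[k][j] are taken to lie in [0, 1]; dimension j is
    predicted "on" unless every active cause fails to assert it.
    """
    n_dims = len(c[0])
    out = []
    for j in range(n_dims):
        p_off = 1.0
        for k in range(len(m)):
            p_off *= 1.0 - m[k] * c[k][j]
        out.append(1.0 - p_off)
    return out


if __name__ == "__main__":
    # Two causes, two observable dimensions.
    weights = [[0.9, 0.0],
               [0.9, 0.5]]
    print(noisy_or_mix([1.0, 1.0], weights))  # both causes fully active
    print(noisy_or_mix([1.0, 0.0], weights))  # only the first cause active
    print(sigmoid_mix([1.0, 1.0], weights))
```

Unlike a single-cause model, neither function constrains the activities in `m` to sum to 1, so more than one cause can be fully active at once.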
A Unified Gradient-Descent/Clustering Architecture for Finite State Machine Induction
Das, Sreerupa, Mozer, Michael C.
Researchers often try to understand, post hoc, representations that emerge in the hidden layers of a neural net following training. Interpretation is difficult because these representations are typically highly distributed and continuous. By "continuous," we mean that if one constructed a scatterplot over the hidden unit activity space of patterns obtained in response to various inputs, examination at any scale would reveal the patterns to be broadly distributed over the space.
Developing Population Codes by Minimizing Description Length
Zemel, Richard S., Hinton, Geoffrey E.
The Minimum Description Length principle (MDL) can be used to train the hidden units of a neural network to extract a representation that is cheap to describe but nonetheless allows the input to be reconstructed accurately. We show how MDL can be used to develop highly redundant population codes. Each hidden unit has a location in a low-dimensional implicit space. If the hidden unit activities form a bump of a standard shape in this space, they can be cheaply encoded by the center of this bump. So the weights from the input units to the hidden units in an autoencoder are trained to make the activities form a standard bump.
Processing of Visual and Auditory Space and Its Modification by Experience
Rauschecker, Josef P., Sejnowski, Terrence J.
Sejnowski, Computational Neurobiology Lab, The Salk Institute, San Diego, CA 92138. Visual spatial information is projected from the retina to the brain in a highly topographic fashion, so that 2-D visual space is represented in a simple retinotopic map. Auditory spatial information, by contrast, has to be computed from binaural time and intensity differences as well as from monaural spectral cues produced by the head and ears. Evaluation of these cues in the central nervous system leads to the generation of neurons that are sensitive to the location of a sound source in space ("spatial tuning") and, in some animal species, to auditory space maps where spatial location is encoded as a 2-D map just like in the visual system. The brain structures thought to be involved in the multimodal integration of visual and auditory spatial information are the superior colliculus in the midbrain and the inferior parietal lobe in the cerebral cortex. It has been suggested for the owl that the visual system participates in setting up the auditory space map in the superior.
Connectionist Modeling and Parallel Architectures
Diederich, Joachim, Tsoi, Ah Chung
University of Rochester) and ICSIM (ICSI Berkeley) allow the definition of unit types and complex connectivity patterns. On a very high level of abstraction, simulators like tlearn (UCSD) allow the easy realization of predefined network architectures (feedforward networks) and learning algorithms such as backpropagation. Ben Gomes, International Computer Science Institute (Berkeley), introduced the Connectionist Supercomputer 1. The CNS1 is a multiprocessor system designed for moderate precision fixed point operations used extensively in connectionist network calculations. Custom VLSI digital processors employ an on-chip vector coprocessor unit tailored for neural network calculations and controlled by a RISC scalar CPU. One processor and associated commercial DRAM comprise a node, which is connected in a mesh topology with other nodes to establish a MIMD array. One edge of the communications mesh is reserved for attaching various I/O devices, which connect via a custom network adaptor chip. The CNS1 operates as a compute server and one I/O port is used for connecting to a host workstation. Users with mainstream connectionist applications can use CNSim, an object-oriented, graphical high-level interface to the CNS1 environment.
Computational Elements of the Adaptive Controller of the Human Arm
Shadmehr, Reza, Mussa-Ivaldi, Ferdinando A.
We consider the problem of how the CNS learns to control dynamics of a mechanical system. By using a paradigm where a subject's hand interacts with a virtual mechanical environment, we show that learning control is via composition of a model of the imposed dynamics. Some properties of the computational elements with which the CNS composes this model are inferred through the generalization capabilities of the subject outside the training data. 1 Introduction At about the age of three months, children become interested in tactile exploration of objects around them. They attempt to reach for an object, but often fail to properly control their arm and end up missing their target. In the ensuing weeks, they rapidly improve and soon they can not only reach accurately, they can also pick up the object and place it.