Information Technology
Size of Multilayer Networks for Exact Learning: Analytic Approach
Elisseeff, André, Paugam-Moisy, Hélène
The architecture of the network is feedforward, with one hidden layer and several outputs. Starting from a fixed training set, we consider the network as a function of its weights. We derive, for a wide family of transfer functions, a lower and an upper bound on the number of hidden units for exact learning, given the size of the dataset and the dimensions of the input and output spaces.

1 RELATED WORKS

The context of our work is rather similar to the well-known results of Baum et al. [1, 2, 3, 5, 10], but we consider both real inputs and outputs, instead of the dichotomies usually addressed. We are interested in learning exactly all the examples of a fixed database, hence our work is different from stating that multilayer networks are universal approximators [6, 8, 9]. Since we consider real outputs and not only dichotomies, it is not straightforward to compare our results to the recent works about the VC-dimension of multilayer networks [11, 12, 13]. Our study is more closely related to several works of Sontag [14, 15], but with different hypotheses on the transfer functions of the units. Finally, our approach is based on geometrical considerations and is close to the model of Coetzee and Stonick [4]. First, we define the network model and the notation; second, we develop our analytic approach and prove the fundamental theorem. In the last section, we discuss our point of view and propose some practical consequences of the result.
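As a concrete illustration of the setting only (not of the paper's bounds), the sketch below builds a one-hidden-layer feedforward network with several outputs, treats it as a function of its weights on a fixed training set, and checks that exact learning of all examples is possible when the number of hidden units equals the number of examples. The tanh transfer function, the data sizes, and the choice h = p are illustrative assumptions, not values taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Fixed training set: p examples, inputs in R^n, targets in R^m (assumed sizes).
    p, n, m = 20, 3, 2
    X = rng.normal(size=(p, n))
    T = rng.normal(size=(p, m))

    def hidden_activations(W1, b1):
        # The network is viewed as a function of its weights for the fixed inputs X.
        return np.tanh(X @ W1 + b1)

    # Illustration only: with h = p hidden units and generic random hidden weights,
    # the hidden activation matrix is typically nonsingular, so output weights that
    # reproduce every target exactly can be found by solving a linear system.
    h = p
    W1 = rng.normal(size=(n, h))
    b1 = rng.normal(size=h)
    H = hidden_activations(W1, b1)
    W2, *_ = np.linalg.lstsq(H, T, rcond=None)
    print("maximum training error:", np.abs(H @ W2 - T).max())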
Triangulation by Continuous Embedding
Meila, Marina, Jordan, Michael I.
When triangulating a belief network we aim to obtain a junction tree of minimum state space. According to (Rose, 1970), searching for the optimal triangulation can be cast as a search over all the permutations of the graph's vertices. Our approach is to embed the discrete set of permutations in a convex continuous domain D. By suitably extending the cost function over D and solving the continuous nonlinear optimization task we hope to obtain a good triangulation with respect to the aforementioned cost. This paper presents two ways of embedding the triangulation problem into a continuous domain and shows that they perform well compared to the best known heuristic.
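The paper's own embeddings and cost extension are defined there; as a hedged sketch of the general idea of relaxing permutations into a convex continuous domain, the snippet below uses one standard choice, the set of doubly stochastic matrices approached by Sinkhorn normalization, together with a greedy rounding back to a discrete elimination order. The graph size, the normalization scheme, and the rounding rule are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5  # number of graph vertices; an elimination order is a permutation of them

    def sinkhorn(M, iters=50):
        # Push a positive matrix toward the doubly stochastic set by
        # alternately normalizing its rows and columns.
        for _ in range(iters):
            M = M / M.sum(axis=1, keepdims=True)
            M = M / M.sum(axis=0, keepdims=True)
        return M

    # A point in the continuous domain: a doubly stochastic matrix whose
    # (i, j) entry is read as "vertex i is eliminated at position j".
    D = sinkhorn(rng.uniform(0.1, 1.0, size=(n, n)))

    def round_to_permutation(D):
        # Greedy rounding back to a discrete elimination order.
        order, used = [], set()
        for j in range(n):
            i = max((i for i in range(n) if i not in used), key=lambda i: D[i, j])
            order.append(i)
            used.add(i)
        return order

    print("relaxed point row sums:", D.sum(axis=1).round(3))
    print("rounded elimination order:", round_to_permutation(D))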
Probabilistic Interpretation of Population Codes
Zemel, Richard S., Dayan, Peter, Pouget, Alexandre
We present a theoretical framework for population codes which generalizes naturally to the important case where the population provides information about a whole probability distribution over an underlying quantity rather than just a single value. We use the framework to analyze two existing models, and to suggest and evaluate a third model for encoding such probability distributions.

1 Introduction

Population codes, where information is represented in the activities of whole populations of units, are ubiquitous in the brain. There has been substantial work on how animals should and/or actually do extract information about the underlying encoded quantity.
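As a minimal sketch of the idea, and not of any of the specific models analyzed in the paper, the snippet below encodes a bimodal distribution over an underlying quantity in the rates of units with Gaussian tuning curves, and decodes a smoothed version of it by superposing the tuning curves. The tuning widths, the bimodal density, and the decoder are illustrative assumptions.

    import numpy as np

    # A population of units with Gaussian tuning curves over an underlying quantity x.
    x = np.linspace(-10, 10, 201)
    dx = x[1] - x[0]
    prefs = np.linspace(-8, 8, 17)                       # preferred values
    tuning = np.exp(-0.5 * ((x[None, :] - prefs[:, None]) / 1.5) ** 2)

    # Encode a whole distribution over x (here bimodal), not just a single value:
    # each unit's rate is the overlap of its tuning curve with the density.
    density = 0.6 * np.exp(-0.5 * (x + 4) ** 2) + 0.4 * np.exp(-0.5 * (x - 3) ** 2)
    density /= density.sum() * dx
    rates = tuning @ density * dx

    # A crude decoder: superpose the tuning curves weighted by the rates and
    # renormalize, recovering a smoothed version of the encoded density.
    decoded = rates @ tuning
    decoded /= decoded.sum() * dx
    print("decoded peak near x =", round(float(x[np.argmax(decoded)]), 2))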
LSTM can Solve Hard Long Time Lag Problems
Hochreiter, Sepp, Schmidhuber, Jürgen
Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show: problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of.

1 TRIVIAL PREVIOUS LONG TIME LAG PROBLEMS

Traditional recurrent nets fail in case of long minimal time lags between input signals and corresponding error signals [7, 3]. Many recent papers propose alternative methods, e.g., [16, 12, 1, 5, 9]. For instance, Bengio et al. investigate methods such as simulated annealing, multi-grid random search, time-weighted pseudo-Newton optimization, and discrete error propagation [3].
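A minimal sketch of the random-weight-guessing baseline referred to above, on a toy long-time-lag task of our own rather than one of the benchmark problems from the cited papers: all weights of a tiny fully recurrent net are resampled at random until the first input symbol can be read off the output after a long run of uninformative steps.

    import numpy as np

    rng = np.random.default_rng(0)
    T, n_hidden = 20, 3   # minimal time lag and hidden units (toy settings)

    def run(W_in, W_rec, W_out, first_symbol):
        # Tiny fully recurrent tanh net; only the first input is informative,
        # followed by T uninformative steps, then a single output is read.
        h = np.zeros(n_hidden)
        for x in [first_symbol] + [0.0] * T:
            h = np.tanh(W_in * x + W_rec @ h)
        return W_out @ h

    def solved(weights):
        # The net must separate sequences whose first symbol was +1 vs -1.
        return run(*weights, +1.0) > 0.5 and run(*weights, -1.0) < -0.5

    # Random weight guessing: resample all weights until the task is solved.
    for trial in range(1, 100001):
        weights = (rng.normal(size=n_hidden) * 2,
                   rng.normal(size=(n_hidden, n_hidden)) * 2,
                   rng.normal(size=n_hidden) * 2)
        if solved(weights):
            print("solved by guessing after", trial, "trials")
            break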
Early Brain Damage
Tresp, Volker, Neuneier, Ralph, Zimmermann, Hans-Georg
Optimal Brain Damage (OBD) is a method for reducing the number of weights in a neural network. OBD estimates the increase in cost function if weights are pruned and is a valid approximation if the learning algorithm has converged into a local minimum. On the other hand, it is often desirable to terminate the learning process before a local minimum is reached (early stopping). In this paper we show that OBD estimates the increase in cost function incorrectly if the network is not in a local minimum. We also show how OBD can be extended such that it can be used in connection with early stopping.
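A one-dimensional toy illustration of the issue, not of the paper's actual correction: the pure second-order OBD estimate drops the gradient term and so misestimates the cost increase when training stops before a minimum, whereas keeping the first-order term recovers the true increase for this quadratic toy cost. The cost function and the early-stopping point are assumptions made only for illustration.

    # Toy cost as a function of a single weight w; training was stopped early,
    # so the current weight w0 is not at the minimum of E.
    def E(w):
        return (w - 1.0) ** 2 + 0.3 * w

    w0, eps = 0.5, 1e-4
    g = (E(w0 + eps) - E(w0 - eps)) / (2 * eps)            # dE/dw at w0
    h = (E(w0 + eps) - 2 * E(w0) + E(w0 - eps)) / eps**2   # d2E/dw2 at w0

    # Pruning the weight means moving it from w0 to 0, i.e. a step dw = -w0.
    dw = -w0
    obd_estimate = 0.5 * h * dw**2                   # OBD: drops the gradient term,
                                                     # exact only at a local minimum
    with_first_order_term = g * dw + 0.5 * h * dw**2  # keeps the linear term
    true_increase = E(0.0) - E(w0)

    print(obd_estimate, with_first_order_term, true_increase)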
Neural Network Models of Chemotaxis in the Nematode Caenorhabditis Elegans
Ferrée, Thomas C., Marcotte, Ben A., Lockery, Shawn R.
We train recurrent networks to control chemotaxis in a computer model of the nematode C. elegans. The model presented is based closely on the body mechanics, behavioral analyses, neuroanatomy and neurophysiology of C. elegans, each imposing constraints relevant for information processing. Simulated worms moving autonomously in simulated chemical environments display a variety of chemotaxis strategies similar to those of biological worms.

1 INTRODUCTION

The nematode C. elegans provides a unique opportunity to study the neuronal basis of neural computation in an animal capable of complex goal-oriented behaviors. The adult hermaphrodite is only 1 mm long, and has exactly 302 neurons and 95 muscle cells. The morphology of every cell and the location of most electrical and chemical synapses are known precisely (White et al., 1986), making C. elegans especially attractive for study.
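Purely as an illustrative toy, and not the body-mechanics model of the paper, the sketch below moves a point "worm" through a radial chemical gradient under a single recurrent unit whose output sets the turning rate. The weights here are arbitrary placeholders; in the paper the controllers are trained.

    import numpy as np

    def concentration(pos):
        # Radial chemical gradient with a single peak at the origin (assumed form).
        return np.exp(-np.dot(pos, pos) / 50.0)

    # Toy controller: one recurrent unit mapping sensed concentration (plus its
    # recent history, carried in the recurrent state) to a turning rate.
    w_in, w_rec, w_out = 4.0, 0.9, 2.0          # placeholder weights, untrained
    pos, heading, h = np.array([8.0, 0.0]), np.pi / 2, 0.0
    speed, dt = 0.2, 1.0

    for _ in range(500):
        c = concentration(pos)
        h = np.tanh(w_in * c + w_rec * h)       # recurrent "interneuron"
        heading += w_out * (h - 0.5) * dt        # net output sets the turning rate
        pos = pos + speed * dt * np.array([np.cos(heading), np.sin(heading)])

    print("final position:", pos.round(2), "concentration:", round(float(concentration(pos)), 3))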
Reinforcement Learning for Mixed Open-loop and Closed-loop Control
Hansen, Eric A., Barto, Andrew G., Zilberstein, Shlomo
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost-effective to take sequences of actions in open-loop mode. We describe a reinforcement learning algorithm that learns to combine open-loop and closed-loop control when sensing incurs a cost. Although we assume reliable sensors, use of open-loop control means that actions must sometimes be taken when the current state of the controlled system is uncertain. This is a special case of the hidden-state problem in reinforcement learning, and to cope, our algorithm relies on short-term memory.
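A simplified sketch of the idea, not the paper's algorithm (it omits the short-term memory component and proper per-step discounting within a sequence): Q-learning on a small chain where each decision commits to an open-loop run of primitive actions and the sensing cost is paid only once per decision, so that with a large enough sensing cost longer open-loop runs become preferable. All task parameters are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    N, goal = 12, 11
    sensing_cost, step_cost = 2.0, 0.1
    run_lengths = [1, 2, 4]                  # candidate open-loop runs of "move right"
    Q = np.zeros((N, len(run_lengths)))
    alpha, gamma, eps = 0.2, 0.95, 0.1

    for episode in range(2000):
        s = 0
        while s != goal:
            a = rng.integers(len(run_lengths)) if rng.random() < eps else int(Q[s].argmax())
            k = run_lengths[a]
            # Execute k primitive actions open-loop (each succeeds with prob. 0.9),
            # then pay the sensing cost once to observe the resulting state.
            s_next, executed = s, 0
            for _ in range(k):
                executed += 1
                if rng.random() < 0.9:
                    s_next = min(s_next + 1, goal)
                if s_next == goal:
                    break
            r = -sensing_cost - step_cost * executed
            target = r if s_next == goal else r + gamma * Q[s_next].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next

    print("greedy open-loop run length from the start state:", run_lengths[int(Q[0].argmax())])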
Consistent Classification, Firm and Soft
A classifier is called consistent with respect to a given set of class-labeled points if it correctly classifies the set. We consider classifiers defined by unions of local separators and propose algorithms for consistent classifier reduction. The expected complexities of the proposed algorithms are derived along with the expected classifier sizes. In particular, the proposed approach yields a consistent reduction of the nearest neighbor classifier, which performs "firm" classification, assigning each new object to a class, regardless of the data structure. The proposed reduction method suggests a notion of "soft" classification, allowing for indecision with respect to objects which are insufficiently or ambiguously supported by the data. The performances of the proposed classifiers in predicting stock behavior are compared to those achieved by the nearest neighbor method.

1 Introduction

Certain classification problems, such as recognizing the digits of a handwritten zip code, require the assignment of each object to a class. Others, involving relatively small amounts of data and high risk, call for indecision until more data become available. Examples in such areas as medical diagnosis, stock trading and radar detection are well known. The training data for the classifier in both cases will correspond to firmly labeled members of the competing classes.
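As a hedged stand-in for the proposed reduction (using condensed-nearest-neighbor-style retention rather than the paper's local-separator construction), the sketch below reduces a labeled set to a consistent subset of prototypes for firm nearest-neighbor classification, and adds a simple "soft" rule that abstains when a query is insufficiently supported by the retained data. The rejection radius and the synthetic data are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    # Two labeled clusters in the plane (synthetic stand-in data).
    X = np.vstack([rng.normal([0, 0], 0.5, size=(30, 2)),
                   rng.normal([3, 3], 0.5, size=(30, 2))])
    y = np.array([0] * 30 + [1] * 30)

    def nn_predict(P, labels, q):
        return labels[np.argmin(np.linalg.norm(P - q, axis=1))]

    # Consistent reduction in the spirit of condensed nearest neighbor: keep only
    # the points needed so that the reduced set still classifies every training
    # point correctly, i.e. remains consistent with the full set.
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if nn_predict(X[keep], y[keep], X[i]) != y[i]:
                keep.append(i)
                changed = True
    print("reduced from", len(X), "to", len(keep), "prototypes")

    def soft_predict(q, reject_radius=1.5):
        # "Soft" classification: abstain (return None) when the query is far from
        # all retained prototypes, i.e. insufficiently supported by the data.
        d = np.linalg.norm(X[keep] - q, axis=1)
        return None if d.min() > reject_radius else y[keep][int(np.argmin(d))]

    print(soft_predict(np.array([0.2, 0.1])), soft_predict(np.array([10.0, 10.0])))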