Optimal Signalling in Attractor Neural Networks
Meilijson, Isaac, Ruppin, Eytan
It is well known that a given cortical neuron can respond with a different firing pattern for the same synaptic input, depending on its firing history and on the effects of modulator transmitters (see [Connors and Gutnick, 1990] for a review). The time span of different channel conductances is very broad, and the influence of some ionic currents varies with the history of the membrane potential [Lytton, 1991]. Motivated by the history-dependent nature of neuronal firing, we continue our
Correlation Functions in a Large Stochastic Neural Network
Ginzburg, Iris, Sompolinsky, Haim
In many cases the cross-correlations between the activities of cortical neurons are approximately symmetric about zero time delay. These have been taken as an indication of the presence of "functional connectivity" between the correlated neurons (Fetz, Toyama and Smith 1991, Abeles 1991). However, a quantitative comparison between the observed cross-correlations and those expected to exist between neurons that are part of a large interacting population has been lacking. Most of the theoretical studies of recurrent neural network models consider only time-averaged firing rates, which are usually given as solutions of mean-field equations. They do not account for the fluctuations about these averages, the study of which requires going beyond the mean-field approximations. In this work we perform a theoretical study of the fluctuations in the neuronal activities and their correlations, in a large stochastic network of excitatory and inhibitory neurons. Depending on the model parameters, this system can exhibit coherent undamped oscillations. Here we focus on parameter regimes where the system is in a statistically stationary state, which is more appropriate for modeling non-oscillatory neuronal activity in cortex. Our results for the magnitudes and the time-dependence of the correlation functions can provide a basis for comparison with physiological data on neuronal correlation functions.
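For readers unfamiliar with the quantity being discussed, the following is a minimal sketch of an empirical cross-correlation function between two activity traces. It is not the analysis of the paper, which is theoretical; the function name `cross_correlation`, the toy signals, and the shared-input construction are all assumptions made for illustration. Two traces driven by a common noise source should show a peak at zero time delay.

```python
import random

def cross_correlation(x, y, max_lag):
    """Return {lag: C(lag)}, the mean of dx(t) * dy(t + lag) over valid t."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]  # fluctuations about the time average
    dy = [v - mean_y for v in y]
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(dx[t], dy[t + lag]) for t in range(n) if 0 <= t + lag < n]
        result[lag] = sum(a * b for a, b in pairs) / len(pairs)
    return result

# Hypothetical toy data: two traces that share a common white-noise input.
rng = random.Random(0)
shared = [rng.gauss(0, 1) for _ in range(5000)]
x = [s + rng.gauss(0, 1) for s in shared]
y = [s + rng.gauss(0, 1) for s in shared]

cc = cross_correlation(x, y, max_lag=5)
peak_lag = max(cc, key=cc.get)  # lag at which the correlation is largest
```

Because the shared input here is white noise, the estimated correlation is concentrated at lag 0; temporally structured inputs would broaden the peak.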
Solvable Models of Artificial Neural Networks
Solvable models of nonlinear learning machines are proposed, and learning in artificial neural networks is studied based on the theory of ordinary differential equations. A learning algorithm is constructed, by which the optimal parameter can be found without any recursive procedure. The solvable models enable us to analyze why experimental results obtained with error backpropagation often contradict statistical learning theory.
How to Choose an Activation Function
Mhaskar, H. N., Micchelli, C. A.
We study the complexity problem in artificial feedforward neural networks designed to approximate real-valued functions of several real variables; i.e., we estimate the number of neurons in a network required to ensure a given degree of approximation to every function in a given function class. We indicate how to construct networks with the indicated number of neurons evaluating standard activation functions. Our general theorem shows that the smoother the activation function, the better the rate of approximation.
1 INTRODUCTION
The approximation capabilities of feedforward neural networks with a single hidden layer have been studied by many authors, e.g., [1, 2, 5]. In [10], we have shown that such a network using practically any nonlinear activation function can approximate any continuous function of any number of real variables on any compact set to any desired degree of accuracy. A central question in this theory is the following.
An Optimization Method of Layered Neural Networks based on the Modified Information Criterion
This paper proposes a practical optimization method for layered neural networks, by which the optimal model and parameter can be found simultaneously. We modify the conventional information criterion into a differentiable function of parameters, and then minimize it while controlling it back to the ordinary form. Effectiveness of this method is discussed theoretically and experimentally.
Combined Neural Networks for Time Series Analysis
We propose a method for improving the performance of any network designed to predict the next value of a time series. We advocate analyzing the deviations of the network's predictions from the data in the training set. This can be carried out by a secondary network trained on the time series of these residuals. The combined system of the two networks is viewed as the new predictor. We demonstrate the simplicity and success of this method by applying it to the sunspots data. The small corrections of the secondary network can be regarded as resulting from a Taylor expansion of a complex network which includes the combined system.
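The residual-correction idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract uses neural networks and the sunspots data, whereas this sketch swaps in a persistence predictor as the primary model, a one-parameter linear autoregression as the secondary model, and a synthetic sum of sinusoids as the series. The names `fit_ar1`, `primary`, and `mse` are hypothetical.

```python
import math

def fit_ar1(values):
    """Least-squares coefficient a for values[t] ~ a * values[t - 1]."""
    num = sum(values[t] * values[t - 1] for t in range(1, len(values)))
    den = sum(values[t - 1] ** 2 for t in range(1, len(values)))
    return num / den

def mse(errors):
    return sum(e * e for e in errors) / len(errors)

def primary(previous_value):
    """Primary predictor (assumption): persistence, x_hat[t] = x[t - 1]."""
    return previous_value

# Synthetic stand-in for a real series such as the sunspot record.
series = [math.sin(0.3 * t) + 0.5 * math.sin(0.11 * t) for t in range(400)]
train, test = series[:300], series[300:]

# Residuals of the primary predictor on the training set.
train_resid = [train[t] - primary(train[t - 1]) for t in range(1, len(train))]

# Secondary predictor, trained on the time series of those residuals.
a = fit_ar1(train_resid)

primary_errors, combined_errors = [], []
for t in range(2, len(test)):
    base = primary(test[t - 1])
    prev_resid = test[t - 1] - primary(test[t - 2])
    correction = a * prev_resid  # secondary model's estimate of the next residual
    primary_errors.append(test[t] - base)
    combined_errors.append(test[t] - (base + correction))

mse_primary = mse(primary_errors)
mse_combined = mse(combined_errors)
```

On this toy series the combined predictor has a lower held-out error than the primary one alone, because the primary residuals are strongly autocorrelated. Whether a secondary model helps on real data depends on how much structure the primary model leaves in its residuals.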
Supervised learning from incomplete data via an EM approach
Ghahramani, Zoubin, Jordan, Michael I.
Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm: EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems.
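The double use of EM described above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes diagonal-covariance Gaussian components (which keeps the missing-data expectations trivial), treats only unsupervised density estimation, and uses synthetic data. The function name `em_diag_gmm_missing` and the convention that `None` marks a missing value are assumptions.

```python
import math
import random

def em_diag_gmm_missing(data, means, n_iter=50):
    """EM for a mixture of diagonal Gaussians; None marks a missing value."""
    k, d = len(means), len(means[0])
    means = [list(m) for m in means]
    variances = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibilities use only the observed coordinates of a row.
        resp = []
        for row in data:
            logp = []
            for j in range(k):
                lp = math.log(weights[j])
                for c, x in enumerate(row):
                    if x is not None:
                        v = variances[j][c]
                        lp -= 0.5 * (math.log(2 * math.pi * v)
                                     + (x - means[j][c]) ** 2 / v)
                logp.append(lp)
            top = max(logp)
            p = [math.exp(l - top) for l in logp]
            total = sum(p)
            resp.append([q / total for q in p])
        # M-step: missing values contribute their expected sufficient
        # statistics under the previous parameters of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            for c in range(d):
                old_mu, old_var = means[j][c], variances[j][c]
                s1 = sum(r[j] * (row[c] if row[c] is not None else old_mu)
                         for r, row in zip(resp, data))
                new_mu = s1 / nj
                s2 = 0.0
                for r, row in zip(resp, data):
                    if row[c] is not None:
                        s2 += r[j] * (row[c] - new_mu) ** 2
                    else:
                        s2 += r[j] * ((old_mu - new_mu) ** 2 + old_var)
                means[j][c] = new_mu
                variances[j][c] = max(s2 / nj, 1e-6)
    return weights, means, variances

# Synthetic data: two clusters, each coordinate missing with probability 0.3.
rng = random.Random(0)
data = []
for center in (-3.0, 3.0):
    for _ in range(200):
        row = [rng.gauss(center, 1.0) for _ in range(2)]
        data.append([x if rng.random() > 0.3 else None for x in row])

weights, means, variances = em_diag_gmm_missing(
    data, means=[[-1.0, -1.0], [1.0, 1.0]])
```

With full covariance matrices the expectation of a missing coordinate would also depend on the observed coordinates of the same row; the diagonal assumption removes that dependence and is the main simplification here.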
Wrap-Up: a Trainable Discourse Module for Information Extraction
The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher level IE processing. Wrap-Up is a trainable IE discourse component that makes intersentential inferences and identifies logical relations among information extracted from the text. Previous corpus-based approaches were limited to lower level processing such as part-of-speech tagging, lexical disambiguation, and dictionary construction. Wrap-Up is fully trainable, and not only automatically decides what classifiers are needed, but even derives the feature set for each classifier automatically. Performance equals that of a partially trainable discourse module requiring manual customization for each domain.
Operations for Learning with Graphical Models
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feed-forward networks, and learning Gaussian and discrete Bayesian networks from data. The paper concludes by sketching some implications for data analysis and summarizing how some popular algorithms fall within the framework presented. The main original contributions here are the decomposition techniques and the demonstration that graphical models provide a framework for understanding and developing complex learning algorithms.
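As a reminder of what the Gibbs sampling schema mentioned above looks like in its simplest form, here is a minimal sketch for a standard bivariate normal with correlation `rho`. It is a textbook toy, not an algorithm taken from the paper; the function names and the choice of target distribution are assumptions. Each variable is resampled in turn from its conditional distribution given the other.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Sample (x, y) from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # sd of x given y, and of y given x
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_sd)  # draw x from p(x | y)
        y = rng.gauss(rho * x, cond_sd)  # draw y from p(y | x)
        if step >= burn_in:
            samples.append((x, y))
    return samples

def correlation(samples):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    cov = sum((a - mx) * (b - my) for a, b in samples) / n
    vx = sum((a - mx) ** 2 for a, _ in samples) / n
    vy = sum((b - my) ** 2 for _, b in samples) / n
    return cov / math.sqrt(vx * vy)

samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
estimated_rho = correlation(samples)
```

Successive samples are correlated, so the effective sample size is smaller than the number of draws, and more so as `rho` approaches 1.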