Rules and Similarity in Concept Learning
This paper argues that two apparently distinct modes of generalizing concepts - abstracting rules and computing similarity to exemplars - should both be seen as special cases of a more general Bayesian learning framework. Bayes explains the specific workings of these two modes - which rules are abstracted, how similarity is measured - as well as why generalization should appear rule- or similarity-based in different situations. This analysis also suggests why the rules/similarity distinction, even if not computationally fundamental, may still be useful at the algorithmic level as part of a principled approximation to fully Bayesian learning.
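As a rough illustration of how one Bayesian learner can look similarity-based in one situation and rule-based in another, the following is a minimal sketch rather than the paper's model. The hypothesis space (a few arithmetic rules plus all integer intervals on 1..100), the prior split, and the likelihood that favors smaller consistent hypotheses are all assumptions made for this example, as are the names `build_hypotheses`, `posterior`, and `generalize`.

```python
# Toy Bayesian concept learner over the integers 1..100.
# Hypothesis space, prior, and likelihood are illustrative assumptions,
# not taken from the abstract above.

UNIVERSE = range(1, 101)


def build_hypotheses():
    """Return a list of (name, members, prior) triples."""
    rules = {
        "even": frozenset(n for n in UNIVERSE if n % 2 == 0),
        "odd": frozenset(n for n in UNIVERSE if n % 2 == 1),
        "powers of 2": frozenset(2 ** k for k in range(1, 7)),
        "multiples of 10": frozenset(range(10, 101, 10)),
    }
    intervals = {
        f"[{lo}, {hi}]": frozenset(range(lo, hi + 1))
        for lo in UNIVERSE
        for hi in range(lo, 101)
    }
    hyps = []
    # Assumed prior: half the mass on rules, half on intervals.
    for name, members in rules.items():
        hyps.append((name, members, 0.5 / len(rules)))
    for name, members in intervals.items():
        hyps.append((name, members, 0.5 / len(intervals)))
    return hyps


HYPOTHESES = build_hypotheses()


def posterior(examples):
    """Posterior over hypotheses consistent with all examples."""
    weighted = []
    for name, members, prior in HYPOTHESES:
        if all(x in members for x in examples):
            # Assumed likelihood: each example drawn uniformly from the
            # hypothesis, so smaller consistent hypotheses score higher.
            likelihood = len(members) ** -len(examples)
            weighted.append((name, members, prior * likelihood))
    total = sum(w for _, _, w in weighted)
    return [(name, members, w / total) for name, members, w in weighted]


def generalize(examples, y):
    """Probability that y belongs to the concept, averaged over hypotheses."""
    return sum(p for _, members, p in posterior(examples) if y in members)


if __name__ == "__main__":
    for examples in ([16], [16, 8, 2, 64]):
        probs = {y: round(generalize(examples, y), 3) for y in (17, 32, 91)}
        print(examples, probs)
```

With a single example, many interval hypotheses remain plausible, so nearby numbers receive more support than distant ones, which resembles a similarity gradient. With several examples that all fit one small rule, that rule takes nearly all of the posterior mass and generalization becomes close to all-or-none.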
Graded Grammaticality in Prediction Fractal Machines
Parfitt, Shan, Tiño, Peter, Dorffner, Georg
We introduce a novel method of constructing language models, which avoids some of the problems associated with recurrent neural networks. The method of creating a Prediction Fractal Machine (PFM) [1] is briefly described and some experiments are presented which demonstrate the suitability of PFMs for language modeling. PFMs distinguish reliably between minimal pairs, and their behavior is consistent with the hypothesis [4] that well-formedness is 'graded', not absolute. A discussion of their potential to offer fresh insights into language acquisition and processing follows.
Perceptual Organization Based on Temporal Dynamics
A figure-ground segregation network is proposed based on a novel boundary pair representation. Nodes in the network are boundary segments obtained through local grouping. Each node is excitatorily coupled with the neighboring nodes that belong to the same region, and inhibitorily coupled with the corresponding paired node. Gestalt grouping rules are incorporated by modulating connections. The status of a node represents its probability of being figural and is updated according to a differential equation.
Robust Recognition of Noisy and Superimposed Patterns via Selective Attention
Lee, Soo-Young, Mozer, Michael C.
In many classification tasks, recognition accuracy is low because input patterns are corrupted by noise or are spatially or temporally overlapping. We propose an approach to overcoming these limitations based on a model of human selective attention. The model, an early selection filter guided by top-down attentional control, entertains each candidate output class in sequence and adjusts attentional gain coefficients in order to produce a strong response for that class. The chosen class is then the one that obtains the strongest response with the least modulation of attention. We present simulation results on classification of corrupted and superimposed handwritten digit patterns, showing a significant improvement in recognition rates.
Effects of Spatial and Temporal Contiguity on the Acquisition of Spatial Information
Ghiselli-Crippa, Thea B., Munro, Paul W.
Spatial information comes in two forms: direct spatial information (for example, retinal position) and indirect temporal contiguity information, since objects encountered sequentially are in general spatially close. The acquisition of spatial information by a neural network is investigated here. Given a spatial layout of several objects, networks are trained on a prediction task. Networks using temporal sequences with no direct spatial information are found to develop internal representations that show distances correlated with distances in the external layout. The influence of spatial information is analyzed by providing direct spatial information to the system during training that is either consistent with the layout or inconsistent with it. This approach allows examination of the relative contributions of spatial and temporal contiguity.
Acquisition in Autoshaping
However, most models have simply ignored these data; the few that have attempted to address them have failed by at least an order of magnitude. We discuss key data on the speed of acquisition, and show how to account for them using a statistically sound model of learning, in which differential reliabilities of stimuli play a crucial role.
1 Introduction
Conditioning experiments probe the ways that animals make predictions about rewards and punishments and how those predictions are used to their advantage. Substantial quantitative data are available as to how pigeons and rats acquire conditioned responses during autoshaping, which is one of the simplest paradigms of classical conditioning.
A Neurodynamical Approach to Visual Attention
The psychophysical evidence for "selective attention" originates mainly from visual search experiments. In this work, we formulate a hierarchical system of interconnected modules consisting of populations of neurons for modeling the underlying mechanisms involved in selective visual attention. We demonstrate that our neural system for visual search works across the visual field in parallel but, owing to its different intrinsic dynamics, can show the two experimentally observed modes of visual attention, namely the serial and the parallel search modes. In other words, neither an explicit model of a focus of attention nor saliency maps are used. The focus of attention appears as an emergent property of the dynamic behavior of the system. The neural population dynamics are handled in the framework of the mean-field approximation. Consequently, the whole process can be expressed as a system of coupled differential equations.
Scale Mixtures of Gaussians and the Statistics of Natural Images
Wainwright, Martin J., Simoncelli, Eero P.
The statistics of photographic images, when represented using multiscale (wavelet) bases, exhibit two striking types of non-Gaussian behavior. First, the marginal densities of the coefficients have extended heavy tails. Second, the joint densities exhibit variance dependencies not captured by second-order models. We examine properties of the class of Gaussian scale mixtures, and show that these densities can accurately characterize both the marginal and joint distributions of natural image wavelet coefficients. This class of model suggests a Markov structure, in which wavelet coefficients are linked by hidden scaling variables corresponding to local image structure. We derive an estimator for these hidden variables, and show that a nonlinear "normalization" procedure can be used to Gaussianize the coefficients.
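The two non-Gaussian signatures described above can be reproduced with synthetic data. The following is a minimal sketch, not the authors' procedure: it draws pairs of coefficients as x = sqrt(z) * u with a shared hidden multiplier z, where the exponential distribution for z is an assumption chosen for simplicity. It also divides by the true sqrt(z) instead of an estimate, which is an oracle shortcut; the abstract describes estimating the hidden variables from the data. The function names are hypothetical.

```python
import math
import random


def sample_gsm_pairs(n, seed=0):
    """Draw n pairs (x1, x2) that share one hidden scale variable z."""
    rng = random.Random(seed)
    xs1, xs2, zs = [], [], []
    for _ in range(n):
        # Assumed mixing density: exponential, with a tiny floor so that
        # the later division by sqrt(z) is always defined.
        z = rng.expovariate(1.0) + 1e-12
        s = math.sqrt(z)
        xs1.append(s * rng.gauss(0.0, 1.0))
        xs2.append(s * rng.gauss(0.0, 1.0))
        zs.append(z)
    return xs1, xs2, zs


def kurtosis(values):
    """Fourth standardized moment (equals 3 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    x1, x2, z = sample_gsm_pairs(50000)
    # Oracle normalization: divide out the true hidden scale.
    u1 = [x / math.sqrt(s) for x, s in zip(x1, z)]
    print("kurtosis of raw coefficients:       ", kurtosis(x1))
    print("kurtosis after normalization:       ", kurtosis(u1))
    print("correlation of raw coefficients:    ", correlation(x1, x2))
    print("correlation of squared coefficients:",
          correlation([a * a for a in x1], [b * b for b in x2]))
```

In this toy setting the raw coefficients are heavy-tailed and nearly uncorrelated, yet their squares are correlated because both inherit the same z. This is the kind of variance dependency that a second-order model cannot express. Dividing out the scale returns the kurtosis to roughly the Gaussian value of 3.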
Optimal Sizes of Dendritic and Axonal Arbors
I consider a topographic projection between two neuronal layers with different densities of neurons. Given the number of output neurons connected to each input neuron (divergence or fan-out) and the number of input neurons synapsing on each output neuron (convergence or fan-in), I determine the widths of axonal and dendritic arbors which minimize the total volume of axons and dendrites. My analytical results can be summarized qualitatively in the following rule: neurons of the sparser layer should have arbors wider than those of the denser layer. This agrees with the anatomical data from retinal and cerebellar neurons whose morphology and connectivity are known. The rule may be used to infer connectivity of neurons from their morphology.