Relative Loss Bounds for Multidimensional Regression Problems
Kivinen, Jyrki, Warmuth, Manfred K.
We study online generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. We allow at the final layer transfer functions such as the softmax function that need to consider the linear activations to all the output neurons. We use distance functions of a certain kind in two completely independent roles in deriving and analyzing online learning algorithms for such tasks. We use one distance function to define a matching loss function for the (possibly multidimensional) transfer function, which allows us to generalize earlier results from one-dimensional to multidimensional outputs. We use another distance function as a tool for measuring progress made by the online updates. This shows how previously studied algorithms such as gradient descent and exponentiated gradient fit into a common framework. We evaluate the performance of the algorithms using relative loss bounds that compare the loss of the online algorithm to the best off-line predictor from the relevant model class, thus completely eliminating probabilistic assumptions about the data.
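The two update families named in the abstract can be sketched for a single-layer softmax model. This is an illustration under stated assumptions, not the authors' exact algorithms: the matching loss for softmax is taken to be the relative entropy between the target and predicted distributions, the exponentiated gradient variant renormalizes each weight row to sum to one, and all function names and the learning rate `eta` are hypothetical.

```python
import math

def softmax(a):
    # Numerically stable softmax over a list of activations.
    m = max(a)
    e = [math.exp(v - m) for v in a]
    s = sum(e)
    return [v / s for v in e]

def predict(W, x):
    # One weight row per output node, no hidden layer.
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def matching_loss(y, p):
    # Relative entropy between target distribution y and prediction p
    # (assumed here to be the matching loss for softmax).
    return sum(yi * math.log(yi / pi) for yi, pi in zip(y, p) if yi > 0)

def gd_step(W, x, y, eta):
    # Gradient descent: additive update. The gradient of the loss with
    # respect to W[k][i] is (p[k] - y[k]) * x[i].
    p = predict(W, x)
    return [[w - eta * (pk - yk) * xi for w, xi in zip(row, x)]
            for row, pk, yk in zip(W, p, y)]

def eg_step(W, x, y, eta):
    # Exponentiated gradient: multiplicative update followed by
    # per-row renormalization (the normalization is an assumption).
    p = predict(W, x)
    new_W = []
    for row, pk, yk in zip(W, p, y):
        r = [w * math.exp(-eta * (pk - yk) * xi) for w, xi in zip(row, x)]
        s = sum(r)
        new_W.append([v / s for v in r])
    return new_W

W0 = [[0.5, 0.5], [0.5, 0.5]]
x, y = [1.0, 0.0], [1.0, 0.0]
loss_before = matching_loss(y, predict(W0, x))
loss_gd = matching_loss(y, predict(gd_step(W0, x, y, 0.5), x))
loss_eg = matching_loss(y, predict(eg_step(W0, x, y, 0.5), x))
```

Both steps use the same gradient and differ only in how it is applied to the weights, which is the sense in which the two algorithms fit a common framework.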
A Neural Network Model of Naive Preference and Filial Imprinting in the Domestic Chick
Filial imprinting in domestic chicks is of interest in psychology, biology, and computational modeling because it exemplifies simple, rapid, innately programmed learning which is biased toward learning about some objects. Horn et al. have recently discovered a naive visual preference for heads and necks which develops over the course of the first three days of life. The neurological basis of this predisposition is almost entirely unknown; that of imprinting-related learning is fairly clear. This project is the first model of the predisposition consistent with what is known about learning in imprinting. The model develops the predisposition appropriately, learns to "approach" a training object, and replicates one interaction between the two processes. Future work will replicate more interactions between imprinting and the predisposition in chicks, and analyze why the system works.
Computing with Action Potentials
Hopfield, John J., Brody, Carlos D., Roweis, Sam
Most computational engineering based loosely on biology uses continuous variables to represent neural activity. Yet most neurons communicate with action potentials. The engineering view is equivalent to using a rate code for representing information and for computing. An increasing number of examples are being discovered in which biology may not be using rate codes. Information can be represented using the timing of action potentials, and computed with efficiently in this representation.
Shared Context Probabilistic Transducers
Bengio, Yoshua, Bengio, Samy, Isabelle, Jean-Franc, Singer, Yoram
Recently, a model for supervised learning of probabilistic transducers represented by suffix trees was introduced. However, this algorithm tends to build very large trees, requiring very large amounts of computer memory. In this paper, we propose a new, more compact, transducer model in which one shares the parameters of distributions associated to contexts yielding similar conditional output distributions. We illustrate the advantages of the proposed algorithm with comparative experiments on inducing a noun phrase recognizer.
Modelling Seasonality and Trends in Daily Rainfall Data
Peter M. Williams, School of Cognitive and Computing Sciences, University of Sussex, Falmer, Brighton BN1 9QH, UK. Email: peterw@cogs.susx.ac.uk
This paper presents a new approach to the problem of modelling daily rainfall using neural networks. We first model the conditional distributions of rainfall amounts, in such a way that the model itself determines the order of the process, and the time-dependent shape and scale of the conditional distributions. After integrating over particular weather patterns, we are able to extract seasonal variations and long-term trends.
1 Introduction
Analysis of rainfall data is important for many agricultural, ecological and engineering activities. Design of irrigation and drainage systems, for instance, needs to take account not only of mean expected rainfall, but also of rainfall volatility. Estimates of crop yields also depend on the distribution of rainfall during the growing season, as well as on the overall amount.
Bayesian Robustification for Audio Visual Fusion
Movellan, Javier R., Mineiro, Paul
Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92092-0515
We discuss the problem of catastrophic fusion in multimodal recognition systems. This problem arises in systems that need to fuse different channels in non-stationary environments. Practice shows that when recognition modules within each modality are tested in contexts inconsistent with their assumptions, their influence on the fused product tends to increase, with catastrophic results. We explore a principled solution to this problem based upon Bayesian ideas of competitive models and inference robustification: each sensory channel is provided with simple white-noise context models, and the perceptual hypothesis and context are jointly estimated. Consequently, context deviations are interpreted as changes in white noise contamination strength, automatically adjusting the influence of the module. The approach is tested on a fixed lexicon automatic audiovisual speech recognition problem with very good results.
1 Introduction
In this paper we address the problem of catastrophic fusion in automatic multimodal recognition systems.
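The effect described in the abstract can be illustrated with a toy two-channel classifier. This is a minimal sketch under strong assumptions, not the authors' system: each class is a fixed template vector per channel, the context model is isotropic Gaussian white noise, and the noise variance is estimated jointly with the class by maximum likelihood. The templates, observations, and function names are all hypothetical.

```python
import math

def sq_dist(o, mu):
    return sum((a - b) ** 2 for a, b in zip(o, mu))

def fixed_score(o, mu, var=1.0):
    # Log-likelihood (up to a constant) with a fixed, assumed noise variance.
    return -sq_dist(o, mu) / (2.0 * var)

def robust_score(o, mu, floor=1e-3):
    # Log-likelihood (up to a constant) after maximizing over the white-noise
    # variance: the ML estimate is the mean squared deviation from the template.
    n = len(o)
    var_hat = max(sq_dist(o, mu) / n, floor)
    return -0.5 * n * math.log(var_hat)

def fuse(observations, templates, score):
    # Sum per-channel scores for each class and return the best class.
    totals = {}
    for label in templates[0]:
        totals[label] = sum(score(o, ch[label])
                            for o, ch in zip(observations, templates))
    return max(totals, key=totals.get), totals

# Two channels (say audio and video), two classes "A" and "B".
templates = [
    {"A": [0.0, 0.0], "B": [1.0, 1.0]},  # audio templates
    {"A": [0.0, 0.0], "B": [1.0, 1.0]},  # video templates
]
# Audio is clean and close to "A"; video is heavily contaminated.
observations = [[0.1, 0.1], [10.0, 11.0]]

fixed_label, _ = fuse(observations, templates, fixed_score)
robust_label, _ = fuse(observations, templates, robust_score)
```

With a fixed variance, the contaminated channel produces large score differences and dominates the decision. When the variance is estimated, a large deviation from every template is read as a high noise level, so that channel's scores flatten and its influence shrinks.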
Serial Order in Reading Aloud: Connectionist Models and Neighborhood Structure
Milostan, Jeanne C., Cottrell, Garrison W.
Besides averaging over the 30 trials per condition, each mean of these charts also averages over the two input distribution conditions and the linear and quadratic function condition, as these four cases are frequently observed violations of the statistical assumptions in nonlinear function approximation with locally linear models. In Figure 1b the number of factors equals the underlying dimensionality of the problem, and all algorithms perform essentially equally well. For perfectly Gaussian distributions in all random variables (not shown separately), LWFA's assumptions are perfectly fulfilled and it achieves the best results, closely followed, almost indistinguishably, by LWPLS. For the "unequal noise condition", the two PCA-based techniques, LWPCA and LWPCR, perform the worst since, as expected, they choose suboptimal projections. However, when the statistical assumptions are violated, LWFA loses part of its advantage, such that the summary results become fairly balanced in Figure 1b. The quality of function fitting changes significantly when the number of factors is incorrect, as illustrated in Figure 1a,c.
Multiresolution Tangent Distance for Affine-invariant Classification
Vasconcelos, Nuno, Lippman, Andrew
The ability to rely on similarity metrics invariant to image transformations is an important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter - the tangent distance - and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multiresolution tangent distance, which exhibits significantly higher invariance to image transformations, and can be easily combined with robust estimation procedures.
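The basic quantity the abstract builds on can be sketched in one dimension. This is a minimal illustration under simplifying assumptions, not the multiresolution method itself: signals are 1-D lists rather than images, the only transformation is a small translation, its tangent vector is approximated by a central finite difference, and the distance is one-sided (only `x` is transformed). All function names are hypothetical, and the coarse-to-fine iteration is not shown.

```python
import math

def tangent_vector(x):
    # Finite-difference derivative of the signal; to first order, a small
    # shift changes x by a multiple of this vector.
    n = len(x)
    return [(x[min(i + 1, n - 1)] - x[max(i - 1, 0)]) / 2.0 for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def euclidean_sq(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def tangent_distance_sq(x, y):
    # One-sided tangent distance: minimize ||x + alpha * t - y||^2 over alpha,
    # where t is the tangent vector of x. With a single tangent vector the
    # minimizer has a closed form (projection of y - x onto t).
    t = tangent_vector(x)
    d = [b - a for a, b in zip(x, y)]
    tt = dot(t, t)
    alpha = dot(d, t) / tt if tt > 0 else 0.0
    r = [di - alpha * ti for di, ti in zip(d, t)]
    return dot(r, r), alpha

# A smooth bump and a copy shifted by half a sample.
x = [math.exp(-((i - 10.0) / 3.0) ** 2) for i in range(21)]
y = [math.exp(-((i - 10.5) / 3.0) ** 2) for i in range(21)]

td, alpha = tangent_distance_sq(x, y)
eu = euclidean_sq(x, y)
```

Because the linear approximation only holds for small transformations, larger shifts require iterating this step, which is where convergence to local minima can arise.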