AITopics

We study online generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. At the final layer we allow transfer functions, such as the softmax function, that need to consider the linear activations to all the output neurons. We use distance functions of a certain kind in two completely independent roles in deriving and analyzing online learning algorithms for such tasks. We use one distance function to define a matching loss function for the (possibly multidimensional) transfer function, which allows us to generalize earlier results from one-dimensional to multidimensional outputs. We use another distance function as a tool for measuring progress made by the online updates. This shows how previously studied algorithms such as gradient descent and exponentiated gradient fit into a common framework. We evaluate the performance of the algorithms using relative loss bounds that compare the loss of the online algorithm to that of the best off-line predictor from the relevant model class, thus completely eliminating probabilistic assumptions about the data.
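The online protocol described above (predict, incur a loss, update) can be illustrated with a small sketch. It is not taken from the paper: it assumes the softmax transfer function paired with the relative entropy as its loss, uses the plain gradient descent update only, and the names `online_gd`, `eta`, and the zero initial weights are hypothetical choices made for illustration.

```python
import math

def softmax(a):
    """Numerically stable softmax of a list of linear activations."""
    m = max(a)
    exps = [math.exp(v - m) for v in a]
    total = sum(exps)
    return [e / total for e in exps]

def relative_entropy(y, p):
    """Relative entropy sum_i y_i ln(y_i / p_i), with 0 ln 0 taken as 0."""
    return sum(yi * math.log(yi / pi) for yi, pi in zip(y, p) if yi > 0)

def online_gd(examples, n_out, n_in, eta=0.5):
    """One online pass: predict, record the loss, then take a gradient step.

    Assumption: each target y is a probability vector (sums to 1), so the
    gradient of the loss with respect to W[i][j] is (p[i] - y[i]) * x[j].
    """
    W = [[0.0] * n_in for _ in range(n_out)]
    losses = []
    for x, y in examples:
        a = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        p = softmax(a)
        losses.append(relative_entropy(y, p))
        for i in range(n_out):
            for j in range(n_in):
                W[i][j] -= eta * (p[i] - y[i]) * x[j]
    return W, losses

if __name__ == "__main__":
    data = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])] * 20
    _, losses = online_gd(data, n_out=2, n_in=2)
    print(losses[0], losses[-1])
```

With zero initial weights and two outputs, the first prediction is uniform, so the first recorded loss equals ln 2; on this toy stream the per-trial loss then shrinks as the weights adapt.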

algorithm, artificial intelligence, neural network, (17 more...)

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

A Neural Network Model of Naive Preference and Filial Imprinting in the Domestic Chick

Hadden, Lucy E.

Filial imprinting in domestic chicks is of interest in psychology, biology, and computational modeling because it exemplifies simple, rapid, innately programmed learning which is biased toward learning about some objects. Horn et al. have recently discovered a naive visual preference for heads and necks which develops over the course of the first three days of life. The neurological basis of this predisposition is almost entirely unknown; that of imprinting-related learning is fairly clear. This project is the first model of the predisposition consistent with what is known about learning in imprinting. The model develops the predisposition appropriately, learns to "approach" a training object, and replicates one interaction between the two processes. Future work will replicate more interactions between imprinting and the predisposition in chicks, and analyze why the system works.

artificial intelligence, neural network, predisposition, (16 more...)

Country: North America > United States > California > San Diego County (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ryan, Jake, Lin, Meng-Jang, Miikkulainen, Risto

Intrusion Detection with Neural Networks

Intrusion detection schemes can be classified into two categories: misuse and anomaly intrusion detection. Misuse refers to known attacks that exploit the known vulnerabilities of the system. Anomaly means unusual activity in general that could indicate an intrusion.

law enforcement, neural network, vector, (18 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.16)
North America > United States > Colorado > Boulder County > Boulder (0.14)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Hopfield, John J., Brody, Carlos D., Roweis, Sam

Computing with Action Potentials

Most computational engineering based loosely on biology uses continuous variables to represent neural activity. Yet most neurons communicate with action potentials. The engineering view is equivalent to using a rate code for representing information and for computing. An increasing number of examples are being discovered in which biology may not be using rate codes. Information can be represented using the timing of action potentials, and computed with efficiently in this representation.

action potential, artificial intelligence, machine learning, (16 more...)

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

EM Algorithms for PCA and SPCA

Roweis, Sam T.

I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large collections of high dimensional data. It is computationally very efficient in space and time.
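The abstract does not give the update equations, so the following is only a hedged sketch of how an EM-style iteration for PCA can look. It assumes the zero-noise alternating form restricted to a single component: the E-step projects each centered point onto the current direction, and the M-step refits the direction by least squares. The function name `em_pca_first_component` and its parameters are hypothetical.

```python
import math
import random

def em_pca_first_component(rows, n_iter=100, seed=0):
    """Estimate the leading principal direction of `rows` (a list of equal-length lists)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    Y = [[r[j] - mean[j] for j in range(d)] for r in rows]  # centered data
    rng = random.Random(seed)
    c = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random initial direction
    for _ in range(n_iter):
        cc = sum(v * v for v in c)
        # E-step: latent coordinate of each point, x_i = (c . y_i) / (c . c)
        x = [sum(cj * yj for cj, yj in zip(c, y)) / cc for y in Y]
        xx = sum(v * v for v in x)
        # M-step: refit the direction, c = sum_i x_i y_i / sum_i x_i^2
        c = [sum(xi * y[j] for xi, y in zip(x, Y)) / xx for j in range(d)]
    norm = math.sqrt(sum(v * v for v in c))
    return [v / norm for v in c]  # unit vector; its sign is arbitrary

if __name__ == "__main__":
    points = [[float(t), 2.0 * t] for t in range(-5, 6)]
    print(em_pca_first_component(points))
```

Each iteration touches the data only through inner products with the current direction, so no covariance matrix is formed, which is in keeping with the space efficiency the abstract mentions. Extending the sketch to several components would replace the scalar divisions with small matrix solves.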

algorithm, artificial intelligence, machine learning, (15 more...)

Country:

North America > United States (0.28)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Bengio, Yoshua, Bengio, Samy, Isabelle, Jean-Franc, Singer, Yoram

Shared Context Probabilistic Transducers

Recently, a model for supervised learning of probabilistic transducers represented by suffix trees was introduced. However, this algorithm tends to build very large trees, requiring very large amounts of computer memory. In this paper, we propose a new, more compact, transducer model in which one shares the parameters of distributions associated with contexts yielding similar conditional output distributions. We illustrate the advantages of the proposed algorithm with comparative experiments on inducing a noun phrase recognizer.

artificial intelligence, natural language, transducer, (16 more...)

Country:

North America > United States (0.29)
North America > Canada > Quebec (0.15)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Modelling Seasonality and Trends in Daily Rainfall Data

Williams, Peter M.

Peter M. Williams, School of Cognitive and Computing Sciences, University of Sussex, Falmer, Brighton BN1 9QH, UK. Email: peterw@cogs.susx.ac.uk

Abstract: This paper presents a new approach to the problem of modelling daily rainfall using neural networks. We first model the conditional distributions of rainfall amounts, in such a way that the model itself determines the order of the process, and the time-dependent shape and scale of the conditional distributions. After integrating over particular weather patterns, we are able to extract seasonal variations and long-term trends.

1 Introduction: Analysis of rainfall data is important for many agricultural, ecological and engineering activities. Design of irrigation and drainage systems, for instance, needs to take account not only of mean expected rainfall, but also of rainfall volatility. Estimates of crop yields also depend on the distribution of rainfall during the growing season, as well as on the overall amount.

artificial intelligence, modelling seasonality and trend, neural network, (15 more...)

Country: Europe > United Kingdom > England (0.28)

Industry: Food & Agriculture > Agriculture (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Movellan, Javier R., Mineiro, Paul

Bayesian Robustification for Audio Visual Fusion

Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92092-0515

Abstract: We discuss the problem of catastrophic fusion in multimodal recognition systems. This problem arises in systems that need to fuse different channels in non-stationary environments. Practice shows that when recognition modules within each modality are tested in contexts inconsistent with their assumptions, their influence on the fused product tends to increase, with catastrophic results. We explore a principled solution to this problem based upon Bayesian ideas of competitive models and inference robustification: each sensory channel is provided with simple white-noise context models, and the perceptual hypothesis and context are jointly estimated. Consequently, context deviations are interpreted as changes in white noise contamination strength, automatically adjusting the influence of the module. The approach is tested on a fixed lexicon automatic audiovisual speech recognition problem with very good results.

1 Introduction: In this paper we address the problem of catastrophic fusion in automatic multimodal recognition systems.

artificial intelligence, bayesian inference, fusion, (17 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.25)
North America > United States > California > San Diego County > La Jolla (0.25)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Milostan, Jeanne C., Cottrell, Garrison W.

Serial Order in Reading Aloud: Connectionist Models and Neighborhood Structure

Besides averaging over the 30 trials per condition, each mean of these charts also averages over the two input distribution conditions and the linear and quadratic function conditions, as these four cases are frequently observed violations of the statistical assumptions in nonlinear function approximation with locally linear models. In Figure 1b the number of factors equals the underlying dimensionality of the problem, and all algorithms perform essentially equally well. For perfectly Gaussian distributions in all random variables (not shown separately), LWFA's assumptions are perfectly fulfilled and it achieves the best results, closely followed, almost indistinguishably, by LWPLS. For the "unequal noise condition", the two PCA-based techniques, LWPCA and LWPCR, perform the worst since, as expected, they choose suboptimal projections. However, when the statistical assumptions are violated, LWFA loses part of its advantage, such that the summary results become fairly balanced in Figure 1b. The quality of function fitting changes significantly when the number of factors departs from the correct one, as illustrated in Figure 1a,c.

artificial intelligence, neural network, pronunciation, (17 more...)

Country: North America > United States > California > San Diego County (0.14)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Vasconcelos, Nuno, Lippman, Andrew

Multiresolution Tangent Distance for Affine-invariant Classification

The ability to rely on similarity metrics invariant to image transformations is an important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter (the tangent distance) and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multiresolution tangent distance, which exhibits significantly higher invariance to image transformations, and can be easily combined with robust estimation procedures.
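To make the idea of a tangent distance concrete, here is a minimal, hedged sketch. It is not the authors' method: it is single-resolution, works on 1-D signals rather than images, and considers only one transformation (a small shift) whose tangent vector is approximated by central differences. The function names are hypothetical. The one-sided tangent distance is the squared distance from `y` to the line `x + alpha * t`, which has a closed-form minimizer.

```python
def translation_tangent(x):
    """Central-difference derivative of a 1-D signal: the tangent vector for small shifts."""
    n = len(x)
    return [(x[min(i + 1, n - 1)] - x[max(i - 1, 0)]) / 2.0 for i in range(n)]

def one_sided_tangent_distance(x, y):
    """Return (squared distance from y to the line x + alpha * t, minimizing alpha)."""
    t = translation_tangent(x)
    tt = sum(v * v for v in t)
    diff = [yi - xi for xi, yi in zip(x, y)]
    # Least-squares coefficient along the tangent; 0 if the tangent vanishes.
    alpha = sum(ti * di for ti, di in zip(t, diff)) / tt if tt > 0 else 0.0
    residual = [di - alpha * ti for di, ti in zip(diff, t)]
    return sum(r * r for r in residual), alpha

if __name__ == "__main__":
    x = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0]
    y = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]  # x shifted right by one sample
    euclid = sum((a - b) ** 2 for a, b in zip(x, y))
    print(euclid, one_sided_tangent_distance(x, y))
```

Because `alpha = 0` recovers the ordinary squared Euclidean distance, the tangent distance can never exceed it. The linear approximation holds only for small transformations, which is one way to read the abstract's motivation for evaluating the distance at several resolutions.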

artificial intelligence, image understanding, transformation, (19 more...)