Removing Noise in On-Line Search using Adaptive Batch Sizes

Neural Information Processing Systems

Stochastic (online) learning can be faster than batch learning. However, at late times, the learning rate must be annealed to remove the noise present in the stochastic weight updates. In this annealing phase, the convergence rate (in mean square) is at best proportional to 1/T where T is the number of input presentations. An alternative is to increase the batch size to remove the noise. In this paper we explore convergence for LMS using 1) small but fixed batch sizes and 2) an adaptive batch size. We show that the best adaptive batch schedule is exponential and has a rate of convergence which is the same as for annealing, i.e., at best proportional to 1/T.
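
A minimal sketch of the adaptive-batch idea in Python (the data model, learning rate, and growth factor below are illustrative assumptions, not the paper's exact schedule): rather than annealing the learning rate, the LMS gradient is averaged over a batch whose size grows geometrically, which suppresses the update noise in a comparable way.

import numpy as np

def lms_adaptive_batch(X, y, eta=0.01, b0=1, growth=1.1, n_phases=50):
    """LMS with a batch size that grows exponentially from phase to phase."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    batch = b0
    for _ in range(n_phases):
        idx = rng.integers(0, len(X), size=int(batch))
        err = X[idx] @ w - y[idx]              # prediction errors on the current batch
        w -= eta * X[idx].T @ err / len(idx)   # gradient averaged over the batch
        batch *= growth                        # exponential batch-size schedule (assumed constant)
    return w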



Dynamics of Training

Neural Information Processing Systems

A new method to calculate the full training process of a neural network is introduced. No sophisticated methods such as the replica trick are used. The results are directly related to the actual number of training steps. Some results are presented here, such as the maximal learning rate, an exact description of early stopping, and the necessary number of training steps. Further problems can be addressed with this approach.
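
For context, the textbook stability condition for plain gradient descent on a quadratic cost (a standard bound, not necessarily the paper's exact expression for the maximal learning rate) is

\[
w_{t+1} = w_t - \eta\,\nabla E(w_t), \qquad
E(w) = \tfrac{1}{2}(w - w^{*})^{\top} H (w - w^{*}), \qquad
0 < \eta < \frac{2}{\lambda_{\max}(H)},
\]

where \(\lambda_{\max}(H)\) is the largest eigenvalue of the Hessian; learning rates above this bound make the training dynamics diverge.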


Second-order Learning Algorithm with Squared Penalty Term

Neural Information Processing Systems

This paper compares three penalty terms with respect to the efficiency of supervised learning, using first- and second-order learning algorithms. Our experiments showed that, for a reasonably adequate penalty factor, the combination of the squared penalty term and the second-order learning algorithm drastically improves convergence performance, by more than a factor of 20 over the other combinations, while at the same time yielding better generalization performance.
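
A hedged sketch of the winning combination (the least-squares objective, Gauss-Newton approximation, and penalty factor here are assumptions for illustration): the squared penalty adds 0.5*lam*||w||^2 to the error, contributing both to the gradient and to the curvature used by the second-order step.

import numpy as np

def penalized_second_order_step(J, r, w, lam=1e-3):
    """One Gauss-Newton-style step for E(w) = 0.5*||r||^2 + 0.5*lam*||w||^2,
    where J is the Jacobian of the residuals r with respect to the weights w."""
    g = J.T @ r + lam * w                   # gradient of the penalized error
    H = J.T @ J + lam * np.eye(len(w))      # approximate Hessian plus penalty curvature
    return w - np.linalg.solve(H, g)        # second-order (Newton-like) update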


Are Hopfield Networks Faster than Conventional Computers?

Neural Information Processing Systems

It is shown that conventional computers can be exponentially faster than planar Hopfield networks: although there are planar Hopfield networks that take exponential time to converge, a stable state of an arbitrary planar Hopfield network can be found by a conventional computer in polynomial time.
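
The existence of a stable state rests on the usual Hopfield energy function; the sketch below (an illustrative serial simulation, not the paper's planar polynomial-time construction) shows why asynchronous updates always reach one: each accepted flip can only lower E(s) = -0.5 * s^T W s.

import numpy as np

def hopfield_stable_state(W, s):
    """W: symmetric weights with zero diagonal; s: initial state of +/-1 values.
    Repeatedly apply single-neuron updates until no neuron wants to flip."""
    s = np.asarray(s, dtype=int).copy()
    changed = True
    while changed:
        changed = False
        for i in range(len(s)):
            new = 1 if W[i] @ s >= 0 else -1   # sign of the local field
            if new != s[i]:
                s[i] = new                      # this flip cannot increase the energy
                changed = True
    return s                                    # a stable state of the network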


Gaussian Processes for Bayesian Classification via Hybrid Monte Carlo

Neural Information Processing Systems

The full Bayesian method for applying neural networks to a prediction problem is to set up the prior/hyperprior structure for the net and then perform the necessary integrals. However, these integrals are not tractable analytically, and Markov Chain Monte Carlo (MCMC) methods are slow, especially if the parameter space is high-dimensional. Using Gaussian processes we can approximate the weight space integral analytically, so that only a small number of hyperparameters need be integrated over by MCMC methods. We have applied this idea to classification problems, obtaining excellent results on the real-world problems investigated so far.

1 INTRODUCTION

To make predictions based on a set of training data, fundamentally we need to combine our prior beliefs about possible predictive functions with the data at hand. In the Bayesian approach to neural networks a prior on the weights in the net induces a prior distribution over functions.
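
A minimal sketch of the Gaussian-process view (the squared-exponential kernel and its settings are assumptions; the paper integrates over such hyperparameters with Hybrid Monte Carlo rather than fixing them): the prior over latent function values at the training inputs is a multivariate Gaussian whose covariance is controlled by a handful of hyperparameters, and the latent values are squashed through a sigmoid to give class probabilities.

import numpy as np

def se_covariance(X, lengthscale=1.0, signal_var=1.0, jitter=1e-6):
    """Squared-exponential covariance: K_ij = v * exp(-|x_i - x_j|^2 / (2 l^2))."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2) + jitter * np.eye(len(X))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))                               # toy 1-D inputs
f = rng.multivariate_normal(np.zeros(len(X)), se_covariance(X))    # latent function drawn from the GP prior
p = 1.0 / (1.0 + np.exp(-f))                                       # class-1 probabilities at each input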


Analog VLSI Circuits for Attention-Based, Visual Tracking

Neural Information Processing Systems

A one-dimensional visual tracking chip has been implemented using neuromorphic, analog VLSI techniques to model selective visual attention in the control of saccadic and smooth pursuit eye movements. The chip incorporates focal-plane processing to compute image saliency and a winner-take-all circuit to select a feature for tracking. The target position and direction of motion are reported as the target moves across the array. We demonstrate its functionality in a closed-loop system which performs saccadic and smooth pursuit tracking movements using a one-dimensional mechanical eye.
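
A rough software analogue of the tracking loop (this is not the chip; the edge-based saliency measure and the simple difference-of-winners motion estimate are assumptions): compute a 1-D saliency profile, let a winner-take-all pick the target location, and report its position and direction of motion from frame to frame.

import numpy as np

def track(frames):
    """frames: iterable of 1-D intensity arrays from the photoreceptor row."""
    prev = None
    for frame in frames:
        saliency = np.abs(np.gradient(frame))    # crude image saliency (assumed measure)
        winner = int(np.argmax(saliency))        # winner-take-all selects one feature
        direction = 0 if prev is None else int(np.sign(winner - prev))
        prev = winner
        yield winner, direction                  # reported target position and motion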


Multi-Task Learning for Stock Selection

Neural Information Processing Systems

Artificial Neural Networks can be used to predict future returns of stocks in order to make financial decisions. Should one build a separate network for each stock or share the same network for all the stocks? In this paper we explore these and other alternatives, in which some layers are shared and others are not. When the predictions of future returns for different stocks are viewed as different tasks, sharing some parameters across stocks is a form of multi-task learning. In a series of experiments with Canadian stocks, we obtain yearly returns that are more than 14% above various benchmarks.
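
One of the intermediate sharing schemes can be sketched as follows (layer sizes and the exact split between shared and per-stock parameters are illustrative assumptions, not the paper's architecture): a hidden layer shared by all stocks feeds a separate output head per stock, so the representation is learned jointly while the final prediction stays task-specific.

import numpy as np

class SharedStockNet:
    def __init__(self, n_inputs, n_hidden, n_stocks, seed=0):
        rng = np.random.default_rng(seed)
        self.W_shared = rng.normal(0.0, 0.1, (n_hidden, n_inputs))   # shared across all stocks
        self.heads = rng.normal(0.0, 0.1, (n_stocks, n_hidden))      # one output head per stock

    def predict(self, x, stock):
        h = np.tanh(self.W_shared @ x)    # shared representation (multi-task part)
        return self.heads[stock] @ h      # stock-specific future-return prediction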


Representation and Induction of Finite State Machines using Time-Delay Neural Networks

Neural Information Processing Systems

This work investigates the representational and inductive capabilities of time-delay neural networks (TDNNs) in general, and of two subclasses of TDNN, those with delays only on the inputs (IDNN), and those which include delays on hidden units (HDNN). Both architectures are capable of representing the same class of languages, the definite memory machine (DMM) languages, but the delays on the hidden units in the HDNN help it outperform the IDNN on problems composed of repeated features over short time windows.
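
The structural difference can be sketched as follows (window lengths and layer sizes are arbitrary assumptions): in an IDNN the output depends only on a window of delayed inputs, while in an HDNN a delay line of past hidden activations also feeds forward.

import numpy as np

def idnn_output(x_window, W_in, w_out):
    """IDNN: output computed from a window of delayed inputs only."""
    return w_out @ np.tanh(W_in @ x_window)

def hdnn_output(x_window, h_window, W_in, W_hid, w_out):
    """HDNN: a delay line of past hidden activations also contributes."""
    h = np.tanh(W_in @ x_window)                               # current hidden activation
    h_window = np.concatenate([h, h_window])[:len(h_window)]   # shift the hidden delay line
    return w_out @ np.tanh(W_hid @ h_window), h_window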


LSTM can Solve Hard Long Time Lag Problems

Neural Information Processing Systems

Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show that problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of.
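
The random-guessing baseline can be sketched as follows (network size, weight range, and the success test are assumptions; the point is only that such blind search already beats several published algorithms on their own benchmark problems):

import numpy as np

def random_guess_search(evaluate, n_weights, max_trials=100_000, scale=1.0, seed=0):
    """evaluate(w) -> True if the weight vector w solves the benchmark task."""
    rng = np.random.default_rng(seed)
    for trial in range(1, max_trials + 1):
        w = rng.uniform(-scale, scale, n_weights)   # guess every weight of the net at once
        if evaluate(w):
            return w, trial                         # solved after `trial` random guesses
    return None, max_trials                         # no solution found within the budget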