Goto


Removing Noise in On-Line Search using Adaptive Batch Sizes

Neural Information Processing Systems

Stochastic (online) learning can be faster than batch learning. However, at late times, the learning rate must be annealed to remove the noise present in the stochastic weight updates. In this annealing phase, the convergence rate (in mean square) is at best proportional to 1/T, where T is the number of input presentations. An alternative is to increase the batch size to remove the noise. In this paper we explore convergence for LMS using 1) small but fixed batch sizes and 2) an adaptive batch size. We show that the best adaptive batch schedule is exponential and has a rate of convergence which is the same as for annealing, i.e., at best proportional to 1/T.

1 Introduction

Stochastic (online) learning can speed learning over batch training, particularly when data sets are large and contain redundant information [Møl93]. However, at late times in learning, noise present in the weight updates prevents complete convergence from taking place. To reduce the noise, the learning rate is slowly decreased (annealed) at late times. The optimal annealing schedule is asymptotically proportional to 1/t, where t is the iteration [Gol87, LO93, Orr95].
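Below is a minimal sketch (not the paper's code) of the two noise-removal strategies the abstract contrasts, assuming a synthetic linear LMS problem; the data model, learning rates, and growth factor are illustrative choices rather than the paper's settings.

```python
# Sketch: LMS on a synthetic linear problem, comparing an annealed learning
# rate with an exponentially growing batch size. Constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)

def sample_batch(n):
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.1 * rng.normal(size=n)
    return X, y

def lms_annealed(T=20000, eta0=0.1, t0=100.0):
    # Stochastic LMS with a learning rate annealed roughly like 1/t at late times.
    w = np.zeros(d)
    for t in range(1, T + 1):
        X, y = sample_batch(1)
        eta = eta0 * t0 / (t0 + t)
        w += eta * X[0] * (y[0] - X[0] @ w)
    return w

def lms_adaptive_batch(presentations=20000, eta=0.1, growth=1.05):
    # Fixed learning rate; batch size grows geometrically to average out noise.
    w = np.zeros(d)
    n, used = 1.0, 0
    while used < presentations:
        m = max(1, int(n))
        X, y = sample_batch(m)
        w += eta * X.T @ (y - X @ w) / m   # mini-batch LMS step
        used += m
        n *= growth
    return w

for name, w in [("annealed", lms_annealed()), ("adaptive batch", lms_adaptive_batch())]:
    print(name, "squared error:", np.sum((w - w_true) ** 2))
```

With the exponentially growing batch, averaging over each mini-batch plays the role that annealing plays in the purely stochastic version: the per-update noise shrinks while the learning rate stays fixed.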


Adaptive On-line Learning in Changing Environments

Neural Information Processing Systems

An adaptive online algorithm extending the learning of learning idea is proposed and theoretically motivated. Relying only on gradient flow information, it can be applied to learning continuous functions or distributions, even when no explicit loss function is given and the Hessian is not available. Its efficiency is demonstrated for a non-stationary blind separation task of acoustic signals.
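The following is a rough sketch of the general idea only: adapt the step size from a leaky average of recent stochastic gradients (the "gradient flow"), with no Hessian and no explicit loss value. The specific update rule and all constants are assumptions for illustration and may differ from the paper's exact algorithm.

```python
import numpy as np

def adaptive_online(grad_fn, w, steps=2000, eta=0.05,
                    delta=0.05, alpha=0.02, beta=0.2):
    """Adapt the learning rate from a leaky average of stochastic gradients."""
    r = np.zeros_like(w)                       # smoothed "gradient flow"
    for _ in range(steps):
        g = grad_fn(w)                         # noisy gradient at current w
        r = (1.0 - delta) * r + delta * g      # leaky average of gradients
        # Persistent drift (large |r|) raises eta; pure noise (small |r|) lowers it.
        eta += alpha * eta * (beta * np.linalg.norm(r) - eta)
        w = w - eta * g
    return w, eta

# Tiny demo on a noisy quadratic around a fixed target (a stand-in for a
# slowly changing environment).
rng = np.random.default_rng(0)
w, eta = adaptive_online(lambda w: (w - 3.0) + rng.normal(scale=0.5), np.zeros(1))
print(w, eta)
```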


Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient

Neural Information Processing Systems

The parameter space of neural networks has a Riemannian metric structure. The natural Riemannian gradient should be used instead of the conventional gradient, since the former denotes the true steepest descent direction of a loss function in the Riemannian space. The stochastic gradient learning algorithm is much more effective if the natural gradient is used. The present paper studies the information-geometrical structure of perceptrons and other networks, and proves that the online learning method based on the natural gradient is asymptotically as efficient as the optimal batch algorithm. Adaptive modification of the learning constant is proposed, analyzed in terms of the Riemannian measure, and shown to be efficient.
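A minimal sketch of a natural-gradient step, in which the ordinary gradient is preconditioned by the inverse Fisher information matrix, illustrated here on logistic regression; the data, step size, and damping term are illustrative assumptions, not taken from the paper.

```python
# Sketch: natural-gradient descent, theta <- theta - eta * G^{-1} * grad,
# with G the Fisher information, on a small logistic regression problem.
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

w = np.zeros(d)
eta = 0.5
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / n                           # gradient of mean NLL
    fisher = (X * (p * (1 - p))[:, None]).T @ X / n    # Fisher information matrix
    w -= eta * np.linalg.solve(fisher + 1e-6 * np.eye(d), grad)  # natural step
print("estimate:", w)
```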


Calendar of Events

AI Magazine

Autonomous agents are computer systems that are capable of independent action in dynamic, unpredictable environments. Agents are also one of the most important and exciting areas of research and development in computer science today. Agents are currently being applied in domains as diverse as computer games and interactive cinema, information retrieval and filtering, user interface design, and industrial process control. Agents '98 will build on the enormous success of the First International Conference on Autonomous Agents (Agents '97), held in Marina del Rey in February 1997. The conference welcomes submissions of original, high quality papers and videos with summaries concerning autonomous agents in a variety of embodiments and playing a variety of roles in their environments.


Kansas State's Slick Willie Robot Software

AI Magazine

Robotics Team 1 from Kansas State University was the team that perfectly completed the Office Navigation event in the shortest time at the fifth Annual AAAI Mobile Robot Competition and Exhibition, held as part of the Thirteenth National Conference on Artificial Intelligence. The team, consisting of Michael Novak and Darrel Fossett, developed its code in an undergraduate software-engineering course. Its C code used multiple threads to provide separate autonomous agents to solve the meeting scheduling task, control the sonar sensors, and control the actual robot motion. The team's robot software was nicknamed SLICK WILLIE for the way it gracefully moved through doorways and around obstacles.
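A hypothetical sketch of the decomposition described above, written in Python rather than the team's C: one thread per agent for meeting scheduling, sonar monitoring, and motion control, coordinating through a shared queue. All names and the queue-based coordination are assumptions for illustration.

```python
# Illustrative only: three concurrent threads, one per autonomous agent,
# communicating through a shared command queue.
import queue
import threading
import time

commands = queue.Queue()

def scheduler():
    # Decide where the robot should go next and post motion commands.
    for room in ["room-A", "room-B"]:
        commands.put(("goto", room))
        time.sleep(0.1)
    commands.put(("stop", None))

def sonar():
    # Poll sonar sensors for obstacles (stubbed).
    for _ in range(5):
        time.sleep(0.05)

def motion():
    # Consume commands and drive the robot (stubbed as prints).
    while True:
        cmd, arg = commands.get()
        if cmd == "stop":
            break
        print(f"moving to {arg}")

threads = [threading.Thread(target=f) for f in (scheduler, sonar, motion)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```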


Yoda: The Young Observant Discovery Agent

AI Magazine

The YODA Robot Project at the University of Southern California/Information Sciences Institute consists of a group of young researchers who share a passion for autonomous systems that can bootstrap their knowledge from real environments by exploration, experimentation, learning, and discovery. Our goal is to create a mobile agent that can autonomously learn from its environment based on its own actions, percepts, and missions. Our participation in the Fifth Annual AAAI Mobile Robot Competition and Exhibition, held as part of the Thirteenth National Conference on Artificial Intelligence, served as the first milestone in advancing us toward this goal. YODA's software architecture is a hierarchy of abstraction layers, ranging from a set of behaviors at the bottom layer to a dynamic, mission-oriented planner at the top. The planner uses a map of the environment to determine a sequence of goals to be accomplished by the robot and delegates the detailed execution to the set of behaviors at the lower layer. This abstraction architecture has proven robust in dynamic and noisy environments, as shown by YODA's performance at the robot competition.
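An illustrative sketch of the layered idea described above: a mission planner turns a map into a sequence of goals and delegates detailed execution to bottom-layer behaviors. All class and method names are hypothetical and are not YODA's actual code.

```python
class Behavior:
    """Low-level behavior, e.g. 'go through doorway' or 'follow corridor'."""
    def execute(self, goal):
        print(f"executing behavior toward {goal}")

class MissionPlanner:
    def __init__(self, env_map, behaviors):
        self.env_map = env_map          # map of rooms and corridors
        self.behaviors = behaviors      # bottom-layer behaviors

    def plan(self, mission):
        # Derive an ordered list of goals from the map (stubbed here).
        return self.env_map.get(mission, [])

    def run(self, mission):
        for goal in self.plan(mission):
            self.behaviors.execute(goal)   # delegate detailed execution

planner = MissionPlanner({"deliver-message": ["room-12", "room-7"]}, Behavior())
planner.run("deliver-message")
```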


Adaptive Back-Propagation in On-Line Learning of Multilayer Networks

Neural Information Processing Systems

This research has been motivated by the dominance of the suboptimal symmetric phase in online learning of two-layer feedforward networks trained by gradient descent [2]. This trapping is most pronounced for inappropriately small learning rates but exists in all training scenarios, affecting the learning process considerably. We proposed an adaptive back-propagation training algorithm [Eq.
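For context, a minimal sketch of the setting in which the symmetric phase arises: online gradient descent for a two-layer soft-committee network learning from a matched teacher. The backward-pass gain `beta` is an assumed, illustrative knob for modifying back-propagation; it is not the paper's exact adaptive rule.

```python
# Sketch: on-line gradient descent for a soft committee machine (sum of erf
# hidden units) learning from a teacher of the same architecture.
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(3)
N, K = 100, 3                                 # input dimension, hidden units
B = rng.normal(size=(K, N)) / np.sqrt(N)      # teacher weights
J = rng.normal(size=(K, N)) * 0.001           # student weights (near-symmetric start)

def g(x):        # hidden-unit activation
    return erf(x / np.sqrt(2.0))

def g_prime(x):  # its derivative
    return np.sqrt(2.0 / np.pi) * np.exp(-x * x / 2.0)

eta, beta = 0.5 / N, 1.0   # beta != 1 would alter the backward pass (assumed knob)
for _ in range(50000):
    xi = rng.normal(size=N)
    x = J @ xi                          # student hidden fields
    y = B @ xi                          # teacher hidden fields
    error = g(y).sum() - g(x).sum()     # teacher output minus student output
    # Backward pass: each hidden unit's update uses g'(beta * x_i).
    J += eta * error * g_prime(beta * x)[:, None] * xi[None, :]
```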


On-line Learning of Dichotomies

Neural Information Processing Systems

The performance of online algorithms for learning dichotomies is studied. In online learning, the number of examples P is equivalent to the learning time, since each example is presented only once. The learning curve, or generalization error as a function of P, depends on the schedule at which the learning rate is lowered.
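A minimal sketch, assuming a teacher-perceptron dichotomy with Gaussian inputs (an assumption, not necessarily the paper's setting): each of the P examples is presented once, the learning rate is lowered with time, and the generalization error is read off from the angle between student and teacher weight vectors.

```python
# Sketch: online learning of a dichotomy defined by a teacher perceptron,
# with a learning rate that is lowered over the P single-pass examples.
import numpy as np

rng = np.random.default_rng(2)
d = 50
teacher = rng.normal(size=d)
teacher /= np.linalg.norm(teacher)

def gen_error(w):
    # For spherical Gaussian inputs, P(sign mismatch) = angle(w, teacher) / pi.
    cos = w @ teacher / (np.linalg.norm(w) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi

w = rng.normal(size=d) * 0.01
P = 20000
for t in range(1, P + 1):
    x = rng.normal(size=d)
    label = np.sign(teacher @ x)
    eta = 1.0 / np.sqrt(t)            # one possible schedule for lowering the rate
    if np.sign(w @ x) != label:       # perceptron-style update on mistakes
        w += eta * label * x
print("generalization error after P examples:", gen_error(w))
```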