Diffusion of Context and Credit Information in Markovian Models

Journal of Artificial Intelligence Research

This paper studies the problem of ergodicity of transition probability matrices in Markovian models, such as hidden Markov models (HMMs), and how it makes learning to represent long-term context for sequential data very difficult. This phenomenon hurts the forward propagation of long-term context information, as well as the learning of a hidden state representation for long-term context, which depends on propagating credit information backwards in time. Using results from Markov chain theory, we show that this problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., when the transition probability matrices are sparse and the model is essentially deterministic. The results found in this paper apply to learning approaches based on continuous optimization, such as gradient descent and the Baum-Welch algorithm.
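The diffusion effect described above can be illustrated with a small numerical sketch. This is not the paper's analysis, only a toy example under stated assumptions: the two 2-state transition matrices and the helper names (`mat_mul`, `mat_power`, `row_spread`) are hypothetical. The sketch raises each matrix to the power t and measures how much the rows still differ. When the rows become identical, the distribution at time t no longer depends on the starting state, so the initial context has been lost.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(a, t):
    """Return a**t for an integer t >= 1 by repeated multiplication."""
    result = a
    for _ in range(t - 1):
        result = mat_mul(result, a)
    return result


def row_spread(a):
    """Largest column-wise difference between any two rows.

    A spread near 0 means every starting state leads to almost the same
    distribution, i.e. the initial context has been forgotten.
    """
    n = len(a)
    return max(max(a[i][j] for i in range(n)) - min(a[i][j] for i in range(n))
               for j in range(n))


# Hypothetical 2-state transition matrices (each row sums to 1).
diffuse = [[0.6, 0.4], [0.3, 0.7]]                  # far from deterministic
near_deterministic = [[0.99, 0.01], [0.01, 0.99]]   # entries close to 0 or 1

for t in (1, 10, 50):
    print(t,
          row_spread(mat_power(diffuse, t)),
          row_spread(mat_power(near_deterministic, t)))
```

In this toy setting the spread of the diffuse matrix collapses towards zero within a few steps, while the nearly deterministic matrix still distinguishes its starting states after 50 steps, which matches the qualitative claim that the problem is reduced as transition probabilities approach 0 or 1.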



The Second International Conference on Conceptual Structures

AI Magazine

The Second International Conference (ICCS'94) was held at the University of Maryland, College Park, Maryland, on August 16 to 20. Papers were presented by a number of individuals and groups from several countries on the development and use of the conceptual graph representational language. Prizes were awarded to students to encourage improved research. Michel Wermelinger, Universidade Nova de Lisboa, Portugal, was the winner of the best paper award for his work "Basic Conceptual Structure Theory." The funds were made available through a grant from the American Association for Artificial Intelligence.


Eighth International Workshop on Qualitative Reasoning about Physical Systems

AI Magazine

The Eighth International Workshop on Qualitative Reasoning about Physical Systems (QR '94) was held on 7-10 June. We received 53 submissions. The eighth workshop was in Nara, Japan, celebrating the community's escape from a simple flip-flop behavior. The core issues of qualitative reasoning included qualitative and causal modeling of the world, automated modeling, and qualitative simulation. A hot issue in cognitive modeling is spatial and diagrammatic reasoning. Hari Narayanan and his colleagues (Advanced Research Laboratory, Hitachi Ltd.) exploited an architecture of qualitative visual reasoning. Interestingly, this transition attracted the attention of many participants.


Operational Rationality through Compilation of Anytime Algorithms

AI Magazine

How can an artificial agent react to a situation after performing the correct amount of thinking? My Ph.D. dissertation (Zilberstein 1993) presents a theoretical framework and a programming paradigm that provide an answer to this question.


Adaptive Load Balancing: A Study in Multi-Agent Learning

Journal of Artificial Intelligence Research

We study the process of multi-agent reinforcement learning in the context of load balancing in a distributed system, without use of either central coordination or explicit communication. We first define a precise framework in which to study adaptive load balancing, important features of which are its stochastic nature and the purely local information available to individual agents. Given this framework, we show illuminating results on the interplay between basic adaptive behavior parameters and their effect on system efficiency. We then investigate the properties of adaptive load balancing in heterogeneous populations, and address the issue of exploration vs. exploitation in that context. Finally, we show that naive use of communication may not improve, and might even harm, system efficiency.


Intelligent Agents for Interactive Simulation Environments

AI Magazine

Interactive simulation environments constitute one of today's promising emerging technologies, with applications in areas such as education, manufacturing, entertainment, and training. These environments are also rich domains for building and investigating intelligent automated agents, with requirements for the integration of a variety of agent capabilities but without the costs and demands of low-level perceptual processing or robotic control. Our project is aimed at developing humanlike, intelligent agents that can interact with each other, as well as with humans, in such virtual environments. Our current target is intelligent automated pilots for battlefield-simulation environments. These dynamic, interactive, multiagent environments pose interesting challenges for research on specialized agent capabilities as well as on the integration of these capabilities in the development of "complete" pilot agents. We are addressing these challenges through development of a pilot agent, called TacAir-Soar, within the Soar architecture. This article provides an overview of this domain and project by analyzing the challenges that automated pilots face in battlefield simulations, describing how TacAir-Soar is successfully able to address many of them -- TacAir-Soar pilots have already successfully participated in constrained air-combat simulations against expert human pilots -- and discussing the issues involved in resolving the remaining research challenges.


How to Choose an Activation Function

Neural Information Processing Systems

In [10], we have shown that such a network using practically any nonlinear activation function can approximate any continuous function of any number of real variables on any compact set to any desired degree of accuracy. A central question in this theory is the following. If one needs to approximate a function from a known class of functions to a prescribed accuracy, how many neurons will be necessary to accomplish this approximation for all functions in the class?


Autoencoders, Minimum Description Length and Helmholtz Free Energy

Neural Information Processing Systems

An autoencoder network uses a set of recognition weights to convert an input vector into a code vector. It then uses a set of generative weights to convert the code vector into an approximate reconstruction of the input vector. We derive an objective function for training autoencoders based on the Minimum Description Length (MDL) principle. The aim is to minimize the information required to describe both the code vector and the reconstruction error. We show that this information is minimized by choosing code vectors stochastically according to a Boltzmann distribution, where the generative weights define the energy of each possible code vector given the input vector. Unfortunately, if the code vectors use distributed representations, it is exponentially expensive to compute this Boltzmann distribution because it involves all possible code vectors. We show that the recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution and that this approximation gives an upper bound on the description length. Even when this bound is poor, it can be used as a Lyapunov function for learning both the generative and the recognition weights. We demonstrate that this approach can be used to learn factorial codes.
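The role of the Boltzmann distribution in the abstract above can be checked numerically with a minimal sketch. It assumes the expected description length under a code-selection distribution Q takes the free-energy form F(Q) = sum_i Q_i E_i + sum_i Q_i log Q_i (expected energy minus entropy), where E_i is the cost of code vector i. The energies and the function names below are hypothetical and chosen only for illustration. Under this assumption, the Boltzmann distribution gives the smallest F, and any other distribution gives an upper bound on it.

```python
import math


def boltzmann(energies):
    """Boltzmann distribution with Q_i proportional to exp(-E_i)."""
    lowest = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - lowest)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def free_energy(q, energies):
    """Expected energy minus entropy: sum_i Q_i E_i + sum_i Q_i log Q_i."""
    return sum(qi * e + (qi * math.log(qi) if qi > 0 else 0.0)
               for qi, e in zip(q, energies))


# Hypothetical costs (in nats) of four candidate code vectors for one input.
energies = [1.0, 2.0, 0.5, 3.0]

optimal = free_energy(boltzmann(energies), energies)
uniform = free_energy([0.25] * 4, energies)           # ignore the energies
greedy = free_energy([0.0, 0.0, 1.0, 0.0], energies)  # always pick the cheapest code

print(optimal, uniform, greedy)
```

With these numbers the stochastic Boltzmann choice beats both the uniform choice and the deterministic choice of the cheapest code, and its value equals `-log(sum(exp(-E_i)))`. With distributed codes the number of code vectors grows exponentially, so this exact sum is what becomes too expensive to compute.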


A Hodgkin-Huxley Type Neuron Model That Learns Slow Non-Spike Oscillation

Neural Information Processing Systems

A gradient descent algorithm for parameter estimation which is similar to those used for continuous-time recurrent neural networks was derived for Hodgkin-Huxley type neuron models. Using membrane potential trajectories as targets, the parameters (maximal conductances, thresholds and slopes of activation curves, time constants) were successfully estimated. The algorithm was applied to modeling slow non-spike oscillation of an identified neuron in the lobster stomatogastric ganglion. A model with three ionic currents was trained with experimental data. It revealed a novel role of A-current for slow oscillation below -50 mV.

1 INTRODUCTION

Conductance-based neuron models, first formulated by Hodgkin and Huxley [10], are commonly used for describing biophysical mechanisms underlying neuronal behavior.