Learning Macro-Actions in Reinforcement Learning

Neural Information Processing Systems

We present a method for automatically constructing macro-actions from scratch out of primitive actions during the reinforcement learning process. The overall idea is to reinforce the tendency to perform action b after action a if such a pattern of actions has been rewarded. We test the method on a bicycle task, the car-on-the-hill task, the racetrack task and some grid-world tasks. For the bicycle and racetrack tasks the use of macro-actions approximately halves the learning time, while for one of the grid-world tasks the learning time is reduced by a factor of 5. The method did not work for the car-on-the-hill task for reasons we discuss in the conclusion. A macro-action is a sequence of actions chosen from the primitive actions of the problem.
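As an illustration of the kind of statistic involved, here is a minimal sketch of tabular Q-learning augmented with a table of action-pair scores; the pairing rule, threshold, and action set below are hypothetical and are not the paper's exact construction.

# Minimal sketch (assumptions: tabular Q-learning, a hypothetical action set,
# and an illustrative pairing heuristic -- not the paper's exact rule).
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]      # hypothetical primitive actions
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

Q = defaultdict(float)                         # Q[(state, action)]
pair_score = defaultdict(float)                # pair_score[(a_prev, a_curr)]

def choose_action(state):
    # epsilon-greedy action selection over primitive actions
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, prev_action=None):
    # standard one-step Q-learning update
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    # reinforce the tendency to perform `action` after `prev_action`
    # when the step was rewarded (illustrative heuristic only)
    if prev_action is not None and reward > 0:
        pair_score[(prev_action, action)] += reward

def candidate_macros(threshold=1.0):
    # action pairs whose accumulated score exceeds a threshold become
    # candidate two-step macro-actions
    return [pair for pair, score in pair_score.items() if score >= threshold]

In this sketch a macro-action is simply a frequently rewarded action pair; longer macros would be built by chaining such pairs.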



Learning Lie Groups for Invariant Visual Perception

Neural Information Processing Systems

One of the most important problems in visual perception is that of visual invariance: how are objects perceived to be the same despite undergoing transformations such as translations, rotations or scaling? In this paper, we describe a Bayesian method for learning invariances based on Lie group theory. We show that previous approaches based on first-order Taylor series expansions of inputs can be regarded as special cases of the Lie group approach, the latter being capable of handling in principle arbitrarily large transformations. Using a matrix-exponential-based generative model of images, we derive an unsupervised algorithm for learning Lie group operators from input data containing infinitesimal transformations.
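For reference, a matrix-exponential transformation model of the kind described above can be written as follows (generic notation, not necessarily the paper's):

\[
\mathbf{z}' \;=\; e^{xG}\,\mathbf{z},
\qquad
e^{xG} \;=\; \sum_{n=0}^{\infty}\frac{(xG)^{n}}{n!}
\;\approx\; \mathbb{1} + xG
\quad\text{for infinitesimal } x,
\]

where $\mathbf{z}$ is a vectorized image, $G$ is a learned Lie group operator (generator), $x$ is the amount of transformation, and $\mathbb{1}$ is the identity matrix; the first-order truncation $\mathbb{1} + xG$ corresponds to the Taylor-series approaches mentioned in the abstract.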


Batch and On-Line Parameter Estimation of Gaussian Mixtures Based on the Joint Entropy

Neural Information Processing Systems

In contrast to gradient descent and EM, which estimate the mixture's covariance matrices, the proposed method estimates the inverses of the covariance matrices. Furthermore, the new parameter estimation procedure can be applied in both on-line and batch settings. We show experimentally that it is typically faster than EM, and usually requires about half as many iterations as EM. Mixture models, in particular mixtures of Gaussians, have been a popular tool for density estimation, clustering, and unsupervised learning with a wide range of applications (see for instance [5, 2] and the references therein). Mixture models are one of the most useful tools for handling incomplete data, in particular hidden variables. For Gaussian mixtures the hidden variables indicate for each data point the index of the Gaussian that generated it.
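For concreteness, the parameterization in question writes the Gaussian mixture density with precision matrices $P_j$ rather than covariances (generic notation, not the paper's):

\[
p(\mathbf{x}) \;=\; \sum_{j=1}^{k} w_j\,
\frac{|P_j|^{1/2}}{(2\pi)^{d/2}}\,
\exp\!\Big(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_j)^{\top} P_j\,(\mathbf{x}-\boldsymbol{\mu}_j)\Big),
\qquad
P_j = \Sigma_j^{-1},
\]

where the $w_j$ are mixing weights and the $P_j$ are the precision (inverse covariance) matrices that the method estimates directly.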


Graphical Models for Recognizing Human Interactions

Neural Information Processing Systems

We describe a real-time computer vision and machine learning system for modeling and recognizing human behaviors in two different scenarios: (1) complex, two-handed action recognition in the martial art of Tai Chi and (2) detection and recognition of individual human behaviors and multiple-person interactions in a visual surveillance task. In the latter case, the system is particularly concerned with detecting when interactions between people occur and classifying them. Graphical models, such as Hidden Markov Models (HMMs) [6] and Coupled Hidden Markov Models (CHMMs) [3, 2], seem appropriate for modeling and classifying human behaviors because they offer dynamic time warping, a well-understood training algorithm, and clear Bayesian semantics for both individual (HMMs) and interacting or coupled (CHMMs) generative processes. A major problem with this data-driven statistical approach, especially when modeling rare or anomalous behaviors, is the limited number of training examples. A major emphasis of our work, therefore, is on efficient Bayesian integration of prior knowledge with evidence from data.
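The HMM side of such a system can be sketched with the standard scaled forward algorithm, which scores an observation sequence under each behavior model and picks the best-scoring one; the feature extraction and the coupled (CHMM) variant are omitted, and the names below are illustrative rather than taken from the paper.

# Minimal sketch: standard scaled HMM forward algorithm, used to score an
# observation sequence under a per-behavior model (CHMMs and features omitted).
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    # pi: (N,) initial state probabilities, A: (N, N) transition matrix,
    # B: (N, M) emission probabilities, obs: list of observation symbol indices
    alpha = pi * B[:, obs[0]]
    scale = alpha.sum()
    log_lik = np.log(scale)
    alpha = alpha / scale
    for t in range(1, len(obs)):
        alpha = (alpha @ A) * B[:, obs[t]]   # forward recursion
        scale = alpha.sum()                  # rescale to avoid underflow
        log_lik += np.log(scale)
        alpha = alpha / scale
    return log_lik

def classify(models, obs):
    # models: dict mapping behavior name -> (pi, A, B); returns the most likely behavior
    return max(models, key=lambda name: forward_log_likelihood(*models[name], obs))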


Linear Hinge Loss and Average Margin

Neural Information Processing Systems

We describe a unifying method for proving relative loss bounds for on-line linear-threshold classification algorithms, such as the Perceptron and Winnow algorithms. For classification problems the discrete loss is used, i.e., the total number of prediction mistakes. We introduce a continuous loss function, called the "linear hinge loss", that can be employed to derive the updates of the algorithms. We first prove bounds with respect to the linear hinge loss and then convert them to the discrete loss. We introduce a notion of "average margin" of a set of examples. We show how relative loss bounds based on the linear hinge loss can be converted to relative loss bounds in terms of the discrete loss using the average margin.
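For orientation, the two algorithm families mentioned above can be sketched in their standard textbook forms (additive versus multiplicative updates); this is not the paper's notation, and the linear hinge loss and bound conversion themselves are not reproduced here.

# Minimal sketch of the two on-line linear-threshold algorithms the bounds apply to,
# in generic textbook form. Labels y are +1/-1; the discrete loss counts mistakes.
import numpy as np

def perceptron(stream, dim, eta=1.0):
    w, mistakes = np.zeros(dim), 0
    for x, y in stream:
        if y * (w @ x) <= 0:              # discrete loss: a prediction mistake
            w += eta * y * x              # additive (Perceptron) update
            mistakes += 1
    return w, mistakes

def winnow(stream, dim, eta=0.5):
    w, mistakes = np.ones(dim) / dim, 0
    for x, y in stream:
        if y * (w @ x) <= 0:
            w *= np.exp(eta * y * x)      # multiplicative (Winnow-style) update
            w /= w.sum()                  # renormalize the weight vector
            mistakes += 1
    return w, mistakes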



Planning and Acting Together

AI Magazine

People often act together with a shared purpose; they collaborate. Collaboration enables them to work more efficiently and to complete activities they could not accomplish individually. An increasing number of computer applications also require collaboration among various systems and people. Thus, a major challenge for AI researchers is to determine how to construct computer systems that are able to act effectively as partners in collaborative activity. Collaborative activity entails participants forming commitments to achieve the goals of the group activity and requires group decision making and group planning procedures. In addition, agents must be committed to supporting the activities of their fellow participants in support of the group activity. Furthermore, when conflicts arise (for example, from resource bounds), participants must weigh their commitments to various group activities against those for individual activities. This article briefly reviews the major features of one model of collaborative planning called SHARED-PLANS (Grosz and Kraus 1999, 1996). It describes several current efforts to develop collaborative planning agents and systems for human-computer communication based on this model. Finally, it discusses empirical research aimed at determining effective commitment strategies in the SHARED-PLANS context.


A Survey of Research in Distributed, Continual Planning

AI Magazine

Complex, real-world domains require rethinking traditional approaches to AI planning. Planning and executing the resulting plans in a dynamic environment implies a continual approach in which planning and execution are interleaved, uncertainty in the current and projected world state is recognized and handled appropriately, and replanning can be performed when the situation changes or planned actions fail. Furthermore, complex planning and execution problems may require multiple computational agents and human planners to collaborate on a solution. In this article, we describe a new paradigm for planning in complex, dynamic environments, which we term distributed, continual planning (DCP). We argue that developing DCP systems will be necessary for planning applications to be successful in these environments. We give a historical overview of research leading to the current state of the art in DCP and describe research in distributed and continual planning.