Country
Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning
We present a learning algorithm for undiscounted reinforcement learning. Our interest lies in bounds for the algorithm's online performance after some finite number of steps. In the spirit of similar methods already successfully applied for the exploration-exploitation tradeoff in multi-armed bandit problems, we use upper confidence bounds to show that our UCRL algorithm achieves logarithmic online regret in the number of steps taken with respect to an optimal policy.
Multi-Task Feature Learning
Argyriou, Andreas, Evgeniou, Theodoros, Pontil, Massimiliano
We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the wellknown 1-normregularization problem using a new regularizer which controls the number of learned features common for all the tasks. We show that this problem is equivalent to a convex optimization problem and develop an iterative algorithm for solving it. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the latter step we learn commonacross-tasks representationsand in the former step we learn task-specific functions using these representations. We report experiments on a simulated and a real data set which demonstrate that the proposed method dramatically improves the performance relativeto learning each task independently. Our algorithm can also be used, as a special case, to simply select - not learn - a few common features across the tasks.
Sparse Kernel Orthonormalized PLS for feature extraction in large data sets
Arenas-garcรญa, Jerรณnimo, Petersen, Kaare B., Hansen, Lars K.
In this paper we are presenting a novel multivariate analysis method. Our scheme is based on a novel kernel orthonormalized partial least squares (PLS) variant for feature extraction, imposing sparsity constrains in the solution to improve scalability. Thealgorithm is tested on a benchmark of UCI data sets, and on the analysis of integrated short-time music features for genre prediction. The upshot is that the method has strong expressive power even with rather few features, is clearly outperforming the ordinary kernel PLS, and therefore is an appealing method for feature extraction of labelled data.
Learning on Graph with Laplacian Regularization
We consider a general form of transductive learning on graphs with Laplacian regularization, and derive margin-based generalization bounds using appropriate geometric properties of the graph. We use this analysis to obtain a better understanding ofthe role of normalization of the graph Laplacian matrix as well as the effect of dimension reduction. The results suggest a limitation of the standard degree-based normalization. We propose a remedy from our analysis and demonstrate empiricallythat the remedy leads to improved classification performance.
Online Classification for Complex Problems Using Simultaneous Projections
Amit, Yonatan, Shalev-shwartz, Shai, Singer, Yoram
We describe and analyze an algorithmic framework for online classification where each online trial consists of multiple prediction tasks that are tied together. We tackle the problem of updating the online hypothesis by defining a projection problem in which each prediction task corresponds to a single linear constraint. These constraints are tied together through a single slack parameter. We then introduce ageneral method for approximately solving the problem by projecting simultaneously and independently on each constraint which corresponds to a prediction sub-problem,and then averaging the individual solutions. We show that this approach constitutes a feasible, albeit not necessarily optimal, solution for the original projection problem. We derive concrete simultaneous projection schemes and analyze them in the mistake bound model. We demonstrate the power of the proposed algorithm in experiments with online multiclass text categorization. Our experiments indicate that a combination of class-dependent features with the simultaneous projection method outperforms previously studied algorithms.
Tighter PAC-Bayes Bounds
Ambroladze, Amiran, Parrado-hernรกndez, Emilio, Shawe-taylor, John S.
This paper proposes a PAC-Bayes bound to measure the performance of Support Vector Machine (SVM) classifiers. The bound is based on learning a prior over the distribution of classifiers with a part of the training samples. Experimental work shows that this bound is tighter than the original PAC-Bayes, resulting in an enhancement of the predictive capabilities of the PAC-Bayes bound. In addition, it is shown that the use of this bound as a means to estimate the hyperparameters of the classifier compares favourably with cross validation in terms of accuracy of the model, while saving a lot of computational burden.
Intelligent Autonomous Robotics: A Robot Soccer Case Study
Robotics technology has recently advanced to the point of being widely accessible for relatively low-budget research, as well as for graduate, undergraduate, and even secondary and primary school education. This lecture provides an example of how to productively use a cutting-edge advanced robotics platform for education and research by providing a detailed case study with the Sony AIBO robot, a vision-based legged robot. The case study used for this lecture is the UT Austin Villa RoboCup Four-Legged Team. This lecture describes both the development process and the technical details of its end result. The main contributions of this lecture are (i) a roadmap for new classes and research groups interested in intelligent autonomous robotics who are starting from scratch with a new robot, and (ii) documentation of the algorithms behind our own approach on the AIBOs with the goal of making them accessible for use on other vision-based and/or legged robot platforms.
A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence
Multiagent systems is an expanding field that blends classical fields like game theory and decentralized control with modern fields like computer science and machine learning. This monograph provides a concise introduction to the subject, covering the theoretical foundations as well as more recent developments in a coherent and readable manner. The text is centered on the concept of an agent as decision maker. Chapter 1 is a short introduction to the field of multiagent systems. Chapter 2 covers the basic theory of singleagent decision making under uncertainty.