Education
AAAI News
Hamilton, Carol M. (Association for the Advancement of Artificial Intelligence)
The Doctoral Consortium (DC) provides an opportunity for a group of Ph.D. students to discuss and explore their research interests and career objectives with a panel of established [...] materials; a workshop for mentoring new faculty, instructors, and graduate students on teaching; an Educational Video Track within the AAAI-11 Video program; and a Student/Educator [...] of ideas between basic and applied AI. IAAI-11 will consider papers in two tracks: (1) deployed application case studies and (2) emerging applications or methodologies.
Reports of the AAAI 2010 Conference Workshops
Aha, David W. (Naval Research Laboratory) | Boddy, Mark (Adventium Labs) | Bulitko, Vadim (University of Alberta) | Garcez, Artur S. d' (City University London) | Avila (University of Georgia) | Doshi, Prashant (TZI, Bremen University) | Edelkamp, Stefan (University of Edinburgh) | Geib, Christopher (University of Illinois, Chicago) | Gmytrasiewicz, Piotr (Smart Information Flow Technologies) | Goldman, Robert P. (Wright State University) | Hitzler, Pascal (Georgia Institute of Technology) | Isbell, Charles (University of Maryland, College Park) | Josyula, Darsana (Massachusetts Institute of Technology) | Kaelbling, Leslie Pack (University of Bonn) | Kersting, Kristian (Georgia Institute of Technology) | Kunda, Maithilee (Universidade Federal do Rio Grande do Sul (UFRGS)) | Lamb, Luis C. (Willow Garage) | Marthi, Bhaskara (Georgia Institute of Technology) | McGreggor, Keith (EML Research gGmbH) | Nastase, Vivi (University College Cork) | Provan, Gregory (University of North Carolina, Charlotte) | Raja, Anita (Georgia Institute of Technology) | Ram, Ashwin (Georgia Institute of Technology) | Riedl, Mark (University of California, Berkeley) | Russell, Stuart (Cornell University) | Sabharwal, Ashish (University of Freiburg) | Smaus, Jan-Georg (University of Central Florida) | Sukthankar, Gita (Maastricht University) | Tuyls, Karl (University of New South Wales) | Meyden, Ron van der (Google, Inc.) | Halevy, Alon (University of Maryland) | Mihalkova, Lilyana (University of Wisconsin) | Natarajan, Sriraam
The AAAI-10 Workshop program was held Sunday and Monday, July 11–12, 2010 at the Westin Peachtree Plaza in Atlanta, Georgia. The AAAI-10 workshop program included 13 workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were AI and Fun, Bridging the Gap between Task and Motion Planning, Collaboratively-Built Knowledge Sources and Artificial Intelligence, Goal-Directed Autonomy, Intelligent Security, Interactive Decision Theory and Game Theory, Metacognition for Robust Social Systems, Model Checking and Artificial Intelligence, Neural-Symbolic Learning and Reasoning, Plan, Activity, and Intent Recognition, Statistical Relational AI, Visual Representations and Reasoning, and Abstraction, Reformulation, and Approximation. This article presents short summaries of those events.
Autoregressive Kernels For Time Series
We propose in this work a new family of kernels for variable-length time series. Our work builds upon the vector autoregressive (VAR) model for multivariate stochastic processes: given a multivariate time series x, we consider the likelihood function p_{\theta}(x) of different parameters \theta in the VAR model as features to describe x. To compare two time series x and x', we form the product of their features p_{\theta}(x) p_{\theta}(x'), which is integrated out with respect to \theta using a matrix normal-inverse Wishart prior. Among other properties, this kernel can be easily computed when the dimension d of the time series is much larger than the lengths of the considered time series x and x'. It can also be generalized to time series taking values in arbitrary state spaces, as long as the state space itself is endowed with a kernel \kappa. In that case, the kernel between x and x' is a function of the Gram matrices produced by \kappa on observations and subsequences of observations enumerated in x and x'. We describe a computationally efficient implementation of this generalization that uses low-rank matrix factorization techniques. These kernels are compared to other known kernels using a set of benchmark classification tasks carried out with support vector machines.
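In symbols, the construction described in this abstract can be sketched as follows. The names k (for the kernel) and \omega (for the matrix normal-inverse Wishart prior over the VAR parameters \theta) are notation assumed here for illustration; they do not appear in the abstract.

```latex
k(x, x') = \int p_{\theta}(x)\, p_{\theta}(x')\, \omega(d\theta)
```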
Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories
Konidaris, George, Kuindersma, Scott, Grupen, Roderic, Barto, Andrew G.
We introduce CST, an algorithm for constructing skill trees from demonstration trajectories in continuous reinforcement learning domains. CST uses a changepoint detection method to segment each trajectory into a skill chain by detecting a change of appropriate abstraction, or that a segment is too complex to model as a single skill. The skill chains from each trajectory are then merged to form a skill tree. We demonstrate that CST constructs an appropriate skill tree that can be further refined through learning in a challenging continuous domain, and that it can be used to segment demonstration trajectories on a mobile manipulator into chains of skills where each skill is assigned an appropriate abstraction.
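To give a feel for what segmenting a trajectory at a changepoint means, here is a minimal sketch in Python. It is a deliberately simple offline stand-in, not the changepoint detection method used by CST: it assumes a one-dimensional signal and two constant-mean segments, and picks the split with the lowest summed squared error. The function name `best_changepoint` and the sample signal are hypothetical.

```python
def best_changepoint(values):
    """Return the split index minimizing the summed squared error
    of two constant-mean segments (values[:i] and values[i:])."""
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_idx, best_cost = None, float("inf")
    # Try every split that leaves both segments non-empty.
    for i in range(1, len(values)):
        cost = sse(values[:i]) + sse(values[i:])
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx


# Hypothetical signal whose level jumps between index 3 and index 4.
signal = [0.1, 0.0, -0.1, 0.05, 2.0, 2.1, 1.9, 2.05]
split = best_changepoint(signal)
first_segment, second_segment = signal[:split], signal[split:]
```

A skill chain in the sense of the abstract would involve several such segments, each paired with its own model and abstraction; this sketch only shows the single-split case.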
Online Learning: Random Averages, Combinatorial Parameters, and Learnability
Rakhlin, Alexander, Sridharan, Karthik, Tewari, Ambuj
We develop a theory of online learning by defining several complexity measures. Among them are analogues of Rademacher complexity, covering numbers and fat-shattering dimension from statistical learning theory. Relationships among these complexity measures, their connection to online learning, and tools for bounding them are provided. We apply these results to various learning problems. We provide a complete characterization of online learnability in the supervised setting.
Improving Human Judgments by Decontaminating Sequential Dependencies
Mozer, Michael C., Pashler, Harold, Wilder, Matthew, Lindsey, Robert V., Jones, Matt, Jones, Michael N.
For over half a century, psychologists have been struck by how poor people are at expressing their internal sensations, impressions, and evaluations via rating scales. When individuals make judgments, they are incapable of using an absolute rating scale, and instead rely on reference points from recent experience. This relativity of judgment limits the usefulness of responses provided by individuals to surveys, questionnaires, and evaluation forms. Fortunately, the cognitive processes that transform internal states to responses are not simply noisy, but rather are influenced by recent experience in a lawful manner. We explore techniques to remove sequential dependencies, and thereby decontaminate a series of ratings to obtain more meaningful human judgments. In our formulation, decontamination is fundamentally a problem of inferring latent states (internal sensations) which, because of the relativity of judgment, have temporal dependencies. We propose a decontamination solution using a conditional random field with constraints motivated by psychological theories of relative judgment. Our exploration of decontamination models is supported by two experiments we conducted to obtain ground-truth rating data on a simple length estimation task. Our decontamination techniques yield a reduction of over 20% in the error of human judgments.
PAC-Bayesian Model Selection for Reinforcement Learning
Fard, Mahdi M., Pineau, Joelle
This paper introduces the first set of PAC-Bayesian bounds for the batch reinforcement learning problem in finite state spaces. These bounds hold regardless of the correctness of the prior distribution. We demonstrate how such bounds can be used for model-selection in control problems where prior information is available either on the dynamics of the environment, or on the value of actions. Our empirical results confirm that PAC-Bayesian model-selection is able to leverage prior distributions when they are informative and, unlike standard Bayesian RL approaches, ignores them when they are misleading.
Factorized Latent Spaces with Structured Sparsity
Jia, Yangqing, Salzmann, Mathieu, Darrell, Trevor
Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach to learning such factorized representations inspired by sparse coding techniques. In particular, we show that structured sparsity allows us to address the multi-view learning problem by alternately solving two convex optimization problems. Furthermore, the resulting factorized latent spaces generalize over existing approaches in that they allow having latent dimensions shared between any subset of the views instead of between all the views only. We show that our approach outperforms state-of-the-art methods on the task of human pose estimation.
New Adaptive Algorithms for Online Classification
Orabona, Francesco, Crammer, Koby
We propose a general framework for online learning for classification problems with time-varying potential functions in the adversarial setting. This framework allows us to design and prove relative mistake bounds for any generic loss function. The mistake bounds can be specialized for the hinge loss, allowing us to recover and improve the bounds of known online classification algorithms. By optimizing the general bound we derive a new online classification algorithm, called NAROW, that uses a hybrid of adaptive and fixed second-order information. We analyze the properties of the algorithm and illustrate its performance using a synthetic dataset.
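For readers unfamiliar with second-order online classifiers, the sketch below shows one online pass of AROW (adaptive regularization of weights), a well-known algorithm of this kind that updates on positive hinge loss. It is not NAROW, whose update is not given in this abstract. The function name `arow_train`, the regularization value `r=1.0`, and the synthetic data are assumptions made for illustration.

```python
import random


def arow_train(data, dim, r=1.0):
    """One online pass of AROW with a full covariance matrix.

    data: iterable of (x, y), x a list of floats, y in {-1, +1}.
    Returns the weight vector and the number of online mistakes.
    """
    w = [0.0] * dim
    sigma = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    mistakes = 0
    for x, y in data:
        margin = sum(wi * xi for wi, xi in zip(w, x))
        if y * margin <= 0:
            mistakes += 1
        if y * margin < 1:  # hinge loss is positive, so update
            # sx = Sigma x; v = x^T Sigma x measures uncertainty along x.
            sx = [sum(sigma[i][j] * x[j] for j in range(dim)) for i in range(dim)]
            v = sum(xi * si for xi, si in zip(x, sx))
            beta = 1.0 / (v + r)
            alpha = (1.0 - y * margin) * beta
            for i in range(dim):
                w[i] += alpha * y * sx[i]
            # Shrink the covariance along the direction just observed.
            for i in range(dim):
                for j in range(dim):
                    sigma[i][j] -= beta * sx[i] * sx[j]
    return w, mistakes


# Hypothetical linearly separable synthetic data in two dimensions.
rng = random.Random(0)
true_w = [1.0, -2.0]
data = []
for _ in range(200):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    score = true_w[0] * x[0] + true_w[1] * x[1]
    data.append((x, 1 if score >= 0 else -1))

w, mistakes = arow_train(data, dim=2)
correct = sum(1 for x, y in data if y * (w[0] * x[0] + w[1] * x[1]) > 0)
accuracy = correct / len(data)
```

The covariance matrix plays the role of adaptive second-order information here: directions seen often get smaller updates. A fixed second-order variant would keep that matrix constant instead.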
Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains
The reinforcement learning community has explored many approaches to obtaining value estimates and models to guide decision making; these approaches, however, do not usually provide a measure of confidence in the estimate. Accurate estimates of an agent’s confidence are useful for many applications, such as biasing exploration and automatically adjusting parameters to reduce dependence on parameter-tuning. Computing confidence intervals on reinforcement learning value estimates, however, is challenging because data generated by the agent-environment interaction rarely satisfies traditional assumptions. Samples of value estimates are dependent, likely non-normally distributed and often limited, particularly in early learning when confidence estimates are pivotal. In this work, we investigate how to compute robust confidences for value estimates in continuous Markov decision processes. We illustrate how to use bootstrapping to compute confidence intervals online under a changing policy (previously not possible) and prove validity under a few reasonable assumptions. We demonstrate the applicability of our confidence estimation algorithms with experiments on exploration, parameter estimation and tracking.
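As background on the bootstrapping idea mentioned in this abstract, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It assumes independent samples and a fixed batch of data, which is exactly the simplification the abstract says does not hold for value estimates, so it should be read as the textbook baseline rather than the paper's online method. The function name `bootstrap_ci` and the sample returns are hypothetical.

```python
import random
import statistics


def bootstrap_ci(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)
    n = len(samples)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical sampled returns from a single state.
returns = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]
low, high = bootstrap_ci(returns)
```

With dependent samples, a block-based resampling scheme is the usual adjustment; which scheme the paper uses is not stated in the abstract.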