Goto

Collaborating Authors

 Genre


Context Tree Maximizing

AAAI Conferences

Recent developments in reinforcement learning for non-Markovianproblems witness a surge in history-based methods, among which weare particularly interested in two frameworks, PhiMDP and MC-AIXI-CTW. PhiMDP attempts to reduce the general RL problem, where the environment's states and dynamics are both unknown, toan MDP, while MC-AIXI-CTW incrementally learns a mixture of contexttrees as its environment model. The main idea of PhiMDP is toconnect generic reinforcement learning with classical reinforcementlearning. The first implementation of PhiMDP relies on astochastic search procedure for finding a tree that minimizes acertain cost function. This does not guarantee finding theminimizing tree, or even a good one, given limited search time. As aconsequence it appears that the approach has difficulties with largedomains. MC-AIXI-CTW is attractive in that it can incrementally andanalytically compute the internal model through interactions withthe environment. Unfortunately, it is computationally demanding dueto requiring heavy planning simulations at every single time step.We devise a novel approach called CTMRL, which analytically andefficiently finds the cost-minimizing tree. Instead of thecontext-tree weighting method that MC-AIXI-CTW is based on, we usethe closely related context-tree maximizing algorithm that selectsjust one single tree. This approach falls under the PhiMDPframework, which allows the replacement of the costly planningcomponent of MC-AIXI-CTW with simple Q-Learning. Our empiricalinvestigation show that CTMRL finds policies of quality as good as MC-AIXI-CTW's on sixdomains including a challenging Pacman domain, but in an order ofmagnitude less time.


Rule Ensemble Learning Using Hierarchical Kernels in Structured Output Spaces

AAAI Conferences

The goal in Rule Ensemble Learning (REL) is simultaneous discovery of a small set of simple rules and their optimal weights that lead to good generalization. Rules are assumed to be conjunctions of basic propositions concerning the values taken by the input features. It has been shown that rule ensembles for classification can be learnt optimally and efficiently using hierarchical kernel learning approaches that explore the exponentially large space of conjunctions by exploiting its hierarchical structure. The regularizer employed penalizes large features and thereby selects a small set of short features. In this paper, we generalize the rule ensemble learning using hierarchical kernels (RELHKL) framework to multi class structured output spaces. We build on the StructSVM model for sequence prediction problems and employ a ρ-norm hierarchical regularizer for observation features and a conventional 2-norm regularizer for state transition features. The exponentially large feature space is searched using an active set algorithm and the exponentially large set of constraints are handled using a cutting plane algorithm. The approach can be easily extended to other structured output problems. We perform experiments on activity recognition datasets which are prone to noise, sparseness and skewness. We demonstrate that our approach outperforms other approaches.


Design and Optimization of an Omnidirectional Humanoid Walk: A Winning Approach at the RoboCup 2011 3D Simulation Competition

AAAI Conferences

This paper presents the design and learning architecture for an omnidirectional walk used by a humanoid robot soccer agent acting in the RoboCup 3D simulation environment. The walk, which was originally designed for and tested on an actual Nao robot before being employed in the 2011 RoboCup 3D simulation competition, was the crucial component in the UT Austin Villa team winning the competition in 2011. To the best of our knowledge, this is the first time that robot behavior has been conceived and constructed on a real robot for the end purpose of being used in simulation.  The walk is based on a double linear inverted pendulum model, and multiple sets of its parameters are optimized via a novel framework. The framework optimizes parameters for different tasks in conjunction with one another, a little-understood problem with substantial practical significance.  Detailed experiments show that the UT Austin Villa agent significantly outperforms all the other agents in the competition with the optimized walk being the key to its success.


Transfer Learning with Graph Co-Regularization

AAAI Conferences

Transfer learning proves to be effective for leveraging labeled data in the source domain to build an accurate classifier in the target domain. The basic assumption behind transfer learning is that the involved domains share some common latent factors. Previous methods usually explore these latent factors by optimizing two separate objective functions, i.e., either maximizing the empirical likelihood, or preserving the geometric structure. Actually, these two objective functions are complementary to each other and optimizing them simultaneously can make the solution smoother and further improve the accuracy of the final model. In this paper, we propose a novel approach called Graph co-regularized Transfer Learning (GTL) for this purpose, which integrates the two objective functions seamlessly into one unified optimization problem. Thereafter, we present an iterative algorithm for the optimization problem with rigorous analysis on convergence and complexity. Our empirical study on two open data sets validates that GTL can consistently improve the classification accuracy compared to the state-of-the-art transfer learning methods.


Towards Discovering What Patterns Trigger What Labels

AAAI Conferences

In many real applications, especially those involving data objects with complicated semantics, it is generally desirable to discover the relation between patterns in the input space and labels corresponding to different semantics in the output space. This task becomes feasible with MIML (Multi-Instance Multi-Label learning), a recently developed learning framework, where each data object is represented by multiple instances and is allowed to be associated with multiple labels simultaneously. In this paper, we propose KISAR , an MIML algorithm that is able to discover what instances trigger what labels. By considering the fact that highly relevant labels usually share some patterns, we develop a convex optimization formulation and provide an alternating optimization solution. Experiments show that KISAR is able to discover reasonable relations between input patterns and output labels, and achieves performances that are highly competitive with many state-of-the-art MIML algorithms.


Sparse Probabilistic Relational Projection

AAAI Conferences

Probabilistic relational PCA (PRPCA) can learn a projection matrix to perform dimensionality reduction for relational data. However, the results learned by PRPCA lack interpretability because each principal component is a linear combination of all the original variables. In this paper, we propose a novel model, called sparse probabilistic relational projection (SPRP), to learn a sparse projection matrix for relational dimensionality reduction. The sparsity in SPRP is achieved by imposing on the projection matrix a sparsity-inducing prior such as the Laplace prior or Jeffreys prior. We propose an expectation-maximization (EM) algorithm to learn the parameters of SPRP. Compared with PRPCA, the sparsity in SPRP not only makes the results more interpretable but also makes the projection operation much more efficient without compromising its accuracy. All these are verified by experiments conducted on several real applications.


Topic Correlation Analysis for Cross-Domain Text Classification

AAAI Conferences

Cross-domain text classification aims to automatically train a precise text classifier for a target domain by using labeled text data from a related source domain. To this end, the distribution gap between different domains has to be reduced. In previous works, a certain number of shared latent features (e.g., latent topics, principal components, etc.) are extracted to represent documents from different domains, and thus reduce the distribution gap. However, only relying the shared latent features as the domain bridge may limit the amount of knowledge transferred. This limitation is more serious when the distribution gap is so large that only a small number of latent features can be shared between domains. In this paper, we propose a novel approach named Topic Correlation Analysis (TCA), which extracts both the shared and the domain-specific latent features to facilitate effective knowledge transfer. In TCA, all word features are first grouped into the shared and the domain-specific topics using a joint mixture model. Then the correlations between the two kinds of topics are inferred and used to induce a mapping between the domain-specific topics from different domains. Finally, both the shared and the mapped domain-specific topics are utilized to span a new shared feature space where the supervised knowledge can be effectively transferred. The experimental results on two real-world data sets justify the superiority of the proposed method over the stat-of-the-art baselines.


Teaching Machines to Learn by Metaphors

AAAI Conferences

Humans have an uncanny ability to learn new concepts with very few examples. Cognitive theories have suggested that this is done by utilizing prior experience of related tasks. We propose to emulate this process in machines, by transforming new problems into old ones. These transformations are called metaphors. Obviously, the learner is not given a metaphor, but must acquire one through a learning process. We show that learning metaphors yield better results than existing transfer learning methods. Moreover, we argue that metaphors give a qualitative assessment of task relatedness.


Learning the Kernel Matrix with Low-Rank Multiplicative Shaping

AAAI Conferences

Selecting the optimal kernel is an important and difficult challenge in applying kernel methods to pattern recognition. To address this challenge, multiple kernel learning (MKL) aims to learn a kernel from a combination of base kernel functions that perform optimally on the task. In this paper, we propose a novel MKL-themed approach to combine base kernels that are multiplicatively shaped with low-rank positive semidefinitve matrices. The proposed approach generalizes several popular MKL methods and thus provides more flexibility in modeling data. Computationally, we show how these low-rank matrices can be learned efficiently from data using convex quadratic programming. Empirical studies on several standard benchmark datasets for MKL show that the new approach often improves prediction accuracy statistically significantly over very competitive single kernel and other MKL methods.


Kernel-Based Reinforcement Learning on Representative States

AAAI Conferences

Markov decision processes (MDPs) are an established framework for solving sequential decision-making problems under uncertainty. In this work, we propose a new method for batch-mode reinforcement learning (RL) with continuous state variables. The method is an approximation to kernel-based RL on a set of k representative states. Similarly to kernel-based RL, our solution is a fixed point of a kernelized Bellman operator and can approximate the optimal solution to an arbitrary level of granularity. Unlike kernel-based RL, our method is fast. In particular, our policies can be computed in O ( n ) time, where n is the number of training examples. The time complexity of kernel-based RL is Ω( n 2 ). We introduce our method, analyze its convergence, and compare it to existing work. The method is evaluated on two existing control problems with 2 to 4 continuous variables and a new problem with 64 variables. In all cases, we outperform state-of-the-art results and offer simpler solutions.