Goto

Collaborating Authors

 Education


Computational Intractability of Dictionary Learning for Sparse Representation

arXiv.org Machine Learning

In this paper we consider the dictionary learning problem for sparse representation. We first show that this problem is NP-hard by polynomial time reduction of the densest cut problem. Then, using successive convex approximation strategies, we propose efficient dictionary learning schemes to solve several practical formulations of this problem to stationary points. Unlike many existing algorithms in the literature, such as K-SVD, our proposed dictionary learning scheme is theoretically guaranteed to converge to the set of stationary points under certain mild assumptions. For the image denoising application, the performance and the efficiency of the proposed dictionary learning scheme are comparable to that of K-SVD algorithm in simulation.


Stop Wasting My Gradients: Practical SVRG

arXiv.org Machine Learning

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors in the control variate, and use this to derive variants of SVRG that use growing-batch strategies to reduce the number of gradient calculations required in the early iterations. We further (i) show how to exploit support vectors to reduce the number of gradient computations in the later iterations, (ii) prove that the commonly-used regularized SVRG iteration is justified and improves the convergence rate, (iii) consider alternate mini-batch selection strategies, and (iv) consider the generalization error of the method.


Optimal Rates for Random Fourier Features

arXiv.org Machine Learning

Kernel methods represent one of the most powerful tools in machine learning to tackle problems expressed in terms of function values and derivatives due to their capability to represent and model complex relations. While these methods show good versatility, they are computationally intensive and have poor scalability to large data as they require operations on Gram matrices. In order to mitigate this serious computational limitation, recently randomized constructions have been proposed in the literature, which allow the application of fast linear algorithms. Random Fourier features (RFF) are among the most popular and widely applied constructions: they provide an easily computable, low-dimensional feature representation for shift-invariant kernels. Despite the popularity of RFFs, very little is understood theoretically about their approximation quality. In this paper, we provide a detailed finite-sample theoretical analysis about the approximation quality of RFFs by (i) establishing optimal (in terms of the RFF dimension, and growing set size) performance guarantees in uniform norm, and (ii) presenting guarantees in $L^r$ ($1\le r<\infty$) norms. We also propose an RFF approximation to derivatives of a kernel with a theoretical study on its approximation quality.


Supervised Learning for Dynamical System Learning

arXiv.org Machine Learning

Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be difficult to use and extend in practice: e.g., they can make it difficult to incorporate prior information such as sparsity or structure. To address this problem, we present a new view of dynamical system learning: we show how to learn dynamical systems by solving a sequence of ordinary supervised learning problems, thereby allowing users to incorporate prior knowledge via standard techniques such as L1 regularization. Many existing spectral methods are special cases of this new framework, using linear regression as the supervised learner. We demonstrate the effectiveness of our framework by showing examples where nonlinear regression or lasso let us learn better state representations than plain linear regression does; the correctness of these instances follows directly from our general analysis.


Explore no more: Improved high-probability regret bounds for non-stochastic bandits

arXiv.org Machine Learning

This work addresses the problem of regret minimization in non-stochastic multi-armed bandit problems, focusing on performance guarantees that hold with high probability. Such results are rather scarce in the literature since proving them requires a large deal of technical effort and significant modifications to the standard, more intuitive algorithms that come only with guarantees that hold on expectation. One of these modifications is forcing the learner to sample arms from the uniform distribution at least $\Omega(\sqrt{T})$ times over $T$ rounds, which can adversely affect performance if many of the arms are suboptimal. While it is widely conjectured that this property is essential for proving high-probability regret bounds, we show in this paper that it is possible to achieve such strong results without this undesirable exploration component. Our result relies on a simple and intuitive loss-estimation strategy called Implicit eXploration (IX) that allows a remarkably clean analysis. To demonstrate the flexibility of our technique, we derive several improved high-probability bounds for various extensions of the standard multi-armed bandit framework. Finally, we conduct a simple experiment that illustrates the robustness of our implicit exploration technique.


Toward Adversarial Online Learning and the Science of Deceptive Machines

AAAI Conferences

Intelligent systems rely on pattern recognition and signature-based approaches for a wide range of sensors enhancing situational awareness. For example, autonomous systems depend on environmental sensors to perform their tasks and secure systems depend on anomaly detection methods. The availability of large amount of data requires the processing of data in a “streaming” fashion with online algorithms. Yet, just as online learning can enhance adaptability to a non-stationary environment, it introduces vulnerabilities that can be manipulated by adversaries to achieve their goals while evading detection. Although human intelligence might have evolved from social interactions, machine intelligence has evolved as a human intelligence artifact and been kept isolated to avoid ethical dilemmas. As our adversaries become sophisticated, it might be time to revisit this question and examine how we can combine online learning and reasoning leading to the science of deceptive and counter-deceptive machines.


Exploring the Use of Role Model Avatars in Educational Games

AAAI Conferences

Research has indicated that role models have the potential to boost academic performance. In this paper, we describe an experiment exploring role models as game avatars in an educational game. Of particular interest are the effects of these avatars on players' performance and engagement. Participants were randomly assigned to a condition: a) user selected role model avatar, or b) user selected shape avatar. Results suggest that role models are heavily preferred. African American participants had higher game affect in the role model condition. South Asian participants had higher self-reported engagement in the role model condition. Participants that completed <= 1 levels had higher performance in the role model condition. General trends suggest that the role model's gender and racial closeness with the player, could play a role in player performance and self-reported engagement as consistent with the social science literature.


Exploring Player Trace Segmentation for Dynamic Play Style Prediction

AAAI Conferences

Existing work on player modeling often assumes that the play style of players is static. However, our recent work shows evidence that players regularly change their play style over time. In this paper we propose a novel player modeling framework to capture this change by using episodic information and sequential machine learning techniques. In particular, we experiment with different trace segmentation strategies for play style prediction. We evaluate this new framework on gameplay data gathered from a game-based interactive learning environment. Our results show that sequential machine learning techniques that incorporate predictions from previous segments outperform non-sequential techniques. Our results also show that too fine (minute-by-minute) or too coarse (whole trace) segmentation of traces decreases performance.


Online Transfer Learning in Reinforcement Learning Domains

AAAI Conferences

This paper proposes an online transfer framework to capture the interaction among agents and shows that current transfer learning in reinforcement learning is a special case of online transfer. Furthermore, this paper re-characterizes existing agents-teaching-agents methods as online transfer and analyze one such teaching method in three ways. First, the convergence of Q-learning and Sarsa with tabular representation with a finite budget is proven. Second, the convergence of Q-learning and Sarsa with linear function approximation is established. Third, the we show the asymptotic performance cannot be hurt through teaching. Additionally, all theoretical results are empirically validated.


Formalizing Deceptive Reasoning in Breaking Bad: Default Reasoning in a Doxastic Logic

AAAI Conferences

The rich expressivity provided by the cognitive event calculus (CEC) knowledge representation framework allows for reasoning over deeply nested beliefs, desires, intentions, and so on. I put CEC to the test by attempting to model the complex reasoning and deceptive planning used in an episode of the popular television show Breaking Bad. CEC is used to represent the knowledge used by reasoners coming up with plans like the ones devised by the fictional characters I describe. However, it becomes clear that a form of nonmonotonic reasoning is necessary—specifically so that an agent can reason about the nonmonotonic beliefs of another agent. I show how CEC can be augmented to have this ability, and then provide examples detailing how my proposed augmentation enables much of the reasoning used by agents such as the Breaking Bad characters. I close by discussing what sort of reasoning tool would be necessary to implement such nonmonotonic reasoning.