Education
Computational Intractability of Dictionary Learning for Sparse Representation
Razaviyayn, Meisam, Tseng, Hung-Wei, Luo, Zhi-Quan
In this paper we consider the dictionary learning problem for sparse representation. We first show that this problem is NP-hard by polynomial time reduction of the densest cut problem. Then, using successive convex approximation strategies, we propose efficient dictionary learning schemes to solve several practical formulations of this problem to stationary points. Unlike many existing algorithms in the literature, such as K-SVD, our proposed dictionary learning scheme is theoretically guaranteed to converge to the set of stationary points under certain mild assumptions. For the image denoising application, the performance and the efficiency of the proposed dictionary learning scheme are comparable to that of K-SVD algorithm in simulation.
Stop Wasting My Gradients: Practical SVRG
Babanezhad, Reza, Ahmed, Mohamed Osama, Virani, Alim, Schmidt, Mark, Konečný, Jakub, Sallinen, Scott
We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors in the control variate, and use this to derive variants of SVRG that use growing-batch strategies to reduce the number of gradient calculations required in the early iterations. We further (i) show how to exploit support vectors to reduce the number of gradient computations in the later iterations, (ii) prove that the commonly-used regularized SVRG iteration is justified and improves the convergence rate, (iii) consider alternate mini-batch selection strategies, and (iv) consider the generalization error of the method.
Optimal Rates for Random Fourier Features
Sriperumbudur, Bharath K., Szabo, Zoltan
Kernel methods represent one of the most powerful tools in machine learning to tackle problems expressed in terms of function values and derivatives due to their capability to represent and model complex relations. While these methods show good versatility, they are computationally intensive and have poor scalability to large data as they require operations on Gram matrices. In order to mitigate this serious computational limitation, recently randomized constructions have been proposed in the literature, which allow the application of fast linear algorithms. Random Fourier features (RFF) are among the most popular and widely applied constructions: they provide an easily computable, low-dimensional feature representation for shift-invariant kernels. Despite the popularity of RFFs, very little is understood theoretically about their approximation quality. In this paper, we provide a detailed finite-sample theoretical analysis about the approximation quality of RFFs by (i) establishing optimal (in terms of the RFF dimension, and growing set size) performance guarantees in uniform norm, and (ii) presenting guarantees in $L^r$ ($1\le r<\infty$) norms. We also propose an RFF approximation to derivatives of a kernel with a theoretical study on its approximation quality.
Supervised Learning for Dynamical System Learning
Hefny, Ahmed, Downey, Carlton, Gordon, Geoffrey
Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be difficult to use and extend in practice: e.g., they can make it difficult to incorporate prior information such as sparsity or structure. To address this problem, we present a new view of dynamical system learning: we show how to learn dynamical systems by solving a sequence of ordinary supervised learning problems, thereby allowing users to incorporate prior knowledge via standard techniques such as L1 regularization. Many existing spectral methods are special cases of this new framework, using linear regression as the supervised learner. We demonstrate the effectiveness of our framework by showing examples where nonlinear regression or lasso let us learn better state representations than plain linear regression does; the correctness of these instances follows directly from our general analysis.
Explore no more: Improved high-probability regret bounds for non-stochastic bandits
This work addresses the problem of regret minimization in non-stochastic multi-armed bandit problems, focusing on performance guarantees that hold with high probability. Such results are rather scarce in the literature since proving them requires a large deal of technical effort and significant modifications to the standard, more intuitive algorithms that come only with guarantees that hold on expectation. One of these modifications is forcing the learner to sample arms from the uniform distribution at least $\Omega(\sqrt{T})$ times over $T$ rounds, which can adversely affect performance if many of the arms are suboptimal. While it is widely conjectured that this property is essential for proving high-probability regret bounds, we show in this paper that it is possible to achieve such strong results without this undesirable exploration component. Our result relies on a simple and intuitive loss-estimation strategy called Implicit eXploration (IX) that allows a remarkably clean analysis. To demonstrate the flexibility of our technique, we derive several improved high-probability bounds for various extensions of the standard multi-armed bandit framework. Finally, we conduct a simple experiment that illustrates the robustness of our implicit exploration technique.
Toward Adversarial Online Learning and the Science of Deceptive Machines
Abramson, Myriam (US Naval Research Laboratory)
Intelligent systems rely on pattern recognition and signature-based approaches for a wide range of sensors enhancing situational awareness. For example, autonomous systems depend on environmental sensors to perform their tasks and secure systems depend on anomaly detection methods. The availability of large amount of data requires the processing of data in a “streaming” fashion with online algorithms. Yet, just as online learning can enhance adaptability to a non-stationary environment, it introduces vulnerabilities that can be manipulated by adversaries to achieve their goals while evading detection. Although human intelligence might have evolved from social interactions, machine intelligence has evolved as a human intelligence artifact and been kept isolated to avoid ethical dilemmas. As our adversaries become sophisticated, it might be time to revisit this question and examine how we can combine online learning and reasoning leading to the science of deceptive and counter-deceptive machines.
Exploring the Use of Role Model Avatars in Educational Games
Kao, Dominic (Massachusetts Institute of Technology) | Harrell, D. Fox (Massachusetts Institute of Technology)
Research has indicated that role models have the potential to boost academic performance. In this paper, we describe an experiment exploring role models as game avatars in an educational game. Of particular interest are the effects of these avatars on players' performance and engagement. Participants were randomly assigned to a condition: a) user selected role model avatar, or b) user selected shape avatar. Results suggest that role models are heavily preferred. African American participants had higher game affect in the role model condition. South Asian participants had higher self-reported engagement in the role model condition. Participants that completed <= 1 levels had higher performance in the role model condition. General trends suggest that the role model's gender and racial closeness with the player, could play a role in player performance and self-reported engagement as consistent with the social science literature.
Exploring Player Trace Segmentation for Dynamic Play Style Prediction
Valls-Vargas, Josep (Drexel University) | Ontañón, Santiago (Drexel University) | Zhu, Jichen (Drexel University)
Existing work on player modeling often assumes that the play style of players is static. However, our recent work shows evidence that players regularly change their play style over time. In this paper we propose a novel player modeling framework to capture this change by using episodic information and sequential machine learning techniques. In particular, we experiment with different trace segmentation strategies for play style prediction. We evaluate this new framework on gameplay data gathered from a game-based interactive learning environment. Our results show that sequential machine learning techniques that incorporate predictions from previous segments outperform non-sequential techniques. Our results also show that too fine (minute-by-minute) or too coarse (whole trace) segmentation of traces decreases performance.
Online Transfer Learning in Reinforcement Learning Domains
Zhan, Yusen (Washington State University) | Taylor, Mattew E. (Washington State University)
This paper proposes an online transfer framework to capture the interaction among agents and shows that current transfer learning in reinforcement learning is a special case of online transfer. Furthermore, this paper re-characterizes existing agents-teaching-agents methods as online transfer and analyze one such teaching method in three ways. First, the convergence of Q-learning and Sarsa with tabular representation with a finite budget is proven. Second, the convergence of Q-learning and Sarsa with linear function approximation is established. Third, the we show the asymptotic performance cannot be hurt through teaching. Additionally, all theoretical results are empirically validated.
Formalizing Deceptive Reasoning in Breaking Bad: Default Reasoning in a Doxastic Logic
Licato, John (Indiana University and Purdue University, Fort Wayne)
The rich expressivity provided by the cognitive event calculus (CEC) knowledge representation framework allows for reasoning over deeply nested beliefs, desires, intentions, and so on. I put CEC to the test by attempting to model the complex reasoning and deceptive planning used in an episode of the popular television show Breaking Bad. CEC is used to represent the knowledge used by reasoners coming up with plans like the ones devised by the fictional characters I describe. However, it becomes clear that a form of nonmonotonic reasoning is necessary—specifically so that an agent can reason about the nonmonotonic beliefs of another agent. I show how CEC can be augmented to have this ability, and then provide examples detailing how my proposed augmentation enables much of the reasoning used by agents such as the Breaking Bad characters. I close by discussing what sort of reasoning tool would be necessary to implement such nonmonotonic reasoning.