Oates, Tim


Adaptive Normalized Risk-Averting Training for Deep Neural Networks

arXiv.org Machine Learning

This paper proposes a set of new error criteria and learning approaches, Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks (DNNs). Theoretically, we demonstrate its effectiveness in terms of global and local convexity, lower-bounded by the standard $L_p$-norm error. By analyzing the gradient on the convexity index $\lambda$, we explain why learning $\lambda$ adaptively using gradient descent works. In practice, we show how this method improves training of deep neural networks on visual recognition tasks with the MNIST and CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets + MSE/cross entropy. Performance on deep/shallow multilayer perceptrons and Denoising Auto-encoders is also explored. ANRAT can be combined with other quasi-Newton training methods, innovative network variants, regularization techniques and other specific tricks in DNNs. Other than unsupervised pretraining, it provides a new perspective to address the non-convex optimization problem in DNNs.
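For illustration, here is a minimal sketch of a risk-averting-style loss with a learnable convexity index $\lambda$, assuming the common normalized risk-averting error form $\frac{1}{\lambda}\ln\frac{1}{m}\sum_k e^{\lambda \epsilon_k}$ over per-example errors $\epsilon_k$; the exact ANRAT criterion, its normalization, and its penalty on $\lambda$ may differ. The class name and penalty weight below are illustrative, not taken from the paper.

```python
# A minimal sketch (not the exact ANRAT criterion): a normalized
# risk-averting-style loss whose convexity index lambda is a free
# parameter updated by gradient descent together with the network weights.
import math
import torch
import torch.nn as nn

class RiskAvertingLoss(nn.Module):          # illustrative name
    def __init__(self, init_lam=1.0, penalty=1e-2):
        super().__init__()
        # parameterize lambda through its log so it stays positive
        self.log_lam = nn.Parameter(torch.tensor(math.log(init_lam)))
        self.penalty = penalty              # assumed regularizer weight on lambda

    def forward(self, pred, target):
        lam = self.log_lam.exp()
        err = ((pred - target) ** 2).flatten(1).sum(dim=1)   # per-example squared error
        m = err.shape[0]
        # (1/lambda) * log( mean_k exp(lambda * err_k) ), computed stably
        nrae = (torch.logsumexp(lam * err, dim=0) - math.log(m)) / lam
        # keep lambda from collapsing toward zero, where the criterion
        # degenerates to the plain mean error
        return nrae + self.penalty / lam
```

The criterion and the network can then share one optimizer, e.g. torch.optim.SGD(list(model.parameters()) + list(criterion.parameters()), lr=0.01), so that $\lambda$ is adapted by the same gradient steps as the weights.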


Adaptive Normalized Risk-Averting Training for Deep Neural Networks

AAAI Conferences

This paper proposes a set of new error criteria and a learning approach, called Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks without pretraining. Theoretically, we demonstrate its effectiveness based on the expansion of the convexity region. By analyzing the gradient on the convexity index $\lambda$, we explain why our learning method using gradient descent works. In practice, we show how this training method is successfully applied to improve the training of deep neural networks on visual recognition tasks with the MNIST and CIFAR-10 datasets. Using simple experimental settings without pretraining and other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets + MSE/cross entropy. Performance on deep/shallow multilayer perceptrons and Denoising Auto-encoders is also explored. ANRAT can be combined with other quasi-Newton training methods, innovative network variants, regularization techniques and other common tricks in DNNs. Other than unsupervised pretraining, it provides a new perspective to address the non-convex optimization problem in training DNNs.


Imaging Time-Series to Improve Classification and Imputation

AAAI Conferences

Inspired by recent successes of deep learning in computer vision, we propose a novel framework for encoding time series as different types of images, namely, Gramian Angular Summation/Difference Fields (GASF/GADF) and Markov Transition Fields (MTF). This enables the use of techniques from computer vision for time series classification and imputation. We used Tiled Convolutional Neural Networks (tiled CNNs) on 20 standard datasets to learn high-level features from the individual and compound GASF-GADF-MTF images. Our approaches achieve highly competitive results when compared to nine of the current best time series classification approaches. Inspired by the bijection property of GASF on 0/1 rescaled data, we train Denoising Auto-encoders (DA) on the GASF images of four standard and one synthesized compound dataset. The imputation MSE on test data is reduced by 12.18%–48.02% when compared to using the raw data. An analysis of the features and weights learned via tiled CNNs and DAs explains why the approaches work.
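For concreteness, a minimal NumPy sketch of the GASF/GADF encodings referenced above, following the standard construction (rescale the series to [-1, 1], map each value to a polar angle, then take pairwise trigonometric sums and differences); preprocessing such as piecewise aggregate approximation used in the paper is omitted here.

```python
# Sketch of the Gramian Angular Field encodings: GASF[i, j] = cos(phi_i + phi_j)
# and GADF[i, j] = sin(phi_i - phi_j), with phi the polar angle of the rescaled series.
import numpy as np

def gramian_angular_fields(x):
    """Return (GASF, GADF) for a 1-D time series x."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    x_tilde = (2 * x - x_max - x_min) / (x_max - x_min)   # rescale to [-1, 1]
    x_tilde = np.clip(x_tilde, -1.0, 1.0)                 # guard against rounding
    phi = np.arccos(x_tilde)                              # polar angle encoding
    gasf = np.cos(phi[:, None] + phi[None, :])            # trigonometric sum
    gadf = np.sin(phi[:, None] - phi[None, :])            # trigonometric difference
    return gasf, gadf
```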


Imaging Time-Series to Improve Classification and Imputation

arXiv.org Machine Learning

Inspired by recent successes of deep learning in computer vision, we propose a novel framework for encoding time series as different types of images, namely, Gramian Angular Summation/Difference Fields (GASF/GADF) and Markov Transition Fields (MTF). This enables the use of techniques from computer vision for time series classification and imputation. We used Tiled Convolutional Neural Networks (tiled CNNs) on 20 standard datasets to learn high-level features from the individual and compound GASF-GADF-MTF images. Our approaches achieve highly competitive results when compared to nine of the current best time series classification approaches. Inspired by the bijection property of GASF on 0/1 rescaled data, we train Denoising Auto-encoders (DA) on the GASF images of four standard and one synthesized compound dataset. The imputation MSE on test data is reduced by 12.18%–48.02% when compared to using the raw data. An analysis of the features and weights learned via tiled CNNs and DAs explains why the approaches work.


Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks

AAAI Conferences

Inspired by recent successes of deep learning in computer vision and speech recognition, we propose a novel framework to encode time series data as different types of images, namely, Gramian Angular Fields (GAF) and Markov Transition Fields (MTF). This enables the use of techniques from computer vision for classification. Using a polar coordinate system, GAF images are represented as a Gramian matrix where each element is the trigonometric sum (i.e., superposition of directions) between different time intervals. MTF images represent the first-order Markov transition probability along one dimension and temporal dependency along the other. We used Tiled Convolutional Neural Networks (tiled CNNs) on 12 standard datasets to learn high-level features from individual GAF, MTF, and GAF-MTF images that resulted from combining GAF and MTF representations into a single image. The classification results of our approach are competitive with five state-of-the-art approaches. An analysis of the features and weights learned via tiled CNNs explains why the approach works.
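The MTF construction described here can be sketched in a few lines of NumPy, assuming quantile binning of the series; the bin count and any blurring or aggregation step applied afterwards are choices this sketch does not fix.

```python
# Sketch of the Markov Transition Field: quantile-bin the series, estimate the
# first-order transition matrix between bins, then spread those probabilities
# over all pairs of time steps so that MTF[i, j] = P(bin(x_j) | bin(x_i)).
import numpy as np

def markov_transition_field(x, n_bins=8):
    x = np.asarray(x, dtype=float)
    # assign each point to a quantile bin 0 .. n_bins-1
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)
    # first-order Markov transition counts between consecutive points
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(q[:-1], q[1:]):
        W[a, b] += 1
    W = W / np.maximum(W.sum(axis=1, keepdims=True), 1)   # row-normalize
    # temporal dependency: index the transition matrix by every pair (i, j)
    return W[np.ix_(q, q)]
```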


The Thing That We Tried Didn't Work Very Well: Deictic Representation in Reinforcement Learning

arXiv.org Artificial Intelligence

Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Deictic representations are believed to be a viable alternative: they promise generalization while allowing the use of existing reinforcement-learning methods. Yet, there are few experiments on learning with deictic representations reported in the literature. In this paper we explore the effectiveness of two forms of deictic representation and a naïve propositional representation in a simple blocks-world domain. We find, empirically, that the deictic representations actually worsen learning performance. We conclude with a discussion of possible causes of these results and strategies for more effective learning in domains with objects.
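To make the contrast concrete, here is a small illustrative Python sketch (the feature names and world encoding are ours, not the paper's) of a propositional blocks-world state, which enumerates a fact for every pair of blocks, versus a deictic one, which describes the world only relative to a movable focus marker.

```python
# Illustrative only: contrasting a propositional state encoding with a
# deictic one in a toy blocks world. `on` maps a block to the block it
# rests on; blocks absent from `on` sit on the table.
def propositional_state(blocks, on):
    # one boolean feature per ordered pair of blocks: on(a, b)
    return {f"on({a},{b})": on.get(a) == b for a in blocks for b in blocks}

def deictic_state(focus, on):
    # a fixed, small set of features about "the block I am attending to"
    above = [a for a, b in on.items() if b == focus]   # blocks resting on focus
    return {
        "focus-is-clear": len(above) == 0,
        "focus-on-table": focus not in on,
        "one-block-above-focus": len(above) == 1,
    }

blocks = ["A", "B", "C"]
on = {"A": "B"}                              # A sits on B; B and C are on the table
print(propositional_state(blocks, on))       # 9 features; grows quadratically with the block count
print(deictic_state("B", on))                # 3 features, independent of world size
```

The point of the contrast is the one made in the abstract: the propositional encoding grows with the square of the number of blocks, while the deictic encoding stays a small, fixed size regardless of world size.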


Toward an Integrated Metacognitive Architecture

AAAI Conferences

Researchers have studied problems in metacognition both in computers and in humans. In response, some have implemented models of cognition and metacognitive activity in various architectures to test and better define specific theories of metacognition. However, current theories and implementations suffer from numerous problems and a lack of detail. Here we illustrate the problems with two different computational approaches. The Meta-Cognitive Loop and Meta-AQUA both examine the metacognitive reasoning involved in monitoring and reasoning about failures of expectations, and they both learn from such experiences. But neither system presents a full accounting of the variety of known metacognitive phenomena, and, as far as we know, no extant system does. The problem is that no existing cognitive architecture directly addresses metacognition. Instead, current architectures were initially developed to study more narrow cognitive functions and only later were they modified to include higher-level attributes. We claim that the solution is to develop a metacognitive architecture outright, and we begin to outline the structure that such a foundation might have.


Model AI Assignments 2011

AAAI Conferences

The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of the student learning experience, we present here abstracts of three AI assignments from the 2011 session that are easily adoptable, playfully engaging, and flexible for a variety of instructor needs.


Reports of the AAAI 2009 Spring Symposia

AI Magazine

The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University's Department of Computer Science, was pleased to present the 2009 Spring Symposium Series, held Monday through Wednesday, March 23–25, 2009, at Stanford University. The titles of the nine symposia were Agents that Learn from Human Teachers, Benchmarking of Qualitative Spatial and Temporal Reasoning Systems, Experimental Design for Real-World Systems, Human Behavior Modeling, Intelligent Event Processing, Intelligent Narrative Technologies II, Learning by Reading and Learning to Read, Social Semantic Web: Where Web 2.0 Meets Web 3.0, and Technosocial Predictive Analytics. The goal of the Agents that Learn from Human Teachers symposium was to investigate how we can enable software and robotics agents to learn from real-time interaction with an everyday human partner. The aim of the Benchmarking of Qualitative Spatial and Temporal Reasoning Systems symposium was to initiate the development of a problem repository in the field of qualitative spatial and temporal reasoning and identify a graded set of challenges for future midterm and long-term research. The Experimental Design symposium discussed the challenges of evaluating AI systems. The Human Behavior Modeling symposium explored reasoning methods for understanding various aspects of human behavior, especially in the context of designing intelligent systems that interact with humans. The Intelligent Event Processing symposium discussed the need for more AI-based approaches in event processing and defined a kind of research agenda for the field, coined as intelligent complex event processing (iCEP). The Intelligent Narrative Technologies II AAAI symposium discussed innovations, progress, and novel techniques in the research domain. The Learning by Reading and Learning to Read symposium explored two aspects of making natural language texts semantically accessible to, and processable by, machines. The Social Semantic Web symposium focused on the real-world grand challenges in this area. Finally, the Technosocial Predictive Analytics symposium explored new methods for anticipatory analytical thinking that provide decision advantage through the integration of human and physical models.