Kurup, Unmesh
Evolving GANs: When Contradictions Turn into Compliance
Dhar, Sauptik, Heydari, Javad, Tripathi, Samarth, Kurup, Unmesh, Shah, Mohak
Limited availability of labeled data makes any supervised learning problem challenging. Alternative learning settings like semi-supervised and universum learning alleviate the dependency on labeled data, but still require a large amount of unlabeled data, which may be unavailable or expensive to acquire. GAN-based synthetic data generation methods have recently shown promise by generating synthetic samples that improve the task at hand. However, these samples cannot be used for other purposes. In this paper, we propose a GAN game that improves discriminator accuracy under limited data settings while generating realistic synthetic data. This has the added advantage that the generated data can now be used for other, similar tasks. We provide theoretical guarantees and empirical results in support of our approach.
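For context, the sketch below shows only the standard two-player GAN game (alternating discriminator and generator updates with the non-saturating generator loss) on toy 1-D Gaussian data; it is not the modified game proposed in the paper, and the network sizes, learning rates, and data distribution are illustrative assumptions.

```python
# A minimal sketch of the baseline (non-saturating) GAN game on 1-D Gaussian data.
# This illustrates the standard two-player objective only, not the paper's
# limited-data variant.
import torch
import torch.nn as nn

def make_mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, out_dim))

G = make_mlp(4, 1)            # generator: noise -> sample
D = make_mlp(1, 1)            # discriminator: sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0     # "real" data: N(2, 0.5^2)
    fake = G(torch.randn(64, 4))

    # Discriminator update: push real samples toward label 1, fakes toward 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update (non-saturating): make the discriminator label fakes as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```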
Stabilizing Bi-Level Hyperparameter Optimization using Moreau-Yosida Regularization
Dhar, Sauptik, Kurup, Unmesh, Shah, Mohak
This research proposes using the Moreau-Yosida envelope to stabilize the convergence behavior of bi-level hyperparameter optimization solvers, and introduces a new algorithm, Moreau-Yosida regularized Hyperparameter Optimization (MY-HPO). Theoretical analysis of the correctness of the MY-HPO solution and an initial convergence analysis are also provided. Our empirical results show significant improvement in loss values for a fixed computation budget compared to state-of-the-art bi-level HPO solvers.
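For reference, the Moreau-Yosida envelope of a function f is e_lam f(x) = min_y [ f(y) + (1/(2*lam)) * ||x - y||^2 ], and taking gradient steps on the envelope corresponds to the proximal-point iteration on f. The sketch below illustrates only that generic smoothing idea on a toy 1-D nonsmooth objective; the objective f, the grid-based prox computation, and the step count are illustrative assumptions, and this is not the MY-HPO algorithm itself.

```python
# A minimal sketch of Moreau-Yosida regularization on a toy 1-D objective.
# e_lam f(x) = min_y f(y) + (1/(2*lam)) * (x - y)^2; iterating the prox operator
# is the proximal-point method on f. Illustration only, not MY-HPO.
import numpy as np

def f(y):
    return np.abs(y) + 0.5 * (y - 2.0) ** 2   # nonsmooth toy objective, minimized at y = 1

def prox(x, lam, grid=np.linspace(-10, 10, 20001)):
    # prox_{lam f}(x) = argmin_y f(y) + (1/(2*lam)) * (x - y)^2  (grid search for clarity)
    vals = f(grid) + (grid - x) ** 2 / (2.0 * lam)
    return grid[np.argmin(vals)]

x, lam = 5.0, 0.5
for k in range(20):
    x = prox(x, lam)          # proximal-point step: x_{k+1} = prox_{lam f}(x_k)
print("approximate minimizer of f:", x)
```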
Pruning Algorithms to Accelerate Convolutional Neural Networks for Edge Applications: A Survey
Liu, Jiayi, Tripathi, Samarth, Kurup, Unmesh, Shah, Mohak
With the general trend of increasing Convolutional Neural Network (CNN) model sizes, model compression and acceleration techniques have become critical for the deployment of these models on edge devices. In this paper, we provide a comprehensive survey on Pruning, a major compression strategy that removes non-critical or redundant neurons from a CNN model. The survey covers the overarching motivation for pruning, different strategies and criteria, their advantages and drawbacks, along with a compilation of major pruning techniques. We conclude the survey with a discussion on alternatives to pruning and current challenges for the model compression community.
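As one concrete example of the family of criteria such surveys cover, the sketch below shows plain global magnitude pruning: weights with the smallest absolute values are treated as non-critical and masked to zero at a chosen sparsity level. The function name, the sparsity value, and the toy weight tensor are illustrative assumptions rather than any specific method from the survey.

```python
# A minimal sketch of global magnitude pruning: zero out the fraction of weights
# with the smallest absolute values.
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-|w| entries."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    return (np.abs(weights) > threshold).astype(weights.dtype)

w = np.random.randn(4, 4).astype(np.float32)       # e.g. one flattened convolution kernel
mask = magnitude_prune(w, sparsity=0.75)
pruned_w = w * mask
print("kept", int(mask.sum()), "of", w.size, "weights")
```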
Robust Neural Network Training using Periodic Sampling over Model Weights
Tripathi, Samarth, Liu, Jiayi, Kurup, Unmesh, Shah, Mohak
Deep neural networks provide best-in-class performance for a number of computer vision problems. However, training these networks is computationally intensive and requires fine-tuning various hyperparameters. In addition, performance swings widely as the network converges, making it hard to decide when to stop training. In this paper, we introduce a trio of techniques (PSWA, PWALKS, and PSWM) centered around periodic sampling of model weights that provide consistent and more robust convergence on a variety of vision problems (classification, detection, segmentation) and gradient update methods (vanilla SGD, Momentum, Adam) with marginal additional computation time. Our techniques use existing optimal training policies but converge in a less volatile fashion with performance improvements that are approximately monotonic. Our analysis of the loss surface shows that these techniques also produce minima that are deeper and wider than those found by SGD.
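To make "periodic sampling of model weights" concrete, the sketch below keeps a running average of weight snapshots taken every few steps of an otherwise unchanged training loop. The placeholder train_step, the snapshot period, and the decision to deploy the averaged weights are illustrative assumptions; the exact PSWA, PWALKS, and PSWM procedures are defined in the paper.

```python
# A minimal sketch of periodically sampling and averaging model weight snapshots
# during training (the general idea behind schemes such as PSWA).
import numpy as np

def train_step(weights):
    # placeholder for one SGD/Momentum/Adam update; here we just drift toward 1 with noise
    return weights - 0.01 * (weights - 1.0) + 0.05 * np.random.randn(*weights.shape)

weights = np.zeros(10)
avg, n_snapshots, period = np.zeros_like(weights), 0, 50

for step in range(1, 2001):
    weights = train_step(weights)
    if step % period == 0:                     # periodically sample the weights
        n_snapshots += 1
        avg += (weights - avg) / n_snapshots   # running average of snapshots
weights = avg                                   # continue/evaluate from the averaged weights
print("averaged over", n_snapshots, "snapshots")
```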
Is it Safe to Drive? An Overview of Factors, Challenges, and Datasets for Driveability Assessment in Autonomous Driving
Guo, Junyao, Kurup, Unmesh, Shah, Mohak
With recent advances in learning algorithms and hardware development, autonomous cars have shown promise when operating in structured environments under good driving conditions. However, for complex, cluttered, and unseen environments with high uncertainty, autonomous driving systems still frequently demonstrate erroneous or unexpected behaviors that could lead to catastrophic outcomes. Autonomous vehicles should ideally adapt to driving conditions; while this can be achieved through multiple routes, it would be beneficial as a first step to be able to characterize driveability in some quantified form. To this end, this paper aims to create a framework for investigating the different factors that can impact driveability. One of the main mechanisms for adapting autonomous driving systems to any driving condition is the ability to learn and generalize from representative scenarios, and the machine learning algorithms that currently do so learn predominantly in a supervised manner and consequently need sufficient data for robust and efficient learning. We therefore also survey publicly available driving datasets: we categorize the datasets according to use cases, and highlight the datasets that capture complicated and hazardous driving conditions and can therefore be better used for training robust driving models. Furthermore, by discussing which driving scenarios are not covered by existing public datasets and which driveability factors need more investigation and data acquisition, this paper aims to encourage both targeted dataset collection and the proposal of novel driveability metrics that enhance the robustness of autonomous cars in adverse environments. Despite being tested in highly controlled settings, autonomous cars still occasionally fail to make correct decisions, often with catastrophic results. According to accident records, failures are most likely to happen in complex or unseen driving environments. The fact remains that while autonomous cars can operate well in controlled or structured environments such as highways, they are still far from reliable when operating in cluttered, unstructured, or unseen environments [2]. These observations apply to autonomous vehicles in general, and different application fields also suggest that driveability could be quantified in different forms, either as a single metric or a composition of metrics. For example, with ADAS and current Level 2 or 3 autonomy, a scene can simply be defined as driveable if the car can operate safely in autonomous mode; when a non-driveable scene is detected, the autonomous car can hand over control to the human driver in a timely manner [4].
Make (Nearly) Every Neural Network Better: Generating Neural Network Ensembles by Weight Parameter Resampling
Liu, Jiayi, Tripathi, Samarth, Kurup, Unmesh, Shah, Mohak
Deep Neural Networks (DNNs) have become increasingly popular in computer vision, natural language processing, and other areas. However, training and fine-tuning a deep learning model is computationally intensive and time-consuming. We propose a new method to improve the performance of nearly every model, including pre-trained models. The proposed method uses an ensemble approach where the networks in the ensemble are constructed by reassigning model parameter values based on the probabilistic distribution of these parameters, calculated towards the end of the training process. For pre-trained models, this approach requires an additional training step (usually less than one epoch). We perform a variety of analyses using the MNIST dataset and validate the approach with a number of DNN models pre-trained on the ImageNet dataset.
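As a rough illustration of the construction described above, the sketch below estimates a per-parameter Gaussian from weight snapshots collected near the end of training, resamples ensemble members from it, and averages their predictions. The Gaussian assumption, the placeholder predict function, and the snapshot and ensemble sizes are illustrative assumptions, not the paper's exact resampling scheme.

```python
# A minimal sketch of building an ensemble by resampling each weight from a
# per-parameter distribution estimated from late-training snapshots.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are flattened weight vectors saved over the last few training steps.
snapshots = np.stack([rng.normal(loc=1.0, scale=0.1, size=100) for _ in range(10)])
mu, sigma = snapshots.mean(axis=0), snapshots.std(axis=0)

def predict(weights, x):
    # placeholder model: a linear score; a real model would reload `weights` into a network
    return x @ weights

x = rng.normal(size=(5, 100))                        # a small batch of inputs
members = [rng.normal(mu, sigma) for _ in range(8)]  # resampled ensemble members
ensemble_pred = np.mean([predict(w, x) for w in members], axis=0)
print(ensemble_pred.shape)
```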
Effective Building Block Design for Deep Convolutional Neural Networks using Search
Dutta, Jayanta K, Liu, Jiayi, Kurup, Unmesh, Shah, Mohak
Deep learning has shown promising results on many machine learning tasks, but DL models are often complex networks with large numbers of neurons and layers and, more recently, complex layer structures known as building blocks. Finding the best deep model requires a combination of finding both the right architecture and the correct set of parameters appropriate for that architecture. In addition, this complexity (in terms of layer types, number of neurons, and number of layers) also presents problems with generalization, since larger networks are easier to overfit to the data. In this paper, we propose a search framework for finding effective architectural building blocks for convolutional neural networks (CNNs). Our approach is much faster at finding models that are close to state-of-the-art in performance. In addition, the models discovered by our approach are smaller than those discovered by similar techniques. We achieve these twin advantages by designing our search space so that it searches over a reduced set of state-of-the-art building blocks for CNNs, including the residual block, inception block, inception-residual block, ResNeXt block, and many others. We apply this technique to generate models for multiple image datasets and show that these models achieve performance comparable to the state-of-the-art (and even surpass it in one case). We also show that the learned models are transferable between datasets.
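To illustrate what searching over a reduced set of building blocks can look like in its simplest form, the sketch below samples configurations from a small block vocabulary and keeps the best-scoring one. The block list, width/depth choices, and the placeholder scoring function are all illustrative assumptions and do not reflect the paper's actual search space or evaluation procedure.

```python
# A minimal sketch of sampled search over a small space of CNN building-block choices.
import itertools
import random

BLOCK_TYPES = ["residual", "inception", "inception_residual", "resnext"]
WIDTHS = [16, 32, 64]
DEPTHS = [2, 3, 4]

def score(config):
    # placeholder: in practice, build the CNN from `config`, train briefly,
    # and return validation accuracy; here we return a deterministic pseudo-score
    return random.Random(hash(config)).random()

search_space = list(itertools.product(BLOCK_TYPES, WIDTHS, DEPTHS))
candidates = random.sample(search_space, k=12)      # sampled search instead of exhaustive
best = max(candidates, key=score)
print("best block config found:", best)
```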
Using Expectations to Drive Cognitive Behavior
Kurup, Unmesh (Carnegie Mellon University) | Lebiere, Christian (Carnegie Mellon University) | Stentz, Anthony (Carnegie Mellon University) | Hebert, Martial (Carnegie Mellon University)
Generating future states of the world is an essential component of high-level cognitive tasks such as planning. We explore the notion that such future-state generation is more widespread and forms an integral part of cognition. We call these generated states expectations, and propose that cognitive systems constantly generate expectations, match them to observed behavior and react when a difference exists between the two. We describe an ACT-R model that performs expectation-driven cognition on two tasks – pedestrian tracking and behavior classification. The model generates expectations of pedestrian movements to track them. The model also uses differences in expectations to identify distinctive features that differentiate these tracks. During learning, the model learns the association between these features and the various behaviors. During testing, it classifies pedestrian tracks by recalling the behavior associated with the features of each track. We tested the model on both single and multiple behavior datasets and compared the results against a k-NN classifier. The k-NN classifier outperformed the model in correct classifications, but the model had fewer incorrect classifications in the multiple behavior case, and both systems had about equal incorrect classifications in the single behavior case.
Integrating Constraint Satisfaction and Spatial Reasoning
Kurup, Unmesh (Rensselaer Polytechnic Institute) | Cassimatis, Nicholas L. (Rensselaer Polytechnic Institute)
Many problems in AI, including planning, logical reasoning and probabilistic inference, have been shown to reduce to (weighted) constraint satisfaction. While there are a number of approaches for solving such problems, the recent gains in efficiency of the satisfiability approach have made SAT solvers a popular choice. Modern propositional SAT solvers are efficient for a wide variety of problems. However, particularly in the case of spatial reasoning, conversion to propositional SAT can sometimes result in a large number of variables and/or clauses. Moreover, spatial reasoning problems can often be more efficiently solved if the agent is able to exploit the geometric nature of space to make better choices during search and backtracking. The result of these two drawbacks — larger problem sizes and inefficient search — is that even simple spatial constraint problems are often intractable in the SAT approach. In this paper we propose a spatial reasoning system that provides significant performance improvements in constraint satisfaction problems involving spatial predicates. The key to our approach is to integrate a diagrammatic representation with a DPLL-based backtracking algorithm that is specialized for spatial reasoning. The resulting integrated system can be applied to larger and more complex problems than current approaches and can be adopted to improve performance in a variety of problems ranging from planning to probabilistic inference.
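For readers unfamiliar with the base procedure being specialized, the sketch below is a plain DPLL backtracking solver over propositional clauses (lists of nonzero integers, with negative values for negated literals). It is the generic algorithm only, without the diagrammatic spatial layer described in the paper, and the example clause set is an illustrative assumption.

```python
# A minimal sketch of the plain DPLL backtracking procedure (unit propagation plus
# branching); generic propositional version, no spatial specialization.
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})

    def simplify(cls, lit):
        out = []
        for c in cls:
            if lit in c:
                continue                       # clause satisfied by `lit`
            reduced = [l for l in c if l != -lit]
            if not reduced:
                return None                    # empty clause: conflict
            out.append(reduced)
        return out

    changed = True                             # unit propagation
    while changed:
        changed = False
        for c in clauses:
            if len(c) == 1:
                lit = c[0]
                assignment[abs(lit)] = lit > 0
                clauses = simplify(clauses, lit)
                if clauses is None:
                    return None
                changed = True
                break
    if not clauses:
        return assignment                      # all clauses satisfied
    lit = clauses[0][0]                        # branch on the first remaining literal
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None

# Example: returns a satisfying assignment such as {1: True, 3: True, 2: False}
print(dpll([[1, 2], [-1, 3], [-2, -3]]))
```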
Reports of the AAAI 2009 Fall Symposia
Azevedo, Roger (University of Memphis) | Bench-Capon, Trevor (University of Liverpool) | Biswas, Gautam (Vanderbilt University) | Carmichael, Ted (University of North Carolina at Charlotte) | Green, Nancy (University of North Carolina at Greensboro) | Hadzikadic, Mirsad (University of North Carolina at Charlotte) | Koyejo, Oluwasanmi (University of Texas) | Kurup, Unmesh (Rensselaer Polytechnic Institute) | Parsons, Simon (Brooklyn College, City University of New York) | Pirrone, Roberto (University of Palermo) | Prakken, Henry (Utrecht University) | Samsonovich, Alexei (George Mason University) | Scott, Donia (Open University) | Souvenir, Richard (University of North Carolina at Charlotte)
The Association for the Advancement of Artificial Intelligence was pleased to present the 2009 Fall Symposium Series, held Thursday through Saturday, November 5–7, at the Westin Arlington Gateway in Arlington, Virginia. The Symposium Series was preceded on Wednesday, November 4 by a one-day AI funding seminar. The titles of the seven symposia were as follows: (1) Biologically Inspired Cognitive Architectures, (2) Cognitive and Metacognitive Educational Systems, (3) Complex Adaptive Systems and the Threshold Effect: Views from the Natural and Social Sciences, (4) Manifold Learning and Its Applications, (5) Multirepresentational Architectures for Human-Level Intelligence, (6) The Uses of Computational Argumentation, and (7) Virtual Healthcare Interaction.