Overview
Horseshoe Regularization for Machine Learning in Complex and Deep Models
Bhadra, Anindya, Datta, Jyotishka, Li, Yunfan, Polson, Nicholas G.
Since the advent of the horseshoe priors for regularization, global-local shrinkage methods have proved to be a fertile ground for the development of Bayesian methodology in machine learning, specifically for high-dimensional regression and classification problems. They have achieved remarkable success in computation, and enjoy strong theoretical support. Most of the existing literature has focused on the linear Gaussian case; see Bhadra et al. (2019) for a systematic survey. The purpose of the current article is to demonstrate that the horseshoe regularization is useful far more broadly, by reviewing both methodological and computational developments in complex models that are more relevant to machine learning applications. Specifically, we focus on methodological challenges in horseshoe regularization in nonlinear and non-Gaussian models; multivariate models; and deep neural networks. We also outline the recent computational developments in horseshoe shrinkage for complex models along with a list of available software implementations that allows one to venture out beyond the comfort zone of the canonical linear regression problems.
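For orientation, the standard horseshoe hierarchy that the survey builds on can be written as below. This is the usual textbook form for the sparse normal-means setting, not a formulation quoted from the article; the symbols \beta_j, \lambda_j, \tau and \kappa_j are notation assumed here for illustration.

```latex
\begin{align*}
  \beta_j \mid \lambda_j, \tau &\sim \mathcal{N}\!\left(0,\; \lambda_j^2 \tau^2\right), \qquad j = 1, \dots, p, \\
  \lambda_j &\sim \mathrm{C}^{+}(0, 1) \quad \text{(local scale, one per coefficient)}, \\
  \tau &\sim \mathrm{C}^{+}(0, 1) \quad \text{(global scale, shared by all coefficients)}.
\end{align*}
% With unit noise variance, the shrinkage weight on the j-th observation is
\[
  \kappa_j = \frac{1}{1 + \lambda_j^2 \tau^2} \in (0, 1),
\]
% and for \tau = 1 it follows a Beta(1/2, 1/2) distribution, whose U-shaped
% density (mass near 0 and near 1) gives the prior its name.
```

The global scale pulls all coefficients toward zero, while the heavy-tailed local scales let individual large signals escape that shrinkage.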
A Robust Approach for Securing Audio Classification Against Adversarial Attacks
Esmaeilpour, Mohammad, Cardinal, Patrick, Koerich, Alessandro Lameiras
An adversarial audio attack can be considered a small perturbation, imperceptible to human ears, that is intentionally added to an audio signal and causes a machine learning model to make mistakes. This raises a security concern about machine learning models, since adversarial attacks can fool such models into making wrong predictions. In this paper we first review some strong adversarial attacks that may affect both audio signals and their 2D representations, and evaluate the resilience of the most common machine learning models, namely deep learning models and support vector machines (SVMs) trained on 2D audio representations such as the short-time Fourier transform (STFT), discrete wavelet transform (DWT) and cross recurrence plot (CRP), against several state-of-the-art adversarial attacks. Next, we propose a novel approach based on a pre-processed DWT representation of audio signals and an SVM to secure audio systems against adversarial attacks. The proposed architecture has several preprocessing modules for generating and enhancing spectrograms, including dimension reduction and smoothing. We extract features from small patches of the spectrograms using the speeded-up robust features (SURF) algorithm, and these features are then used to generate a codebook with the K-Means++ algorithm. Finally, the codewords are used to train an SVM on the codebook of SURF-generated vectors. Together, these steps yield a novel approach for audio classification that provides a good trade-off between accuracy and resilience. Experimental results on three environmental sound datasets show the competitive performance of the proposed approach compared to deep neural networks, both in terms of accuracy and robustness against strong adversarial attacks.
Big Math and the One-Brain Barrier: A Position Paper and Architecture Proposal
Carette, Jacques, Farmer, William M., Kohlhase, Michael, Rabe, Florian
Over the last decades, a class of important mathematical results has required an ever-increasing amount of human effort to carry out. For some, the help of computers is now indispensable. We analyze the implications of this trend towards "big mathematics", its relation to human cognition, and how machine support for big math can be organized. The central contribution of this position paper is an information model for "doing mathematics", which posits that humans very efficiently integrate four aspects (inference, computation, tabulation, and narration) around a well-organized core of mathematical knowledge. The challenge for mathematical software systems is that these four aspects need to be integrated as well. We briefly survey the state of the art.
Structural Self-adaptation for Decentralized Pervasive Intelligence
Nikolic, Jovan, Pournaras, Evangelos
Communication structure plays a key role in the learning capability of decentralized systems. Structural self-adaptation, by means of self-organization, changes the order as well as the input information of the agents' collective decision-making. This paper studies the role of agents' repositioning on the same communication structure, i.e. a tree, as the means to expand the learning capacity in complex combinatorial optimization problems, for instance, load-balancing power demand to prevent blackouts or efficient utilization of bike sharing stations. The optimality of structural self-adaptations is rigorously studied by constructing a novel large-scale benchmark that consists of 4000 agents with synthetic and real-world data performing 4 million structural self-adaptations during which almost 320 billion learning messages are exchanged. Based on this benchmark dataset, 124 deterministic structural criteria, applied as learning meta-features, are systematically evaluated as well as two online structural self-adaptation strategies designed to expand learning capacity. Experimental evaluation identifies metrics that capture agents with influential information and their optimal positioning. Significant gain in learning performance is observed for the two strategies especially under low-performing initialization. Strikingly, the strategy that triggers structural self-adaptation in a more exploratory fashion is the most cost-effective.
AI Weekly: Contrary to current fears, AI will create jobs and grow GDP
The inevitable march toward automation continues, analysts from the McKinsey Global Institute and from Tata Communications wrote in separate reports this week. Artificial intelligence's growth comes as no surprise -- a survey from Narrative Science and the National Business Research Institute conducted earlier this year found that 61 percent of businesses implemented AI in 2017, up from 38 percent in 2016 -- but this week's findings lay out in detail the likely socioeconomic impacts in the coming decade. The McKinsey models predict that 70 percent of companies will adopt at least one form of AI -- whether computer vision, natural language, virtual assistants, robotic process automation, or advanced machine learning -- by 2020. And Tata found unbridled enthusiasm among business leaders for an AI-dominated future; in a survey of 120 of them, 90 percent said they expect AI to enhance decision-making. McKinsey and Tata both contend that's a good thing.
PLUME: Polyhedral Learning Using Mixture of Experts
Shah, Kulin, Sastry, P. S., Manwani, Naresh
In this paper, we propose a novel mixture-of-experts architecture for learning polyhedral classifiers. We learn the parameters of the classifier using an expectation maximization algorithm. We derive the generalization bounds of the proposed approach. Through an extensive simulation study, we show that the proposed method performs comparably to other state-of-the-art approaches.
Derivative-Free Global Optimization Algorithms: Population based Methods and Random Search Approaches
In this paper, we provide an introduction to derivative-free optimization algorithms that can potentially be applied to train deep learning models. Existing deep learning model training is mostly based on the backpropagation algorithm, which updates the model variables layer by layer with the gradient descent algorithm or its variants. However, the objective functions of deep learning models are usually non-convex, and gradient descent algorithms based on first-order derivatives can easily get stuck in local optima. To resolve this problem, various local and global optimization algorithms have been proposed, which can greatly improve the training of deep learning models. Representative examples include Bayesian methods, the Shubert-Piyavskii algorithm, Direct, LIPO, MCS, GA, SCE, DE, PSO, ES, CMA-ES, hill climbing and simulated annealing. This is a follow-up paper to [18]; here we introduce the population-based optimization algorithms (e.g., GA, SCE, DE, PSO, ES and CMA-ES) and the random search algorithms (e.g., hill climbing and simulated annealing). For an introduction to the other derivative-free optimization algorithms, please refer to [18].
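As an illustration of the random search family mentioned in this abstract, here is a minimal sketch of simulated annealing on a one-dimensional objective. It is not taken from the paper: the function name `simulated_annealing`, the Gaussian proposal, the geometric cooling schedule and all default parameter values are assumptions chosen for brevity.

```python
import math
import random


def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.995, iters=5000, seed=0):
    """Minimize a scalar function f, starting from x0 (illustrative sketch)."""
    rng = random.Random(seed)      # local RNG so runs are reproducible
    x, fx = x0, f(x0)              # current state and its objective value
    best_x, best_fx = x, fx        # best state seen so far
    t = t0                         # temperature
    for _ in range(iters):
        cand = x + rng.gauss(0.0, step)   # random neighbour of the current state
        fc = f(cand)
        # Always accept improvements; accept a worse candidate with
        # probability exp(-(fc - fx) / t), which shrinks as t cools.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        t *= cooling               # geometric cooling (an assumed schedule)
    return best_x, best_fx


if __name__ == "__main__":
    # A non-convex test function with several local minima.
    objective = lambda x: x * x + 10.0 * math.sin(3.0 * x)
    print(simulated_annealing(objective, x0=4.0))
```

Setting the temperature close to zero from the start makes the acceptance rule reject almost every worse move, which turns the same loop into stochastic hill climbing.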
Derivative-Free Global Optimization Algorithms: Bayesian Method and Lipschitzian Approaches
In this paper, we provide an introduction to derivative-free optimization algorithms that can potentially be applied to train deep learning models. Existing deep learning model training is mostly based on the backpropagation algorithm, which updates the model variables layer by layer with the gradient descent algorithm or its variants. However, the objective functions of deep learning models are usually non-convex, and gradient descent algorithms based on first-order derivatives can easily get stuck in local optima. To resolve this problem, various local and global optimization algorithms have been proposed, which can greatly improve the training of deep learning models. Representative examples include Bayesian methods, the Shubert-Piyavskii algorithm, Direct, LIPO, MCS, GA, SCE, DE, PSO, ES, CMA-ES, hill climbing and simulated annealing. This paper introduces one part of these algorithms: the Bayesian method and the Lipschitzian approaches (e.g., the Shubert-Piyavskii algorithm, Direct, LIPO and MCS). The remaining algorithms, namely the population-based optimization algorithms (e.g., GA, SCE, DE, PSO, ES and CMA-ES) and the random search algorithms (e.g., hill climbing and simulated annealing), are introduced in detail in the follow-up paper [18].
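To make the Lipschitzian idea concrete, the following is a minimal one-dimensional sketch of the Shubert-Piyavskii scheme. It assumes a Lipschitz constant for the objective is known in advance. The function name `shubert_piyavskii` and its parameters are hypothetical, and the code is a plain illustration, not the paper's implementation.

```python
def shubert_piyavskii(f, a, b, lipschitz, iters=50):
    """Minimize f on [a, b] given a valid Lipschitz constant (illustrative sketch).

    Returns the sampled point (x, f(x)) with the smallest objective value.
    """
    pts = [(a, f(a)), (b, f(b))]   # sampled points, kept sorted by x
    for _ in range(iters):
        best = None
        for i in range(len(pts) - 1):
            (x1, y1), (x2, y2) = pts[i], pts[i + 1]
            # The cones y1 - L*(x - x1) and y2 - L*(x2 - x) bound f from below;
            # they intersect at x with the value computed as `bound`.
            x = 0.5 * (x1 + x2) + (y1 - y2) / (2.0 * lipschitz)
            bound = 0.5 * (y1 + y2) - 0.5 * lipschitz * (x2 - x1)
            if best is None or bound < best[0]:
                best = (bound, i, x)
        # Sample where the piecewise-linear lower bound is smallest.
        _, i, x = best
        pts.insert(i + 1, (x, f(x)))
    return min(pts, key=lambda p: p[1])


if __name__ == "__main__":
    # |x - 1| has Lipschitz constant 1 and its minimum at x = 1.
    print(shubert_piyavskii(lambda x: abs(x - 1.0), 0.0, 3.0, lipschitz=1.0, iters=5))
```

Unlike gradient descent, every new sample tightens a global lower bound on the objective, so the search cannot get trapped in a local optimum as long as the assumed constant really bounds the slope.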
The Seventh Answer Set Programming Competition: Design and Results
Gebser, Martin, Maratea, Marco, Ricca, Francesco
Answer Set Programming (ASP) is a prominent knowledge representation language with roots in logic programming and non-monotonic reasoning. Biennial ASP competitions are organized in order to furnish challenging benchmark collections and assess the advancement of the state of the art in ASP solving. In this paper, we report on the design and results of the Seventh ASP Competition, jointly organized by the University of Calabria (Italy), the University of Genova (Italy), and the University of Potsdam (Germany), in affiliation with the 14th International Conference on Logic Programming and Non-Monotonic Reasoning (LPNMR 2017). (Under consideration for acceptance in TPLP).