Overview
A Selective Survey on Versatile Knowledge Distillation Paradigm for Neural Network Models
Ku, Jeong-Hoe, Oh, JiHun, Lee, YoungYoon, Pooniwala, Gaurav, Lee, SangJeong
This paper aims to provide a selective survey of the knowledge distillation (KD) framework so that researchers and practitioners can take advantage of it when developing new optimized models in the deep neural network field. To this end, we give a brief overview of knowledge distillation and some related works, including learning using privileged information (LUPI) and generalized distillation (GD). Even though knowledge distillation based on the teacher-student architecture was initially devised as a model compression technique, it has found versatile applications over various frameworks. In this paper, we review the characteristics of knowledge distillation from the hypothesis that its three important ingredients are the distilled knowledge and loss, the teacher-student paradigm, and the distillation process. In addition, we survey the versatility of knowledge distillation by studying its direct applications and its usage in combination with other deep learning paradigms. Finally, we present some future works in knowledge distillation, including explainable knowledge distillation, where the performance gain is studied analytically, and self-supervised learning, which is a hot research topic in the deep learning community.
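To make the "distilled knowledge and loss" ingredient concrete, here is a minimal sketch of one common teacher-student objective: a temperature-softened KL term between teacher and student outputs, mixed with ordinary cross-entropy on the true label. The abstract does not specify any loss, so the formulation, the function names, and the `temperature` and `alpha` parameters are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Illustrative KD loss (assumed form, not taken from the survey):
    alpha * T^2 * KL(teacher || student) on softened outputs
    + (1 - alpha) * cross-entropy of the student on the hard label."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student_soft) if pt > 0)
    # Standard cross-entropy on the unsoftened student distribution.
    p_student = softmax(student_logits, 1.0)
    ce = -math.log(p_student[true_label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

teacher = [2.0, 1.0, 0.1]
loss_matching = distillation_loss(teacher, teacher, true_label=0)
loss_disagreeing = distillation_loss([0.1, 1.0, 2.0], teacher, true_label=0)
```

A student that reproduces the teacher's logits incurs only the cross-entropy part of the loss, while a student that ranks the classes differently is penalized by both terms.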
Persistent Reductions in Regularized Loss Minimization for Variable Selection
In the context of regularized loss minimization with polyhedral gauges, we show that, for a broad class of loss functions (possibly non-smooth and non-convex) and under a simple geometric condition on the input data, it is possible to efficiently identify a subset of features that are guaranteed to have zero coefficients in all optimal solutions of all problems with loss functions from said class, before any iterative optimization has been performed for the original problem. This procedure is standalone, takes only the data as input, and does not require any calls to the loss function. Therefore, we term this procedure a persistent reduction for the aforementioned class of regularized loss minimization problems. This reduction can be efficiently implemented via an extreme ray identification subroutine applied to a polyhedral cone formed from the data points. We employ an existing output-sensitive algorithm for extreme ray identification, which makes our guarantee and algorithm applicable in ultra-high dimensional problems.
A Brief Introduction to Edge Computing and Deep Learning
Welcome to my first blog on topics in artificial intelligence! Here I will introduce the topic of edge computing, with context in deep learning applications. This blog is largely adapted from a survey paper written by Xiaofei Wang et al.: Convergence of Edge Computing and Deep Learning: A Comprehensive Survey. If you're interested in learning more about any topic covered here, there are plenty of examples, figures, and explanations in the full 35-page survey: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8976180 Now, before we begin, I'd like to take a moment to motivate why edge computing and deep learning can be very powerful when combined: Deep learning is becoming an increasingly capable practice in machine learning that allows computers to detect objects, recognize speech, translate languages, and make decisions. More problems in machine learning are solved every day with the advanced techniques that researchers discover.
AI Weekly: The state of machine learning in 2020
It's hard to believe, but a year in which the unprecedented seemed to happen every day is just weeks from being over. In AI circles, the end of the calendar year means the rollout of annual reports aimed at defining progress, impact, and areas for improvement. The AI Index is due out in the coming weeks, as is CB Insights' assessment of global AI startup activity, but two reports -- both called The State of AI -- have already been released. Last week, McKinsey released its global survey on the state of AI, a report now in its third year. Interviews with executives and a survey of business respondents found a potential widening of the gap between businesses that apply AI and those that do not.
MetaGater: Fast Learning of Conditional Channel Gated Networks via Federated Meta-Learning
Lin, Sen, Yang, Li, He, Zhezhi, Fan, Deliang, Zhang, Junshan
While deep learning has achieved phenomenal successes in many AI applications, its enormous model size and intensive computation requirements pose a formidable challenge to deployment on resource-limited nodes. There has recently been increasing interest in computationally efficient learning methods, e.g., quantization, pruning and channel gating. However, most existing techniques cannot adapt to different tasks quickly. In this work, we advocate a holistic approach that jointly trains the backbone network and the channel gating, which enables dynamic selection of a subset of filters for more efficient local computation given the data input. In particular, we develop a federated meta-learning approach to jointly learn good meta-initializations for both backbone networks and gating modules, by making use of the model similarity across learning tasks on different nodes. In this way, the learnt meta-gating module effectively captures the important filters of a good meta-backbone network, based on which a task-specific conditional channel gated network can be quickly adapted, i.e., through one-step gradient descent, from the meta-initializations in a two-stage procedure using new samples of that task. The convergence of the proposed federated meta-learning algorithm is established under mild conditions. Experimental results corroborate the effectiveness of our method in comparison to related work.
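The fast-adaptation step described above (one gradient step from a meta-initialization, using a few new samples of the task) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a scalar linear model stands in for the backbone and gating modules, the meta-initialization `w_meta` is simply given rather than learnt by federated meta-learning, and the helper names and learning rate are hypothetical.

```python
def mse_and_grad(w, samples):
    """Mean squared error of the toy model y = w * x and its gradient w.r.t. w."""
    n = len(samples)
    loss = sum((w * x - y) ** 2 for x, y in samples) / n
    grad = sum(2 * (w * x - y) * x for x, y in samples) / n
    return loss, grad

def adapt_one_step(w_meta, samples, lr=0.1):
    """Task-specific adaptation: a single gradient step from the meta-initialization."""
    _, grad = mse_and_grad(w_meta, samples)
    return w_meta - lr * grad

# A new task whose true relation is y = 3x, observed through two samples.
task_samples = [(1.0, 3.0), (2.0, 6.0)]
w_meta = 2.5  # assumed meta-initialization, taken as given in this sketch

loss_before, _ = mse_and_grad(w_meta, task_samples)
w_task = adapt_one_step(w_meta, task_samples)
loss_after, _ = mse_and_grad(w_task, task_samples)
```

Because the meta-initialization already sits close to the task optimum, a single step lowers the task loss noticeably, which is the intuition behind learning a good shared starting point across nodes.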
A Survey on Data Pricing: from Economics to Data Science
How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, understand the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.
When Machine Learning Meets Privacy: A Survey and Outlook
The newly emerged machine learning (e.g. ...) ... Meanwhile, privacy has emerged as a big concern in this machine learning-based artificial intelligence era. It is important to note that the problem of privacy preservation in the context of machine learning is quite different from that in traditional data privacy protection, as machine learning can act as both friend and foe. Currently, the work on privacy preservation and machine learning (ML) is still in its infancy, as most existing solutions focus only on privacy problems during the machine learning process. Therefore, a comprehensive study of privacy preservation problems and machine learning is required. This paper surveys the state of the art in privacy issues and solutions for machine learning.
Meta-learning in natural and artificial intelligence
Humans are remarkable for continuously learning throughout the entirety of their lives, from acquiring physical reasoning and language skills at a young age [64, 43], to the ability to reason about the detailed complexities inherent in everyday adult life. One key quality of this learning is that it happens at multiple scales, both in terms of time and abstraction, in a process termed meta-learning or learning to learn. The fundamental principle of meta-learning is that learning proceeds faster with more experience, via the acquisition of inductive biases or knowledge that allows for more efficient learning in the future [66, 59, 57]. These favorable properties of meta-learning have recently gained it considerable renewed interest within the deep learning/artificial intelligence community. Despite their tremendous successes in recent years [46, 61], deep learning systems still require many orders of magnitude more data than humans [40, 12]. Although early work demonstrated the feasibility of neural networks discovering their own learning rules [10, 58], it is only recently that the field has experienced a resurgence of new research in meta-learning using deep neural networks. This has demonstrated the wide-ranging potential of neural networks to meta-learn all aspects of the learning process. Deep neural networks are typically trained via backpropagation, which adjusts the weights of the neural network so that, given a set of input data, the network outputs match some desired target outputs (e.g., classification labels).
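As a small illustration of training by backpropagation, the sketch below applies the chain rule to a two-weight network and repeatedly adjusts the weights so that the output approaches a target. The network shape (one tanh hidden unit, one linear output), the squared-error loss, the learning rate, and all names are assumptions chosen to keep the example tiny; they are not taken from the article.

```python
import math

def forward(w1, w2, x):
    """Toy network: hidden activation h = tanh(w1 * x), output y = w2 * h."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss_and_grads(w1, w2, x, target):
    """Squared error and its gradients, obtained by applying the chain rule backwards."""
    h, y = forward(w1, w2, x)
    loss = (y - target) ** 2
    dl_dy = 2 * (y - target)            # derivative of the loss w.r.t. the output
    dl_dw2 = dl_dy * h                  # output-layer weight
    dl_dh = dl_dy * w2                  # propagate back to the hidden activation
    dl_dw1 = dl_dh * (1 - h ** 2) * x   # tanh'(u) = 1 - tanh(u)^2
    return loss, dl_dw1, dl_dw2

def train(w1, w2, x, target, lr=0.1, steps=200):
    """Plain gradient descent on the two weights."""
    for _ in range(steps):
        _, g1, g2 = loss_and_grads(w1, w2, x, target)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2

initial_loss, _, _ = loss_and_grads(0.5, 0.5, 1.0, 0.5)
w1, w2 = train(0.5, 0.5, 1.0, 0.5)
final_loss, _, _ = loss_and_grads(w1, w2, 1.0, 0.5)
```

Each update moves the weights against the gradient of the loss, so the output drifts toward the target as training proceeds.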
Achievements and Challenges in Explaining Deep Learning based Computer-Aided Diagnosis Systems
Lucieri, Adriano, Bajwa, Muhammad Naseer, Dengel, Andreas, Ahmed, Sheraz
The remarkable success of modern image-based AI methods and the resulting interest in their applications in critical decision-making processes have led to a surge in efforts to make such intelligent systems transparent and explainable. The need for explainable AI does not stem only from ethical and moral grounds but also from stricter legislation around the world mandating clear and justifiable explanations of any decision taken or assisted by AI. Especially in the medical context, where Computer-Aided Diagnosis can have a direct influence on the treatment and well-being of patients, transparency is of utmost importance for a safe transition from lab research to real-world clinical practice. This paper provides a comprehensive overview of the current state of the art in explaining and interpreting Deep Learning based algorithms in applications of medical research and diagnosis of diseases. We discuss early achievements in the development of explainable AI for validation of known disease criteria, exploration of new potential biomarkers, as well as methods for the subsequent correction of AI models. Various explanation methods, such as visual, textual, post-hoc, ante-hoc, local and global, have been thoroughly and critically analyzed. Subsequently, we also highlight some of the remaining challenges that stand in the way of practical applications of AI as a clinical decision support tool and provide recommendations for the direction of future research.
Modular Structures and Atomic Decomposition in Ontologies
Del Vescovo, Chiara (BBC) | Horridge, Matthew (Stanford University) | Parsia, Bijan (University of Manchester) | Sattler, Uli (University of Manchester) | Schneider, Thomas (University of Bremen) | Zhao, Haoruo (University of Manchester)
With the growth of ontologies used in diverse application areas, the need for module extraction and modularisation techniques has risen. The notion of the modular structure of an ontology, which comprises a suitable set of base modules together with their logical dependencies, has the potential to help users and developers in comprehending, sharing, and maintaining an ontology. We have developed a new modular structure, called atomic decomposition (AD), which is based on modules that provide strong logical properties, such as locality-based modules. In this article, we present the theoretical foundations of AD, review its logical and computational properties, discuss its suitability as a modular structure, and report on an experimental evaluation of AD. In addition, we discuss the concept of a modular structure in ontology engineering and provide a survey of existing decomposition approaches.