Goto

Collaborating Authors

 Banff


On the Expected Complexity of Maxout Networks

arXiv.org Machine Learning

Learning with neural networks relies on the complexity of the representable functions, but more importantly, the particular assignment of typical parameters to functions of different complexity. Taking the number of activation regions as a complexity measure, recent works have shown that the practical complexity of deep ReLU networks is often far from the theoretical maximum. In this work we show that this phenomenon also occurs in networks with maxout (multi-argument) activation functions and when considering the decision boundaries in classification tasks. We also show that the parameter space has a multitude of full-dimensional regions with widely different complexity, and obtain nontrivial lower bounds on the expected complexity. Finally, we investigate different parameter initialization procedures and show that they can increase the speed of convergence in training.


Inconspicuous Adversarial Patches for Fooling Image Recognition Systems on Mobile Devices

arXiv.org Artificial Intelligence

Deep learning based image recognition systems have been widely deployed on mobile devices in today's world. In recent studies, however, deep learning models are shown vulnerable to adversarial examples. One variant of adversarial examples, called adversarial patch, draws researchers' attention due to its strong attack abilities. Though adversarial patches achieve high attack success rates, they are easily being detected because of the visual inconsistency between the patches and the original images. Besides, it usually requires a large amount of data for adversarial patch generation in the literature, which is computationally expensive and time-consuming. To tackle these challenges, we propose an approach to generate inconspicuous adversarial patches with one single image. In our approach, we first decide the patch locations basing on the perceptual sensitivity of victim models, then produce adversarial patches in a coarse-to-fine way by utilizing multiple-scale generators and discriminators. The patches are encouraged to be consistent with the background images with adversarial training while preserving strong attack abilities. Our approach shows the strong attack abilities in white-box settings and the excellent transferability in black-box settings through extensive experiments on various models with different architectures and training methods. Compared to other adversarial patches, our adversarial patches hold the most negligible risks to be detected and can evade human observations, which is supported by the illustrations of saliency maps and results of user evaluations. Lastly, we show that our adversarial patches can be applied in the physical world.


Universal Adversarial Perturbations for CNN Classifiers in EEG-Based BCIs

arXiv.org Artificial Intelligence

Multiple convolutional neural network (CNN) classifiers have been proposed for electroencephalogram (EEG) based brain-computer interfaces (BCIs). However, CNN models have been found vulnerable to universal adversarial perturbations (UAPs), which are small and example-independent, yet powerful enough to degrade the performance of a CNN model, when added to a benign example. This paper proposes a novel total loss minimization (TLM) approach to generate UAPs for EEG-based BCIs. Experimental results demonstrated the effectiveness of TLM on three popular CNN classifiers for both target and non-target attacks. We also verified the transferability of UAPs in EEG-based BCI systems. To our knowledge, this is the first study on UAPs of CNN classifiers in EEG-based BCIs. UAPs are easy to construct, and can attack BCIs in real-time, exposing a potentially critical security concern of BCIs.


Learning and Generalization in Overparameterized Normalizing Flows

arXiv.org Artificial Intelligence

In supervised learning, it is known that overparameterized neural networks with one hidden layer provably and efficiently learn and generalize, when trained using stochastic gradient descent with sufficiently small learning rate and suitable initialization. In contrast, the benefit of overparameterization in unsupervised learning is not well understood. Normalizing flows (NFs) constitute an important class of models in unsupervised learning for sampling and density estimation. In this paper, we theoretically and empirically analyze these models when the underlying neural network is one-hidden-layer overparameterized network. Our main contributions are two-fold: (1) On the one hand, we provide theoretical and empirical evidence that for a class of NFs containing most of the existing NF models, overparametrization hurts training. (2) On the other hand, we prove that unconstrained NFs, a recently introduced model, can efficiently learn any reasonable data distribution under minimal assumptions when the underlying network is overparametrized.


Message Passing in Graph Convolution Networks via Adaptive Filter Banks

arXiv.org Artificial Intelligence

Graph convolution networks, like message passing graph convolution networks (MPGCNs), have been a powerful tool in representation learning of networked data. However, when data is heterogeneous, most architectures are limited as they employ a single strategy to handle multi-channel graph signals and they typically focus on low-frequency information. In this paper, we present a novel graph convolution operator, termed BankGCN, which keeps benefits of message passing models, but extends their capabilities beyond `low-pass' features. It decomposes multi-channel signals on graphs into subspaces and handles particular information in each subspace with an adapted filter. The filters of all subspaces have different frequency responses and together form a filter bank. Furthermore, each filter in the spectral domain corresponds to a message passing scheme, and diverse schemes are implemented via the filter bank. Importantly, the filter bank and the signal decomposition are jointly learned to adapt to the spectral characteristics of data and to target applications. Furthermore, this is implemented almost without extra parameters in comparison with most existing MPGCNs. Experimental results show that the proposed convolution operator permits to achieve excellent performance in graph classification on a collection of benchmark graph datasets.


Boosting Randomized Smoothing with Variance Reduced Classifiers

arXiv.org Artificial Intelligence

Randomized Smoothing (RS) is a promising method for obtaining robustness certificates by evaluating a base model under noise. In this work we: (i) theoretically motivate why ensembles are a particularly suitable choice as base models for RS, and (ii) empirically confirm this choice, obtaining state of the art results in multiple settings. The key insight of our work is that the reduced variance of ensembles over the perturbations introduced in RS leads to significantly more consistent classifications for a given input, in turn leading to substantially increased certifiable radii for difficult samples. We also introduce key optimizations which enable an up to 50-fold decrease in sample complexity of RS, thus drastically reducing its computational overhead. Experimentally, we show that ensembles of only 3 to 10 classifiers consistently improve on the strongest single model with respect to their average certified radius (ACR) by 5% to 21% on both CIFAR-10 and ImageNet. On the latter, we achieve a state-of-the-art ACR of 1.11. We release all code and models required to reproduce our results upon publication.


AI-enabled Automation for Completeness Checking of Privacy Policies

arXiv.org Artificial Intelligence

Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of GDPR. Incomplete privacy policies might result in large fines on violating organization as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check them against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach detected 300 of the total of 334 violations of some completeness criteria correctly, while producing 23 false positives. The approach thus has a precision of 92.9% and recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.


Compositional Modeling of Nonlinear Dynamical Systems with ODE-based Random Features

arXiv.org Machine Learning

Effectively modeling phenomena present in highly nonlinear dynamical systems whilst also accurately quantifying uncertainty is a challenging task, which often requires problem-specific techniques. We present a novel, domain-agnostic approach to tackling this problem, using compositions of physics-informed random features, derived from ordinary differential equations. The architecture of our model leverages recent advances in approximate inference for deep Gaussian processes, such as layer-wise weight-space approximations which allow us to incorporate random Fourier features, and stochastic variational inference for approximate Bayesian inference. We provide evidence that our model is capable of capturing highly nonlinear behaviour in real-world multivariate time series data. In addition, we find that our approach achieves comparable performance to a number of other probabilistic models on benchmark regression tasks.


Synthesising Reinforcement Learning Policies through Set-Valued Inductive Rule Learning

arXiv.org Artificial Intelligence

Today's advanced Reinforcement Learning algorithms produce black-box policies, that are often difficult to interpret and trust for a person. We introduce a policy distilling algorithm, building on the CN2 rule mining algorithm, that distills the policy into a rule-based decision system. At the core of our approach is the fact that an RL process does not just learn a policy, a mapping from states to actions, but also produces extra meta-information, such as action values indicating the quality of alternative actions. This meta-information can indicate whether more than one action is near-optimal for a certain state. We extend CN2 to make it able to leverage knowledge about equally-good actions to distill the policy into fewer rules, increasing its interpretability by a person. Then, to ensure that the rules explain a valid, non-degenerate policy, we introduce a refinement algorithm that fine-tunes the rules to obtain good performance when executed in the environment. We demonstrate the applicability of our algorithm on the Mario AI benchmark, a complex task that requires modern reinforcement learning algorithms including neural networks. The explanations we produce capture the learned policy in only a few rules, that allow a person to understand what the black-box agent learned.


A general approach for Explanations in terms of Middle Level Features

arXiv.org Artificial Intelligence

Nowadays, it is growing interest to make Machine Learning (ML) systems more understandable and trusting to general users. Thus, generating explanations for ML system behaviours that are understandable to human beings is a central scientific and technological issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI). Recently, it is becoming more and more evident that new directions to create better explanations should take into account what a good explanation is to a human user, and consequently, develop XAI solutions able to provide user-centred explanations. This paper suggests taking advantage of developing an XAI general approach that allows producing explanations for an ML system behaviour in terms of different and user-selected input features, i.e., explanations composed of input properties that the human user can select according to his background knowledge and goals. To this end, we propose an XAI general approach which is able: 1) to construct explanations in terms of input features that represent more salient and understandable input properties for a user, which we call here Middle-Level input Features (MLFs), 2) to be applied to different types of MLFs. We experimentally tested our approach on two different datasets and using three different types of MLFs. The results seem encouraging.