Goto

Collaborating Authors

 Bayesian Learning


Adversarial Robustness of Flow-Based Generative Models

arXiv.org Machine Learning

Flow-based generative models leverage invertible generator functions to fit a distribution to the training data using maximum likelihood. Despite their use in several application domains, robustness of these models to adversarial attacks has hardly been explored. In this paper, we study adversarial robustness of flow-based generative models both theoretically (for some simple models) and empirically (for more complex ones). First, we consider a linear flow-based generative model and compute optimal sample-specific and universal adversarial perturbations that maximally decrease the likelihood scores. Using this result, we study the robustness of the well-known adversarial training procedure, where we characterize the fundamental trade-off between model robustness and accuracy. Next, we empirically study the robustness of two prominent deep, non-linear, flow-based generative models, namely GLOW and RealNVP. We design two types of adversarial attacks; one that minimizes the likelihood scores of in-distribution samples, while the other that maximizes the likelihood scores of out-of-distribution ones. We find that GLOW and RealNVP are extremely sensitive to both types of attacks. Finally, using a hybrid adversarial training procedure, we significantly boost the robustness of these generative models.


Information-Theoretic Local Minima Characterization and Regularization

arXiv.org Machine Learning

A BSTRACT Recent advances in deep learning theory have evoked the study of generalizabil-ity across different local minima of deep neural networks (DNNs). While current work focused on either discovering properties of good local minima or developing regularization techniques to induce good local minima, no approach exists that can tackle both problems. We achieve these two goals successfully in a unified manner. Specifically, based on the Fisher information we propose a metric both strongly indicative of generalizability of local minima and effectively applied as a practical regularizer. We provide theoretical analysis including a generalization bound and empirically demonstrate the success of our approach in both capturing and improving the generalizability of DNNs. Experiments are performed on CIFAR-10 and CIFAR-100 for various network architectures. 1 I NTRODUCTION Recently, there has been a surge in the interest of acquiring a theoretical understanding over deep neural network's behavior. Breakthroughs have been made in characterizing the optimization process, showing that learning algorithms such as stochastic gradient descent (SGD) tend to end up in one of the many local minima which have close-to-zero training loss (Choromanska et al., 2015; Dauphin et al., 2014; Kawaguchi, 2016; Nguyen & Hein, 2018; Du et al., 2018). It is, therefore, natural to ask two closely related questions: (a) What kind of local minima can generalize better? To our knowledge, existing work focused only on one of the two questions. For the "what" question, various definitions of "flatness/sharpness" have been introduced and analyzed (Keskar et al., 2017; Neyshabur et al., 2018; 2017; Wu et al., 2017; Liang et al., 2017).


Automatic Detection of Satire in Bangla Documents: A CNN Approach Based on Hybrid Feature Extraction Model

arXiv.org Artificial Intelligence

--Wide spread of satirical news in online communities is an ongoing trend. The nature of satires are so inherently ambiguous that sometimes it's too hard even for humans to understand whether it's actually satire or not. So, research interest has grown in this field. The purpose of this research is to detect Bangla satirical news spread in online news portals as well as social media. In this paper we propose a hybrid technique for extracting feature from text documents combining Word2V ecand TF-IDF. Using our proposed feature extraction technique, with standard CNN architecture we could detect whether a Bangla text document is satire or not with an accuracy of more than 96%. Satires can be considered as a literary form which involves a delicate balance between criticism and humor.



Consistent recovery threshold of hidden nearest neighbor graphs

arXiv.org Machine Learning

Jian Ding, Yihong Wu, Jiaming Xu, and Dana Yang November 20, 2019 Abstract Motivated by applications such as discovering strong ties in social networks and assembling genome subsequences in biology, we study the problem of recovering a hidden 2 k -nearest neighbor (NN) graph in an n -vertex complete graph, whose edge weights are independent and distributed according to P n for edges in the hidden 2 k -NN graph and Q n otherwise. We focus on two types of asymptotic recovery guarantees as n: (1) exact recovery: all edges are classified correctly with probability tending to one; (2) almost exact recovery: the expected number of misclassified edges is o (nk). We show that the maximum likelihood estimator achieves (1) exact recovery for 2 k n o(1) if lim inf 2ฮฑ n log n 1; (2) almost exact recovery for 1 k o null log n log log nnull if lim inf kD ( P n Q n) log n 1, where ฮฑ n null 2 log null dP ndQ n is the R enyi divergence of order 1 2 and D (P n Q n) is the Kullback-Leibler divergence.


Deep Detector Health Management under Adversarial Campaigns

arXiv.org Machine Learning

Machine learning models are vulnerable to adversarial inputs that induce seemingly unjustifiable errors. As automated classifiers are increasingly used in industrial control systems and machinery, these adversarial errors could grow to be a serious problem. Despite numerous studies over the past few years, the field of adversarial ML is still considered alchemy, with no practical unbroken defenses demonstrated to date, leaving PHM practitioners with few meaningful ways of addressing the problem. We introduce turbidity detection as a practical superset of the adversarial input detection problem, coping with adversarial campaigns rather than statistically invisible one-offs. This perspective is coupled with ROCtheoretic design guidance that prescribes an inexpensive domain adaptation layer at the output of a deep learning model during an attack campaign. The result aims to approximate the Bayes optimal mitigation that ameliorates the detection model's degraded health. A proactively reactive type of prognostics is achieved via Monte Carlo simulation of various adversarial campaign scenarios, by sampling from the model's own turbidity distribution to quickly deploy the correct mitigation during a real-world campaign. A machine learning application often begins with a dataset of examples and the task is to find a classification model that will turn inputs into class-label predictions, while preserving some sense of minimum expected error. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. But less obviously, it is often possible to determin-istically find input examples that force the model to misclas-sify (Szegedy et al., 2014).


Iterative Construction of Gaussian Process Surrogate Models for Bayesian Inference

arXiv.org Machine Learning

A new algorithm is developed to tackle the issue of sampling non-Gaussian model parameter posterior probability distributions that arise from solutions to Bayesian inverse problems. The algorithm aims to mitigate some of the hurdles faced by traditional Markov Chain Monte Carlo (MCMC) samplers, through constructing proposal probability densities that are both, easy to sample and that provide a better approximation to the target density than a simple Gaussian proposal distribution would. To achieve that, a Gaussian proposal distribution is augmented with a Gaussian Process (GP) surface that helps capture non-linearities in the log-likelihood function. In order to train the GP surface, an iterative approach is adopted for the optimal selection of points in parameter space. Optimality is sought by maximizing the information gain of the GP surface using a minimum number of forward model simulation runs. The accuracy of the GP-augmented surface approximation is assessed in two ways. The first consists of comparing predictions obtained from the approximate surface with those obtained through running the actual simulation model at hold-out points in parameter space. The second consists of a measure based on the relative variance of sample weights obtained from sampling the approximate posterior probability distribution of the model parameters. The efficacy of this new algorithm is tested on inferring reaction rate parameters in a 3-node and 6-node network toy problems, which imitate idealized reaction networks in combustion applications.


Causality-based Feature Selection: Methods and Evaluations

arXiv.org Artificial Intelligence

Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. It has been shown that the knowledge about the causal relationships between features and the class variable has potential benefits for building interpretable and robust prediction models, since causal relationships imply the underlying mechanism of a system. Consequently, causality-based feature selection has gradually attracted greater attentions and many algorithms have been proposed. In this paper, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in the research area and make it easy for the comparisons between new methods and existing ones, we develop the first open-source package, called CausalFS, which consists of most of the representative causality-based feature selection algorithms (available at https://github.com/kuiy/CausalFS). Using CausalFS, we conduct extensive experiments to compare the representative algorithms with both synthetic and real-world data sets. Finally, we discuss some challenging problems to be tackled in future causality-based feature selection research.


Causal inference using Bayesian non-parametric quasi-experimental design

arXiv.org Machine Learning

The de facto standard for causal inference is the randomized controlled trial, where one compares an manipulated group with a control group in order to determine the effect of an intervention. However, this research design is not always realistically possible due to pragmatic or ethical concerns. In these situations, quasi-experimental designs may provide a solution, as these allow for causal conclusions at the cost of additional design assumptions. In this paper, we provide a generic framework for quasi-experimental design using Bayesian model comparison, and we show how it can be used as an alternative to several common research designs. We provide a theoretical motivation for a Gaussian process based approach and demonstrate its convenient use in a number of simulations. Finally, we apply the framework to determine the effect of population-based thresholds for municipality funding in France, of the 2005 smoking ban in Sicily on the number of acute coronary events, and of the effect of an alleged historical phantom border in the Netherlands on Dutch voting behaviour.


List of supervised and unsupervised Machine Learning Algorithms

#artificialintelligence

There are a lot of machine learning algorithm available and among them there are some which are used mostly among the use in daily cases. Some of the Machine Learning algorithm are mentioned.