AITopics

Flow-based generative models leverage invertible generator functions to fit a distribution to the training data using maximum likelihood. Despite their use in several application domains, the robustness of these models to adversarial attacks has hardly been explored. In this paper, we study the adversarial robustness of flow-based generative models both theoretically (for some simple models) and empirically (for more complex ones). First, we consider a linear flow-based generative model and compute optimal sample-specific and universal adversarial perturbations that maximally decrease the likelihood scores. Using this result, we study the robustness of the well-known adversarial training procedure, where we characterize the fundamental trade-off between model robustness and accuracy. Next, we empirically study the robustness of two prominent deep, non-linear, flow-based generative models, namely GLOW and RealNVP. We design two types of adversarial attacks: one that minimizes the likelihood scores of in-distribution samples, and another that maximizes the likelihood scores of out-of-distribution ones. We find that GLOW and RealNVP are extremely sensitive to both types of attacks. Finally, using a hybrid adversarial training procedure, we significantly boost the robustness of these generative models.

adversarial training, generative model, robustness

1911.08654

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Vancouver (0.04)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (0.57)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Yakura, Hiromu, Akimoto, Youhei, Sakuma, Jun

Generate (non-software) Bugs to Fool Classifiers

Let us consider a scenario in which an attacker wishes to modify an input image $x$ so that the target model $f$ classifies it with a specific label $t$. The generation process can be represented as follows: $\hat{v} = \arg\min_{v} L_f(x + v, t) + \|v\|$ (1), where $L_f$ denotes a loss function that represents how distant the input data are from the given label under $f$, and $\|v\|$ is a norm function that regularizes the perturbation so that $v$ becomes unnoticeable to humans. Then, $x + \hat{v}$ is expected to form an adversarial example that is classified as $t$ while it looks similar to $x$. Earlier approaches, such as Szegedy et al. (2014) and Moosavi-Dezfooli, Fawzi, and Frossard (2016), used the $L_2$-norm to limit the magnitude of the perturbation. In contrast, Su, Vargas, and Sakurai (2017) used the $L_0$-norm to limit the number of modified pixels and showed that even modifying a single pixel could generate adversarial examples. More recent studies introduced GANs instead of directly optimizing perturbations (Xiao et al. 2018; Zhao, Dua, and Singh 2018) to ensure the naturalness of adversarial examples. For example, Xiao et al. (2018) trained a discriminator network to distinguish adversarial examples from natural images so that the generator network produced adversarial examples that appeared as natural images. Given the distribution $p_x$ over the natural images and the tradeoff parameter $\alpha$, its training process can be represented similarly to that in Goodfellow et al. (2014) as follows: $\min_G \max_D \; \mathbb{E}_{x \sim p_x}[\log D(x)] + \mathbb{E}_{x \sim p_x}[\log(1 - D(x + G(x)))] + \alpha \, \mathbb{E}_{x \sim p_x}[L_f(x + G(x), t)]$.
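The objective in (1) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the target model is a toy linear softmax classifier with weights `W`, the loss is the cross-entropy towards the target label, the norm term is a squared L2 penalty weighted by a hypothetical coefficient `lam`, and the perturbation is found by plain gradient descent.

```python
import math

def logits(W, x):
    """Linear classifier scores: one dot product per class row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(s - m) for s in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict(W, x):
    """Index of the highest-scoring class."""
    z = logits(W, x)
    return z.index(max(z))

def targeted_perturbation(W, x, t, lam=0.05, lr=0.1, steps=500):
    """Gradient descent on v for cross_entropy(f(x + v), t) + lam * ||v||_2^2.

    lam, lr and steps are illustrative values, not values from the paper.
    """
    v = [0.0] * len(x)
    for _ in range(steps):
        p = softmax(logits(W, [xi + vi for xi, vi in zip(x, v)]))
        # Gradient of the cross-entropy w.r.t. the logits is p - onehot(t).
        err = [pk - (1.0 if k == t else 0.0) for k, pk in enumerate(p)]
        # Chain rule through the linear layer, plus the gradient of the penalty.
        grad = [sum(W[k][j] * err[k] for k in range(len(W))) + 2.0 * lam * v[j]
                for j in range(len(x))]
        v = [vj - lr * gj for vj, gj in zip(v, grad)]
    return v

W = [[1.0, 0.0], [0.0, 1.0]]   # toy 2-class model (hypothetical)
x = [2.0, 0.0]                 # clean input, classified as class 0
v_hat = targeted_perturbation(W, x, t=1)
x_adv = [xi + vi for xi, vi in zip(x, v_hat)]
label_before = predict(W, x)
label_after = predict(W, x_adv)
```

In this toy setup the predicted label moves from 0 to the target label 1, while the penalty term keeps the perturbation from growing without bound.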

adversarial example, perturbation, proceedings

1911.08644

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > California > San Diego County > La Jolla (0.05)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)

Logic-inspired Deep Neural Networks

Le, Minh

Deep neural networks have achieved impressive performance and become the de facto standard in many tasks. However, phenomena such as adversarial examples and fooling examples hint that the generalization they make is flawed. We argue that the problem is rooted in their distributed and connected nature and propose remedies inspired by propositional logic. Our experiments show that the proposed models are more local and better at resisting fooling and adversarial examples. By means of an ablation analysis, we reveal insights into adversarial examples and suggest a new hypothesis on their origins.

adversarial example, neural network, robustness

1911.08635

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pang, Guansong, Shen, Chunhua, Hengel, Anton van den

Deep Anomaly Detection with Deviation Networks

Although deep learning has been applied to successfully address many data mining problems, relatively limited work has been done on deep learning for anomaly detection. Existing deep anomaly detection methods, which focus on learning new feature representations to enable downstream anomaly detection methods, perform indirect optimization of anomaly scores, leading to data-inefficient learning and suboptimal anomaly scoring. Also, they are typically designed as unsupervised learning due to the lack of large-scale labeled anomaly data. As a result, they have difficulty leveraging prior knowledge (e.g., a few labeled anomalies) when such information is available, as it is in many real-world anomaly detection applications. This paper introduces a novel anomaly detection framework and its instantiation to address these problems. Instead of representation learning, our method performs end-to-end learning of anomaly scores through neural deviation learning, in which we leverage a few (e.g., several to dozens) labeled anomalies and a prior probability to enforce statistically significant deviations of the anomaly scores of anomalies from those of normal data objects in the upper tail. Extensive results show that our method can be trained substantially more data-efficiently and achieves significantly better anomaly scoring than state-of-the-art competing methods.
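One plausible reading of "deviation learning" can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: reference scores are drawn from a standard Gaussian prior, the deviation is a z-score against those reference scores, normal objects are pulled towards zero deviation, and labeled anomalies are pushed at least `margin` standard deviations into the upper tail. The function name, the margin of 5.0, and the 5000 reference samples are hypothetical choices.

```python
import random
import statistics

def deviation_loss(score, label, ref_scores, margin=5.0):
    """Z-score style deviation loss (illustrative sketch).

    score      -- anomaly score produced by some scoring network (assumed)
    label      -- 1 for a labeled anomaly, 0 for a (presumed) normal object
    ref_scores -- reference scores sampled from the prior
    margin     -- how many standard deviations anomalies should exceed
    """
    mu = statistics.mean(ref_scores)
    sigma = statistics.stdev(ref_scores)
    dev = (score - mu) / sigma
    if label == 1:
        # Anomalies are penalised until they sit `margin` deviations above the mean.
        return max(0.0, margin - dev)
    # Normal objects are penalised for any deviation from the reference mean.
    return abs(dev)

rng = random.Random(0)
ref_scores = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # Gaussian prior (assumption)

loss_normal = deviation_loss(0.0, 0, ref_scores)        # normal object scored near the mean
loss_anomaly_low = deviation_loss(0.0, 1, ref_scores)   # anomaly scored like a normal object
loss_anomaly_high = deviation_loss(6.0, 1, ref_scores)  # anomaly scored far in the upper tail
```

Under this sketch, an anomaly that receives a score typical of normal data incurs a large loss, whereas one scored well into the upper tail incurs none.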

anomaly, anomaly score, devnet

1911.08623

Country:

Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > Alaska > Anchorage Municipality > Anchorage (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Jain, Ayush, Orlitsky, Alon

Robust Learning of Discrete Distributions from Batches

Let $d$ be the lowest $L_1$ distance to which a $k$-symbol distribution $p$ can be estimated from $m$ batches of $n$ samples each, when up to $\beta m$ batches may be adversarial. For $\beta<1/2$, Qiao and Valiant (2017) showed that $d=\Omega(\beta/\sqrt{n})$ and requires $m=\Omega(k/\beta^2)$ batches. For $\beta<1/900$, they provided a $d$ and $m$ order-optimal algorithm that runs in time exponential in $k$. For $\beta<0.5$, we propose an algorithm with comparably optimal $d$ and $m$, but run-time polynomial in $k$ and all other parameters.

algorithm, batch, good batch

1911.08532

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Gromov-Wasserstein Factorization Models for Graph Clustering

Xu, Hongteng

We propose a new nonlinear factorization model for graphs with topological structures and, optionally, node attributes. This model is based on a pseudometric called Gromov-Wasserstein (GW) discrepancy, which compares graphs in a relational way. It estimates observed graphs as GW barycenters constructed from a set of atoms with different weights. By minimizing the GW discrepancy between each observed graph and its GW barycenter-based estimation, we learn the atoms and their weights associated with the observed graphs. The model achieves a novel and flexible factorization mechanism under GW discrepancy, in which both the observed graphs and the learnable atoms can be unaligned and of different sizes. We design an effective approximate algorithm for learning this Gromov-Wasserstein factorization (GWF) model, unrolling loopy computations as stacked modules and computing gradients with backpropagation. The stacked modules can take two different architectures, which correspond to the proximal point algorithm (PPA) and the Bregman alternating direction method of multipliers (BADMM), respectively. Experiments show that our model obtains encouraging results on clustering graphs.
Introduction: As an important methodology for machine learning, factorization models explicitly explore the intrinsic structures of high-dimensional observations and have been widely used in many learning tasks, e.g., data clustering (Ng, Jordan, and Weiss 2002), dimensionality reduction (Candès et al. 2011), recommendation systems (Wang and Blei 2011), etc. In particular, factorization models decompose high-dimensional observations into a set of atoms under specific criteria and achieve their latent representations accordingly.

graph, gw discrepancy, module

1911.0853

Country:

Asia > Middle East > Jordan (0.24)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

CASTER: Predicting Drug Interactions with Chemical Substructure Representation

Huang, Kexin, Xiao, Cao, Hoang, Trong Nghia, Glass, Lucas M., Sun, Jimeng

Adverse drug-drug interactions (DDIs) remain a leading cause of morbidity and mortality. Identifying potential DDIs during the drug design process is critical for patients and society. Although several computational models have been proposed for DDI prediction, there are still limitations: (1) specialized design of drug representation for DDI predictions is lacking; (2) predictions are based on limited labelled data and do not generalize well to unseen drugs or DDIs; and (3) models are characterized by a large number of parameters and are thus hard to interpret. In this work, we develop a ChemicAl SubstrucTurE Representation (CASTER) framework that predicts DDIs given chemical structures of drugs. CASTER aims to mitigate these limitations via (1) a sequential pattern mining module rooted in the DDI mechanism to efficiently characterize functional substructures of drugs; (2) an auto-encoding module that leverages both labelled and unlabelled chemical structure data to improve predictive accuracy and generalizability; and (3) a dictionary learning module that explains the prediction via a small set of coefficients which measure the relevance of each input substructure to the DDI outcome. We evaluated CASTER on two real-world DDI datasets and showed that it performed better than state-of-the-art baselines and provided interpretable predictions.
1 Introduction: Adverse drug-drug interactions (DDIs) are caused by pharmacological interactions of drugs. They result in substantial morbidity and mortality, and incur huge medical costs (Giacomini et al. 2007; Onakpoya, Heneghan, and Aronson 2016).

prediction, representation, substructure

1911.06446

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report > Experimental Study (0.54)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning internal representations

Baxter, Jonathan

Probably the most important problem in machine learning is the preliminary biasing of a learner's hypothesis space so that it is small enough to ensure good generalisation from reasonable training sets, yet large enough that it contains a good solution to the problem being learnt. In this paper a mechanism for automatically learning or biasing the learner's hypothesis space is introduced. It works by first learning an appropriate internal representation for a learning environment and then using that representation to bias the learner's hypothesis space for the learning of future tasks drawn from the same environment. An internal representation must be learnt by sampling from many similar tasks, not just a single task as occurs in ordinary machine learning. It is proved that the number of examples $m$ per task required to ensure good generalisation from a representation learner obeys $m = O(a+b/n)$, where $n$ is the number of tasks being learnt and $a$ and $b$ are constants. If the tasks are learnt independently (i.e., without a common representation) then $m=O(a+b)$. It is argued that for learning environments such as speech and character recognition $b\gg a$, and hence representation learning in these environments can potentially yield a drastic reduction in the number of examples required per task. It is also proved that if $n = O(b)$ (with $m=O(a+b/n)$) then the representation learnt will be good for learning novel tasks from the same environment, and that the number of examples required to generalise well on a novel task will be reduced to $O(a)$ (as opposed to $O(a+b)$ if no representation is used). It is shown that gradient descent can be used to train neural network representations, and experimental results are reported providing strong qualitative support for the theoretical results.
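To make the scaling in $m = O(a+b/n)$ concrete, the sketch below evaluates the bound with the hidden constant taken to be 1. The values of `a` and `b` are made up purely for illustration (chosen so that $b \gg a$); the abstract gives no numbers, and the function names are hypothetical.

```python
def examples_per_task_shared(a, b, n):
    """Per-task examples when n tasks share one representation: a + b / n.

    Big-O constants are taken as 1, which is an illustrative simplification.
    """
    return a + b / n

def examples_per_task_independent(a, b):
    """Per-task examples when each task is learnt on its own: a + b."""
    return a + b

a, b = 10, 1000  # hypothetical constants with b >> a

independent = examples_per_task_independent(a, b)        # 1010
shared_1_task = examples_per_task_shared(a, b, n=1)      # same as independent
shared_100_tasks = examples_per_task_shared(a, b, n=100)
```

With these made-up constants, sharing a representation across 100 tasks brings the per-task requirement from 1010 down to 20, and it approaches `a` as `n` grows, matching the $O(a)$ cost quoted for a novel task.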

generalisation, learner, representation

doi: 10.1145/225298.225336

1911.05781

Country:

Oceania > Australia > South Australia (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Industry: Education (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Deep Unsupervised Clustering with Clustered Generator Model

Zhu, Dandan, Han, Tian, Zhou, Linqi, Yang, Xiaokang, Wu, Ying Nian

However, unsupervised clustering remains one of the most fundamental challenges in machine learning because of the high dimensionality of data and the high complexity of their hidden structures. Long-established approaches for unsupervised clustering, including K-means [15] and the Gaussian Mixture Model (GMM) [3], are still the building blocks for numerous applications due to their efficiency and simplicity. However, their distance metrics are limited to data space, making them ineffective for high-dimensional data such as images. Therefore, considerable efforts have been put into obtaining a good feature embedding of data, usually of low dimensionality, for effective clustering [37]. However, the representation obtained by standalone data embedding typically cannot capture the latent structure and variation of the observed data, and may therefore be ineffective for clustering. We believe a good representation for clustering should also be able to compactly represent the observed data distribution to encode all necessary characteristics of the observation. Deep generative models (a.k.a. generator models) have shown great promise in learning latent representations for high-dimensional signals such as images and videos [32, 24, 11]. Generator models parameterized by deep neural networks specify a nonlinear mapping from latent variables to observed data. (Tian Han is the corresponding author.)

dataset, generator model, representation

1911.08459

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany > Brandenburg > Potsdam (0.06)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Chokkadi, Sukhada, S, Sannidhan M, B, Sudeepa K, Bhandary, Abhir

A Study on various state of the art of the Art Face Recognition System using Deep Learning Techniques

ABSTRACT Considering the very large data repositories now available and access to highly advanced hardware, systems for facial identification have evolved enormously over the past few decades. Sketch recognition is one of the most important areas to have evolved, becoming an integral component adopted by law enforcement agencies in current forensic science. Matching derived sketches to photographic face images is a difficult task, as the sketches are produced from the verbal description given by an eyewitness of the crime scene and, owing to natural human error, may lack sensitive elements that are present in the photograph. A substantial amount of the research in this area until recently used recognition systems based on traditional extraction and classification models. Very recently, however, a few research works have focused on using deep learning techniques to take advantage of learned models for feature extraction and classification and so overcome potential domain challenges. The first part of this review paper focuses on deep learning techniques used in face recognition and matching, which have improved the accuracy of face recognition through training on huge data sets. This paper also includes a survey of different techniques used to match composite sketches to human images, including the component-based representation approach, the automatic composite sketch recognition technique, etc.
INTRODUCTION According to prior research, a complete face recognition system includes two patterns of face detection and face recognition: 1) structural similarity and 2) individual local differences of human faces. Therefore, it is required to extract the features of the face through the face detection process. The evolution of face recognition is due to its technical challenges and huge potential application in video surveillance, identity authorization, multimedia applications, home and office security, law enforcement, and different human-computer interaction activities. Facial recognition technology (FRT) is one of the most controversial new tools. It was first developed in the 1960s.

face recognition, recognition, sketch

doi: 10.30534/ijatcse/2019/84842019

1911.08426

Country:

Asia > India (0.04)
North America > United States (0.04)
Europe > Finland > Northern Ostrobothnia > Oulu (0.04)

Genre:

Overview (0.88)
Research Report > Promising Solution (0.68)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)