Huang, Ruitong
Maximum Entropy Monte-Carlo Planning
Xiao, Chenjun, Huang, Ruitong, Mei, Jincheng, Schuurmans, Dale, Müller, Martin
We develop a new algorithm for online planning in large-scale sequential decision problems that improves upon the worst-case efficiency of UCT. The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation. To establish the effectiveness of this approach, we first investigate the single-step decision problem, stochastic softmax bandits, and show that softmax values can be estimated at an optimal convergence rate in terms of mean squared error. We then extend this approach to general sequential decision making by developing a general MCTS algorithm, Maximum Entropy for Tree Search (MENTS). We prove that the probability of MENTS failing to identify the best decision at the root decays exponentially, which fundamentally improves on the polynomial convergence rate of UCT.
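As a rough illustration of the softmax (log-sum-exp) value that the stochastic softmax bandit setting estimates, here is a minimal sketch in Python. The sampling rule, temperature, and arm means are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def softmax_value(q, tau):
    # tau * log-sum-exp of q / tau: the maximum-entropy "soft" maximum
    return tau * np.log(np.sum(np.exp(q / tau)))

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])  # assumed arm means
tau = 0.1                                # assumed temperature

counts = np.zeros(3)
sums = np.zeros(3)
for t in range(5000):
    q_hat = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    # sample from a softmax policy mixed with decaying uniform exploration
    probs = np.exp((q_hat - q_hat.max()) / tau)
    probs /= probs.sum()
    eps = 1.0 / np.sqrt(t + 1)
    probs = (1 - eps) * probs + eps / 3
    a = rng.choice(3, p=probs)
    counts[a] += 1
    sums[a] += true_means[a] + 0.1 * rng.standard_normal()

estimate = softmax_value(sums / counts, tau)
target = softmax_value(true_means, tau)
```

With enough samples, `estimate` approaches the softmax value of the true arm means, which the paper shows can be achieved at an optimal rate in mean squared error.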
On the Sensitivity of Adversarial Robustness to Input Data Distributions
Ding, Gavin Weiguang, Lui, Kry Yik Chau, Jin, Xiaomeng, Wang, Luyu, Huang, Ruitong
Neural networks are vulnerable to small adversarial perturbations. The existing literature has largely focused on understanding and mitigating the vulnerability of learned models. In this paper, we demonstrate an intriguing phenomenon about the most popular robust training method in the literature, adversarial training: adversarial robustness, unlike clean accuracy, is sensitive to the input data distribution. Even a semantics-preserving transformation of the input data distribution can cause significantly different robustness for an adversarially trained model that is both trained and evaluated on the new distribution. Our discovery of this sensitivity to the data distribution is based on a study that disentangles the behaviors of clean accuracy and robust accuracy of the Bayes classifier. Empirical investigations further confirm our finding. We construct semantically identical variants of MNIST and CIFAR10, and show that standardly trained models achieve comparable clean accuracies on them, while adversarially trained models achieve significantly different robust accuracies. This counter-intuitive phenomenon indicates that the input data distribution alone can affect the adversarial robustness of trained neural networks, not necessarily the tasks themselves. Lastly, we discuss the practical implications for evaluating adversarial robustness, and make initial attempts to understand this complex phenomenon.
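Semantically identical dataset variants of the kind described above can be built from simple pixel-level transformations. The `binarize` and `smooth` functions below are hypothetical stand-ins that merely illustrate what a semantics-preserving change of input distribution can look like:

```python
import numpy as np

def binarize(images, threshold=0.5):
    # Threshold pixel intensities to {0, 1}: a human still reads the same
    # digit, but the input distribution changes.
    return (images >= threshold).astype(np.float32)

def smooth(images, kernel=3):
    # Box-filter smoothing along both spatial axes (edge-padded): another
    # transform that preserves semantics while altering the distribution.
    pad = kernel // 2
    padded = np.pad(images, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(images, dtype=np.float32)
    for dy in range(kernel):
        for dx in range(kernel):
            out += padded[:, dy:dy + images.shape[1], dx:dx + images.shape[2]]
    return out / (kernel * kernel)

rng = np.random.default_rng(0)
batch = rng.random((4, 28, 28)).astype(np.float32)  # stand-in for MNIST images
binary_batch = binarize(batch)
smooth_batch = smooth(batch)
```

A model trained and evaluated on `binary_batch` sees a different input distribution than one trained on `batch`, even though the underlying task is unchanged.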
Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds
Lui, Kry, Ding, Gavin Weiguang, Huang, Ruitong, McCann, Robert
In this paper, we investigate dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR map can achieve perfect precision and perfect recall simultaneously; thus a continuous DR map must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure `how' wrong the retrieved data is. We therefore propose a new measure based on the Wasserstein distance that comes with a similar theoretical guarantee. A key technical step in our proofs is a particular optimization problem of the $L_2$-Wasserstein distance over a constrained set of distributions. We provide a complete solution to this optimization problem, which may be of independent technical interest.
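In one dimension, the $L_2$-Wasserstein distance between equal-size empirical distributions reduces to comparing sorted samples, which makes the kind of measure used here easy to illustrate. This is a generic sketch, not the paper's constrained optimization problem:

```python
import numpy as np

def w2_empirical_1d(x, y):
    # For equal-size 1-D empirical distributions, the L2-Wasserstein
    # distance is realized by the monotone (sorted-sample) matching.
    x, y = np.sort(x), np.sort(y)
    return float(np.sqrt(np.mean((x - y) ** 2)))

rng = np.random.default_rng(0)
a = rng.standard_normal(10000)
# Translating a distribution by c moves it exactly c in W2.
d = w2_empirical_1d(a, a + 2.0)
```

The translation example shows the key property that motivates Wasserstein-type measures: unlike precision, the distance reports *how far* the retrieved mass has moved, not just that it moved.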
Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training
Ding, Gavin Weiguang, Sharma, Yash, Lui, Kry Yik Chau, Huang, Ruitong
Despite their impressive performance on various learning tasks, neural networks have been shown to be vulnerable: an otherwise highly accurate network can be completely fooled by an artificially constructed perturbation imperceptible to human perception, known as an adversarial attack (Szegedy et al., 2013; Biggio et al., 2013). Not surprisingly, numerous algorithms for defending against adversarial attacks have been proposed in the literature, which, arguably, can be interpreted as different ways of increasing the margin, i.e. the smallest distance from a sample point to the decision boundary induced by the network; adversarial robustness is thus equivalent to having large margins. One type of algorithm uses regularization during learning to control the Lipschitz constant of the network (Cisse et al., 2017; Ross and Doshi-Velez, 2017; Hein and Andriushchenko, 2017; Tsuzuku et al., 2018), so that a sample point with small loss has a large margin, since the loss cannot increase too fast. If the Lipschitz constant is regularized at data points, the estimate is usually too local and inaccurate in a neighborhood; if it is controlled globally, the constraint on the model is often so strong that it harms accuracy. So far, such methods do not seem able to achieve very robust models. There are also efforts that use first-order approximations to estimate and maximize the input space margin (Elsayed et al., 2018; Sokolic et al., 2017; Matyasko and Chau, 2017). Similarly to local Lipschitz regularization, the reliance on local information may not provide accurate margin estimation or efficient maximization.
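The first-order margin estimate mentioned above divides the logit gap by the gradient norm. The sketch below shows the idea on a linear model, where the estimate is exact; the weights and input are illustrative assumptions:

```python
import numpy as np

def first_order_margin(w, b, x):
    # For a logit f(x) = w.x + b, the first-order estimate of the distance
    # from x to the decision boundary {f = 0} is |f(x)| / ||grad f(x)||.
    f = float(w @ x + b)
    grad = w  # gradient of a linear logit is just w
    return abs(f) / np.linalg.norm(grad)

w = np.array([3.0, 4.0])
b = -5.0
x = np.array([1.0, 2.0])
margin = first_order_margin(w, b, x)  # |3 + 8 - 5| / 5 = 1.2
```

For a deep network the gradient varies with `x`, so the same formula only approximates the true input-space margin, which is exactly the limitation the paragraph above points out.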
Few-Shot Self Reminder to Overcome Catastrophic Forgetting
Wen, Junfeng, Cao, Yanshuai, Huang, Ruitong
Deep neural networks are known to suffer from the catastrophic forgetting problem: they tend to forget knowledge from previous tasks when sequentially learning new tasks. Such failure hinders the application of deep learning based vision systems in continual learning settings. In this work, we present a simple yet surprisingly effective way of preventing catastrophic forgetting. Our method, called Few-shot Self Reminder (FSR), regularizes the neural net against changing its learned behaviour by performing logit matching on selected samples kept in episodic memory from the old tasks. Surprisingly, this simple approach only requires replaying a small amount of data from old tasks in order to outperform previous methods in knowledge retention. We demonstrate the superiority of our method over previous ones in two different continual learning settings on popular benchmarks, as well as on a new continual learning problem where tasks are designed to be more dissimilar.
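The logit matching described above can be sketched as a squared-error penalty between the network's current logits and the logits stored when the old task was learned. This is a hedged illustration of the regularizer's shape, not the paper's exact training objective:

```python
import numpy as np

def logit_matching_loss(current_logits, stored_logits):
    # Penalize deviation of the network's current logits from the logits
    # recorded for memory samples (squared L2, averaged over samples).
    diff = current_logits - stored_logits
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# stored logits for two memory samples, two classes (illustrative values)
stored = np.array([[2.0, -1.0], [0.5, 0.5]])
drifted = stored + 0.1          # network output after drifting on a new task
loss_drift = logit_matching_loss(drifted, stored)
loss_same = logit_matching_loss(stored, stored)
```

Adding this term (weighted) to the new task's loss pulls the network back toward its old behaviour on the remembered samples while leaving it free elsewhere.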
Improving GAN Training via Binarized Representation Entropy (BRE) Regularization
Cao, Yanshuai, Ding, Gavin Weiguang, Lui, Kry Yik-Chau, Huang, Ruitong
We propose a novel regularizer to improve the training of Generative Adversarial Networks (GANs). The motivation is that when the discriminator D spreads out its model capacity in the right way, the learning signals given to the generator G are more informative and diverse. These in turn help G to explore better and discover the real data manifold while avoiding large unstable jumps due to the erroneous extrapolation made by D. Our regularizer guides the rectifier discriminator D to better allocate its model capacity, by encouraging the binary activation patterns on selected internal layers of D to have a high joint entropy. Experimental results on both synthetic data and real datasets demonstrate improvements in stability and convergence speed of the GAN training, as well as higher sample quality. The approach also leads to higher classification accuracies in semi-supervised learning.
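A simplified proxy for encouraging high joint entropy of binary activation patterns: binarize activations to sign patterns, then penalize correlated patterns across samples. This is an illustrative stand-in, not the paper's exact BRE formula:

```python
import numpy as np

def bre_proxy(activations):
    # Binarize pre-ReLU activations to +/-1 sign patterns, then penalize
    # the average squared pairwise correlation between samples' patterns:
    # low correlation loosely corresponds to high joint entropy.
    s = np.sign(activations)      # (n_samples, n_units), entries in {-1, +1}
    n, d = s.shape
    c = (s @ s.T) / d             # pairwise pattern correlations
    off_diag = c - np.eye(n)      # ignore self-correlation
    return float(np.mean(off_diag ** 2))

rng = np.random.default_rng(0)
diverse = rng.standard_normal((8, 64))                 # varied patterns
collapsed = np.tile(rng.standard_normal(64), (8, 1))   # identical patterns
```

A discriminator whose activation patterns collapse (all samples share one pattern) incurs a large penalty, while diverse patterns incur almost none, which is the capacity-allocation behaviour the regularizer is meant to encourage.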
Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities
Huang, Ruitong, Lattimore, Tor, György, András, Szepesvari, Csaba
The follow the leader (FTL) algorithm, perhaps the simplest of all online learning algorithms, is known to perform well when the loss functions it is used on are positively curved. In this paper we ask whether there are other "lucky" settings in which FTL achieves sublinear, "small" regret. In particular, we study the fundamental problem of linear prediction over a non-empty convex, compact domain. Amongst other results, we prove that the curvature of the boundary of the domain can act as if the losses were curved: in this case, as long as the mean of the loss vectors has length bounded away from zero, FTL enjoys a logarithmic growth rate of regret, while, e.g., for polyhedral domains and stochastic data it enjoys finite expected regret. Building on a previously known meta-algorithm, we also obtain an algorithm that simultaneously enjoys the worst-case guarantees and the bounds available for FTL.
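For linear losses over the unit ball (a maximally curved domain), FTL has a closed form: play the unit vector opposite the cumulative loss. The sketch below, with assumed stochastic losses whose mean has nonzero length, illustrates the small-regret regime described above:

```python
import numpy as np

def ftl_unit_ball(loss_vectors):
    # FTL on linear losses over the unit ball: at each round, play the
    # minimizer of the cumulative linear loss seen so far, i.e. the unit
    # vector pointing opposite the running loss sum.
    total = np.zeros(loss_vectors.shape[1])
    cum_loss = 0.0
    for f in loss_vectors:
        norm = np.linalg.norm(total)
        w = -total / norm if norm > 0 else np.zeros_like(total)
        cum_loss += float(f @ w)
        total += f
    # subtract the loss of the best fixed point in hindsight, -||total||
    return cum_loss - (-np.linalg.norm(total))

rng = np.random.default_rng(0)
# stochastic losses with a mean of nonzero length: the condition under
# which the curved boundary keeps FTL's regret small
losses = np.array([1.0, 0.0]) + 0.3 * rng.standard_normal((2000, 2))
regret = ftl_unit_ball(losses)
```

Over 2000 rounds the regret stays far below the worst-case linear growth, consistent with the logarithmic rate the paper proves for this regime.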
Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions
Zhang, Xinhua, Yu, Yaoliang, White, Martha, Huang, Ruitong, Schuurmans, Dale (all University of Alberta)
Automated feature discovery is a fundamental problem in machine learning. Although classical feature discovery methods do not guarantee optimal solutions in general, it has been recently noted that certain subspace learning and sparse coding problems can be solved efficiently, provided the number of features is not restricted a priori. We provide an extended characterization of this optimality result and describe the nature of the solutions under an expanded set of practical contexts. In particular, we apply the framework to a semi-supervised learning problem, and demonstrate that feature discovery can co-occur with input reconstruction and supervised training while still admitting globally optimal solutions. A comparison to existing semi-supervised feature discovery methods shows improved generalization and efficiency.