AITopics

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)

Neural Information Processing SystemsFeb-14-2020, 23:57:32 GMT

Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts

Hein, Matthias, Setzer, Simon

Spectral clustering is based on the spectral relaxation of the normalized/ratio graph cut criterion. While the spectral relaxation is known to be loose, it has been shown recently that a non-linear eigenproblem yields a tight relaxation of the Cheeger cut. In this paper, we extend this result considerably by providing a characterization of all balanced graph cuts which allow for a tight relaxation. Although the resulting optimization problems are non-convex and non-smooth, we provide an efficient first-order scheme which scales to large graphs. Moreover, our approach comes with the quality guarantee that given any partition as initialization the algorithm either outputs a better partition or it stops immediately.

artificial intelligence, relaxation, tight relaxation, (3 more...)

Technology: Information Technology > Artificial Intelligence (0.56)

Neural Information Processing SystemsFeb-14-2020, 23:26:52 GMT

Sparse recovery by thresholded non-negative least squares

Slawski, Martin, Hein, Matthias

Non-negative data are commonly encountered in numerous fields, making non-negative least squares regression (NNLS) a frequently used tool. At least relative to its simplicity, it often performs rather well in practice. Serious doubts about its usefulness arise for modern high-dimensional linear models. Even in this setting - unlike first intuition may suggest - we show that for a broad class of designs, NNLS is resistant to overfitting and works excellently for sparse recovery when combined with thresholding, experimentally even outperforming L1-regularization. Since NNLS also circumvents the delicate choice of a regularization parameter, our findings suggest that NNLS may be the method of choice.

artificial intelligence, machine learning, sparse recovery, (2 more...)

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

Neural Information Processing SystemsFeb-14-2020, 18:26:28 GMT

The Total Variation on Hypergraphs - Learning on Hypergraphs Revisited

Hein, Matthias, Setzer, Simon, Jost, Leonardo, Rangapuram, Syama Sundar

Hypergraphs allow to encode higher-order relationships in data and are thus a very flexible modeling tool. Current learning methods are either based on approximations of the hypergraphs via graphs or on tensor methods which are only applicable under special conditions. In this paper we present a new learning framework on hypergraphs which fully uses the hypergraph structure. The key element is a family of regularization functionals based on the total variation on hypergraphs. Papers published at the Neural Information Processing Systems Conference.

artificial intelligence, hypergraph, machine learning, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.85)

Neural Information Processing SystemsFeb-14-2020, 08:41:43 GMT

Efficient Output Kernel Learning for Multiple Tasks

Jawanpuria, Pratik Kumar, Lapin, Maksim, Hein, Matthias, Schiele, Bernt

The paradigm of multi-task learning is that one can achieve better generalization by learning tasks jointly and thus exploiting the similarity between the tasks rather than learning them independently of each other. While previously the relationship between tasks had to be user-defined in the form of an output kernel, recent approaches jointly learn the tasks and the output kernel. As the output kernel is a positive semidefinite matrix, the resulting optimization problems are not scalable in the number of tasks as an eigendecomposition is required in each step. Using the theory of positive semidefinite kernels we show in this paper that for a certain class of regularizers on the output kernel, the constraint of being positive semidefinite can be dropped as it is automatically satisfied for the relaxed problem. This leads to an unconstrained dual problem which can be solved efficiently.

artificial intelligence, machine learning, output kernel, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningOct-14-2019

Confidence-Calibrated Adversarial Training: Towards Robust Models Generalizing Beyond the Attack Used During Training

Stutz, David, Hein, Matthias, Schiele, Bernt

Adversarial training is the standard to train models robust against adversarial examples. However, especially for complex datasets, adversarial training incurs a significant loss in accuracy and is known to generalize poorly to stronger attacks, e.g., larger perturbations or other threat models. In this paper, we introduce confidence-calibrated adversarial training (CCAT) where the key idea is to enforce that the confidence on adversarial examples decays with their distance to the attacked examples. We show that CCAT preserves better the accuracy of normal training while robustness against adversarial examples is achieved via confidence thresholding. Most importantly, in strong contrast to adversarial training, the robustness of CCAT generalizes to larger perturbations and other threat models, not encountered during training. We also discuss our extensive work to design strong adaptive attacks against CCAT and standard adversarial training which is of independent interest. We present experimental results on MNIST, SVHN and Cifar10.

adversarial example, deep learning, neural network, (19 more...)

1910.06259

Genre: Research Report (0.51)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Machine LearningSep-26-2019

Towards neural networks that provably know when they don't know

Meinke, Alexander, Hein, Matthias

It has recently been shown that ReLU networks produce arbitrarily over-confident predictions far away from the training data. Thus, ReLU networks do not know when they don't know. However, this is a highly important property in safety critical applications. In the context of out-of-distribution detection (OOD) there have been a number of proposals to mitigate this problem but none of them are able to make any mathematical guarantees. In this paper we propose a new approach to OOD which overcomes both problems. Our approach can be used with ReLU networks and provides provably low confidence predictions far away from the training data as well as the first certificates for low confidence predictions in a neighborhood of an out-distribution point. In the experiments we show that state-of-the-art methods fail in this worst-case setting whereas our model can guarantee its performance while retaining state-of-the-art OOD performance.

deep learning, neural network, training data, (20 more...)

1909.1218

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Machine LearningSep-11-2019

Sparse and Imperceivable Adversarial Attacks

Croce, Francesco, Hein, Matthias

Neural networks have been proven to be vulnerable to a variety of adversarial attacks. From a safety perspective, highly sparse adversarial attacks are particularly dangerous. On the other hand the pixelwise perturbations of sparse attacks are typically large and thus can be potentially detected. We propose a new black-box technique to craft adversarial examples aiming at minimizing $l_0$-distance to the original image. Extensive experiments show that our attack is better or competitive to the state of the art. Moreover, we can integrate additional bounds on the componentwise perturbation. Allowing pixels to change only in region of high variation and avoiding changes along axis-aligned edges makes our adversarial examples almost non-perceivable. Moreover, we adapt the Projected Gradient Descent attack to the $l_0$-norm integrating componentwise constraints. This allows us to do adversarial training to enhance the robustness of classifiers against sparse and imperceivable adversarial manipulations.

artificial intelligence, neural network, pixel, (20 more...)

1909.0504

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.81)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

arXiv.org Machine LearningJul-3-2019

Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

Croce, Francesco, Hein, Matthias

The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as the methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields high quality results already with one restart, minimizes the size of the perturbation, so that the robust accuracy can be evaluated at all possible thresholds with a single run, and comes with almost no free parameters except number of iterations and restarts. It achieves better or similar robust test accuracy compared to state-of-the-art attacks which are partially specialized to one $l_p$-norm.

accuracy, artificial intelligence, neural network, (19 more...)

1907.02044

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningJun-8-2019

Provably Robust Boosted Decision Stumps and Trees against Adversarial Attacks

Andriushchenko, Maksym, Hein, Matthias

The problem of adversarial samples has been studied extensively for neural networks. However, for boosting, in particular boosted decision trees and decision stumps there are almost no results, even though boosted decision trees, as e.g. XGBoost, are quite popular due to their interpretability and good prediction performance. We show in this paper that for boosted decision stumps the exact min-max optimal robust loss and test error for an $l_\infty$-attack can be computed in $O(n\,T\log T)$, where $T$ is the number of decision stumps and $n$ the number of data points, as well as an optimal update of the ensemble in $O(n^2\,T\log T)$. While not exact, we show how to optimize an upper bound on the robust loss for boosted trees. Up to our knowledge, these are the first algorithms directly optimizing provable robustness guarantees in the area of boosting. We make the code of all our experiments publicly available at https://github.com/max-andr/provably-robust-boosting

decision tree learning, deep learning, robustness, (18 more...)

1906.03526

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.50)
Health & Medicine > Therapeutic Area (0.47)
Government > Military (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)