Collaborating Authors

Schramowski, Patrick

Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content? Artificial Intelligence

Large datasets underlying much of current machine learning raise serious issues concerning inappropriate content such as offensive, insulting, threatening, or might otherwise cause anxiety. This calls for increased dataset documentation, e.g., using datasheets. They, among other topics, encourage to reflect on the composition of the datasets. So far, this documentation, however, is done manually and therefore can be tedious and error-prone, especially for large image datasets. Here we ask the arguably "circular" question of whether a machine can help us reflect on inappropriate content, answering Question 16 in Datasheets. To this end, we propose to use the information stored in pre-trained transformer models to assist us in the documentation process. Specifically, prompt-tuning based on a dataset of socio-moral values steers CLIP to identify potentially inappropriate content, therefore reducing human labor. We then document the inappropriate images found using word clouds, based on captions generated using a vision-language model. The documentations of two popular, large-scale computer vision datasets -- ImageNet and OpenImages -- produced this way suggest that machines can indeed help dataset creators to answer Question 16 on inappropriate image content.

Inferring Offensiveness In Images From Natural Language Supervision Artificial Intelligence

Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data. Unfortunately, these approaches also entail severe risks. In particular, large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images, and may also underrepresent specific classes. Consequently, there is an urgent need to carefully document datasets and curate their content. Unfortunately, this process is tedious and error-prone. We show that pre-trained transformers themselves provide a methodology for the automated curation of large-scale vision datasets. Based on human-annotated examples and the implicit knowledge of a CLIP based model, we demonstrate that one can select relevant prompts for rating the offensiveness of an image. Deep learning models yielded many improvements in several fields. Particularly, transfer learning from models pre-trained on large-scale supervised data has become common practice in many tasks both with and without sufficient data to train deep learning models. While approaches like semisupervised sequence learning (Dai & Le, 2015) and datasets such as ImageNet (Deng et al., 2009), especially the ImageNet-ILSVRC-2012 dataset with 1.2 million images, established pre-training approaches, in the following years, the training data size increased rapidly to billions of training examples (Brown et al., 2020; Jia et al., 2021), steadily improving the capabilities of deep models.

Interactively Generating Explanations for Transformer Language Models Artificial Intelligence

Transformer language models are state-of-the-art in a multitude of NLP tasks. Despite these successes, their opaqueness remains problematic. Recent methods aiming to provide interpretability and explainability to black-box models primarily focus on post-hoc explanations of (sometimes spurious) input-output correlations. Instead, we emphasize using prototype networks directly incorporated into the model architecture and hence explain the reasoning process behind the network's decisions. Moreover, while our architecture performs on par with several language models, it enables one to learn from user interactions. This not only offers a better understanding of language models but uses human capabilities to incorporate knowledge outside of the rigid range of purely data-driven approaches.

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations Artificial Intelligence

These "visual" explanations are often insufficient, as the model's actual concept remains elusive. Moreover, without insights into the model's semantic concept, it is difficult --if not impossible-- to intervene on the model's behavior via its explanations, called Explanatory Interactive Learning. Consequently, we propose to intervene on a Neuro-Symbolic scene representation, which allows one to revise the model on the semantic level, e.g. "never focus on the color to make your decision". We compiled a novel confounded visual scene data set, the CLEVR-Hans data set, capturing complex compositions of different objects. The results of our experiments on CLEVR-Hans demonstrate that our semantic explanations, i.e. Figure 1: Neuro-Symbolic explanations are needed to revise compositional explanations at a per-object level, can identify deep learning models from focusing on irrelevant features confounders that are not identifiable using "visual" explanations via global feedback rules.

Right for the Wrong Scientific Reasons: Revising Deep Networks by Interacting with their Explanations Artificial Intelligence

Right for the Wrong Scientific Reasons: Revising Deep Networks by Interacting with their Explanations Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Franziska Herbert, Xiaoting Shao, Hans-Georg Luigs Anne-Katrin Mahlein & Kristian Kersting Abstract Deep neural networks have shown excellent performances in many real-world applications. Unfortunately, they may show "Clever Hans"-like behavior--making use of confounding factors within datasets--to achieve high performance. In this work we introduce the novel learning setting of explanatory interactive learning (XIL) and illustrate its benefits on a plant phenotyping research task. XIL adds the scientist into the training loop such that she interactively revises the original model via providing feedback on its explanations. Our experimental results demonstrate that XIL can help avoiding Clever Hans moments in machine learning and encourages (or discourages, if appropriate) trust into the underlying model. Imagine a plant phenotyping team attempting to characterize crop resistance to plant pathogens. The plant physiologist records a larger amount of hyperspectral imaging data. Impressed by the results of deep learning in other scientific areas, she wants to establish similar results for phenotyping. Consequently, she asks a machine learning expert to apply deep learning to analyze the data. Luckily, the resulting predictive accuracy is very high. The plant physiologist, however, remains skeptical. The results are "too good, to be true". Checking the decision process of the deep model using explainable artificial intelligence (AI), the machine learning expert is flabbergasted to find that the learned deep model uses clues within the data that do not relate to the biological problem at hand, so-called confounding factors. The physiologist loses trust in AI and turns away from it, proclaiming it to be useless. Indeed, the seminal paper of Lapuschkin et al. [3] helps in "unmasking Clever Hans predictors and assessing what machines really learn". However, rather than proclaiming, as the plant physiologist might, that the machines have learned the right predictions for wrong reasons and can therefore not be trusted, we here showcase that interactions between the learning system and the human user can correct the model towards making the right predictions for the right reasons. This may also increase the trust in machine learning models.

BERT has a Moral Compass: Improvements of ethical and moral values of machines Artificial Intelligence

Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? Jentzsch et al.(2019) showed that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct by calculating a moral bias score on a sentence level using sentence embeddings. The machine learned that it is objectionable to kill living beings, but it is fine to kill time; It is essential to eat, yet one might not eat dirt; it is important to spread information, yet one should not spread misinformation. However, the evaluated moral bias was restricted to simple actions -- one verb -- and a ranking of actions with surrounding context. Recently BERT ---and variants such as RoBERTa and SBERT--- has set a new state-of-the-art performance for a wide range of NLP tasks. But has BERT also a better moral compass? In this paper, we discuss and show that this is indeed the case. Thus, recent improvements of language representations also improve the representation of the underlying ethical and moral values of the machine. We argue that through an advanced semantic representation of text, BERT allows one to get better insights of moral and ethical values implicitly represented in text. This enables the Moral Choice Machine (MCM) to extract more accurate imprints of moral choices and ethical values.

Neural Conditional Gradients Machine Learning

The move from hand-designed to learned optimizers in machine learning has been quite successful for gradient-based and -free optimizers. When facing a constrained problem, however, maintaining feasibility typically requires a projection step, which might be computationally expensive and not differentiable. We show how the design of projection-free convex optimization algorithms can be cast as a learning problem based on Frank-Wolfe Networks: recurrent networks implementing the Frank-Wolfe algorithm aka. conditional gradients. This allows them to learn to exploit structure when, e.g., optimizing over rank-1 matrices. Our LSTM-learned optimizers outperform hand-designed as well learned but unconstrained ones. We demonstrate this for training support vector machines and softmax classifiers.