Collaborating Authors


Latent gaze information in highly dynamic decision-tasks Artificial Intelligence

Digitization is penetrating more and more areas of life. Tasks are increasingly being completed digitally, and are therefore not only fulfilled faster, more efficiently but also more purposefully and successfully. The rapid developments in the field of artificial intelligence in recent years have played a major role in this, as they brought up many helpful approaches to build on. At the same time, the eyes, their movements, and the meaning of these movements are being progressively researched. The combination of these developments has led to exciting approaches. In this dissertation, I present some of these approaches which I worked on during my Ph.D. First, I provide insight into the development of models that use artificial intelligence to connect eye movements with visual expertise. This is demonstrated for two domains or rather groups of people: athletes in decision-making actions and surgeons in arthroscopic procedures. The resulting models can be considered as digital diagnostic models for automatic expertise recognition. Furthermore, I show approaches that investigate the transferability of eye movement patterns to different expertise domains and subsequently, important aspects of techniques for generalization. Finally, I address the temporal detection of confusion based on eye movement data. The results suggest the use of the resulting model as a clock signal for possible digital assistance options in the training of young professionals. An interesting aspect of my research is that I was able to draw on very valuable data from DFB youth elite athletes as well as on long-standing experts in arthroscopy. In particular, the work with the DFB data attracted the interest of radio and print media, namely DeutschlandFunk Nova and SWR DasDing. All resulting articles presented here have been published in internationally renowned journals or at conferences.

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection Artificial Intelligence

Language models increasingly rely on massive web dumps for diverse text data. However, these sources are rife with undesirable content. As such, resources like Wikipedia, books, and newswire often serve as anchors for automatically selecting web text most suitable for language modeling, a process typically referred to as quality filtering. Using a new dataset of U.S. high school newspaper articles -- written by students from across the country -- we investigate whose language is preferred by the quality filter used for GPT-3. We find that newspapers from larger schools, located in wealthier, educated, and urban ZIP codes are more likely to be classified as high quality. We then demonstrate that the filter's measurement of quality is unaligned with other sensible metrics, such as factuality or literary acclaim. We argue that privileging any corpus as high quality entails a language ideology, and more care is needed to construct training corpora for language models, with better transparency and justification for the inclusion or exclusion of various texts.

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019 Artificial Intelligence

This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (, the open-sourced code of the winners, and a free "AutoDL self-service".

Time Series Forecasting Using Fuzzy Cognitive Maps: A Survey Artificial Intelligence

Increasing complexity comes from some factors including uncertainty, ambiguity, inconsistency, multiple dimensionalities, increasing the number of effective factors and relation between them. Some of these features are common among most real-world problems which are considered complex and dynamic problems. In other words, since the data and relations in real world applications are usually highly complex and inaccurate, modeling real complex systems based on observed data is a challenging task especially for large scale, inaccurate and non stationary datasets. Therefore, to cover and address these difficulties, the existence of a computational system with the capability of extracting knowledge from the complex system with the ability to simulate its behavior is essential. In other words, it is needed to find a robust approach and solution to handle real complex problems in an easy and meaningful way [1]. Hard computing methods depend on quantitative values with expensive solutions and lack of ability to represent the problem in real life due to some uncertainties. In contrast, soft computing approaches act as alternative tools to deal with the reasoning of complex problems [2]. Using soft computing methods such as fuzzy logic, neural network, genetic algorithms or a combination of these allows achieving robustness, tractable and more practical solutions. Generally, two types of methods are used for analyzing and modeling dynamic systems including quantitative and qualitative approaches.

Challenges of Artificial Intelligence -- From Machine Learning and Computer Vision to Emotional Intelligence Artificial Intelligence

Artificial intelligence (AI) has become a part of everyday conversation and our lives. It is considered as the new electricity that is revolutionizing the world. AI is heavily invested in both industry and academy. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results in many problems, but its limits are already visible. AI has been under research since the 1940s, and the industry has seen many ups and downs due to over-expectations and related disappointments that have followed. The purpose of this book is to give a realistic picture of AI, its history, its potential and limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common representations for AI, methods, and machine learning are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision, and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but little use has been made in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such achievement seems impossible on the basis of present knowledge,and how AI could be improved. Finally, a summary is made of the current state of AI and what to do in the future. In the appendix, we look at the development of AI education, especially from the perspective of contents at our own university.

Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies Machine Learning

Unrolled computation graphs arise in many scenarios, including training RNNs, tuning hyperparameters through unrolled optimization, and training learned optimizers. Current approaches to optimizing parameters in such computation graphs suffer from high variance gradients, bias, slow updates, or large memory usage. We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. PES eliminates bias from these truncations by accumulating correction terms over the entire sequence of unrolls. PES allows for rapid parameter updates, has low memory usage, is unbiased, and has reasonable variance characteristics. We experimentally demonstrate the advantages of PES compared to several other methods for gradient estimation on synthetic tasks, and show its applicability to training learned optimizers and tuning hyperparameters.

The Statistical Complexity of Interactive Decision Making Machine Learning

A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret. This question is analogous to the classical problem of optimal (supervised) statistical learning, where there are well-known complexity measures (e.g., VC dimension and Rademacher complexity) that govern the statistical complexity of learning. However, characterizing the statistical complexity of interactive learning is substantially more challenging due to the adaptive nature of the problem. The main result of this work provides a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning. In particular, we provide: 1. a lower bound on the optimal regret for any interactive decision making problem, establishing the Decision-Estimation Coefficient as a fundamental limit. 2. a unified algorithm design principle, Estimation-to-Decisions (E2D), which transforms any algorithm for supervised estimation into an online algorithm for decision making. E2D attains a regret bound matching our lower bound, thereby achieving optimal sample-efficient learning as characterized by the Decision-Estimation Coefficient. Taken together, these results constitute a theory of learnability for interactive decision making. When applied to reinforcement learning settings, the Decision-Estimation Coefficient recovers essentially all existing hardness results and lower bounds. More broadly, the approach can be viewed as a decision-theoretic analogue of the classical Le Cam theory of statistical estimation; it also unifies a number of existing approaches -- both Bayesian and frequentist.

Class-aware Sounding Objects Localization via Audiovisual Correspondence Artificial Intelligence

Audiovisual scenes are pervasive in our daily life. It is commonplace for humans to discriminatively localize different sounding objects but quite challenging for machines to achieve class-aware sounding objects localization without category annotations, i.e., localizing the sounding object and recognizing its category. To address this problem, we propose a two-stage step-by-step learning framework to localize and recognize sounding objects in complex audiovisual scenarios using only the correspondence between audio and vision. First, we propose to determine the sounding area via coarse-grained audiovisual correspondence in the single source cases. Then visual features in the sounding area are leveraged as candidate object representations to establish a category-representation object dictionary for expressive visual character extraction. We generate class-aware object localization maps in cocktail-party scenarios and use audiovisual correspondence to suppress silent areas by referring to this dictionary. Finally, we employ category-level audiovisual consistency as the supervision to achieve fine-grained audio and sounding object distribution alignment. Experiments on both realistic and synthesized videos show that our model is superior in localizing and recognizing objects as well as filtering out silent ones. We also transfer the learned audiovisual network into the unsupervised object detection task, obtaining reasonable performance.

Towards a Science of Human-AI Decision Making: A Survey of Empirical Studies Artificial Intelligence

As AI systems demonstrate increasingly strong predictive performance, their adoption has grown in numerous domains. However, in high-stakes domains such as criminal justice and healthcare, full automation is often not desirable due to safety, ethical, and legal concerns, yet fully manual approaches can be inaccurate and time consuming. As a result, there is growing interest in the research community to augment human decision making with AI assistance. Besides developing AI technologies for this purpose, the emerging field of human-AI decision making must embrace empirical approaches to form a foundational understanding of how humans interact and work with AI to make decisions. To invite and help structure research efforts towards a science of understanding and improving human-AI decision making, we survey recent literature of empirical human-subject studies on this topic. We summarize the study design choices made in over 100 papers in three important aspects: (1) decision tasks, (2) AI models and AI assistance elements, and (3) evaluation metrics. For each aspect, we summarize current trends, discuss gaps in current practices of the field, and make a list of recommendations for future research. Our survey highlights the need to develop common frameworks to account for the design and research spaces of human-AI decision making, so that researchers can make rigorous choices in study design, and the research community can build on each other's work and produce generalizable scientific knowledge. We also hope this survey will serve as a bridge for HCI and AI communities to work together to mutually shape the empirical science and computational technologies for human-AI decision making.

Simulation Intelligence: Towards a New Generation of Scientific Methods Artificial Intelligence

The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offers immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use-cases for human-machine teaming and automated science.