Overview
A Review of the Role of Causality in Developing Trustworthy AI Systems
Ganguly, Niloy, Fazlija, Dren, Badar, Maryam, Fisichella, Marco, Sikdar, Sandipan, Schrader, Johanna, Wallat, Jonas, Rudra, Koustav, Koubarakis, Manolis, Patro, Gourab K., Amri, Wadhah Zai El, Nejdl, Wolfgang
As a result, they are often brittle and unable to adapt to new domains, can treat individuals or subgroups unfairly, and have limited ability to explain their actions or recommendations [197, 235], reducing the trust of human users [118]. Following this, a new area of research, trustworthy AI, has recently received much attention from policymakers and other regulatory organizations. The resulting guidelines (e.g., [184, 186, 187]), introduced to increase trust in AI systems, make developing trustworthy AI not only a technical (research) and social endeavor but also an organizational and legal obligation. In this paper, we set out to demonstrate, through an extensive survey, that causal modeling and reasoning is an emerging and very useful tool for enabling current AI systems to become trustworthy. Causality is the science of reasoning about causes and effects. Cause-and-effect relationships are central to how we make sense of the world around us, how we act upon it, and how we respond to changes in our environment. In AI, research in causality was pioneered by Turing Award winner Judea Pearl in his seminal 1995 paper [194]. Since then, many researchers have contributed to the development of a solid mathematical basis for causality; see, for example, the books [79, 196, 201], the survey [90], and the seminal papers [197, 235].
Few-shot learning approaches for classifying low resource domain specific software requirements
Nayak, Anmol, Timmapathini, Hari Prasad, Murali, Vidhya, Gohad, Atul Anil
With the advent of strong pre-trained natural language processing models such as BERT, DeBERTa, MiniLM, and T5, the amount of data industries need to fine-tune these models for their niche use cases has dropped drastically (typically to a few hundred annotated samples for reasonable performance). However, the availability of even a few hundred annotated samples may not always be guaranteed in low resource domains like automotive, which often limits the usage of such deep learning models in an industrial setting. In this paper we aim to address the challenge of fine-tuning such pre-trained models with only a few annotated samples, also known as few-shot learning. Our experiments focus on evaluating the performance of a diverse set of algorithms and methodologies on the task of classifying BOSCH automotive domain textual software requirements into 3 categories, while utilizing only 15 annotated samples per category for fine-tuning. We find that while SciBERT and DeBERTa based models tend to be the most accurate at 15 training samples, their performance improves only minimally as the number of annotated samples is increased to 50, in comparison to Siamese and T5 based models.
Multi-teacher knowledge distillation as an effective method for compressing ensembles of neural networks
Deep learning has contributed greatly to many successes in artificial intelligence in recent years. Today, it is possible to train models that have thousands of layers and hundreds of billions of parameters. Large-scale deep models have achieved great success, but their enormous computational complexity and storage requirements make it extremely difficult to deploy them in real-time applications. On the other hand, the size of the dataset is still a real problem in many domains. Data are often missing, too expensive, or impossible to obtain for other reasons. Ensemble learning is a partial solution to the problem of small datasets and overfitting. However, ensemble learning in its basic version is associated with a linear increase in computational complexity. We analyzed the impact of the ensemble decision-fusion mechanism and examined various methods of combining the decisions, including voting algorithms. We used a modified knowledge distillation framework as a decision-fusion mechanism, which additionally allows the entire ensemble to be compressed into the weight space of a single model. We showed that knowledge distillation can aggregate knowledge from multiple teachers into a single student model and, at the same computational complexity, yield a better-performing model than one trained in the standard manner. We developed our own method for mimicking the responses of all teachers simultaneously. We tested these solutions on several benchmark datasets. Finally, we demonstrated the broad applicability of the efficient multi-teacher knowledge distillation framework. In the first example, we used knowledge distillation to develop models that could automate corrosion detection on aircraft fuselage. The second example describes smoke detection on observation cameras to counteract forest wildfires.
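The idea of a single student mimicking several teachers at once can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the teachers' temperature-softened class probabilities are simply averaged into one soft target and that the student is penalized by the KL divergence to that target. The function names and the default temperature of 2.0 are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def average_teacher_targets(teacher_logits, temperature=2.0):
    """Assumed fusion rule: average the softened distributions of all teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / len(probs) for k in range(n_classes)]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(target || student) between the averaged teacher target and the student."""
    target = average_teacher_targets(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)

# Two hypothetical teachers and one student on a 3-class problem.
teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
student = [1.0, 1.0, 0.0]
loss = distillation_loss(student, teachers)
```

In practice this term is usually combined with a standard cross-entropy loss on the ground-truth labels; the weighting between the two is a further design choice not specified in the abstract.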
Hitchhiker's Guide to Super-Resolution: Introduction and Recent Advances
Moser, Brian, Raue, Federico, Frolov, Stanislav, Hees, Jörn, Palacio, Sebastian, Dengel, Andreas
With the advent of Deep Learning (DL), Super-Resolution (SR) has also become a thriving research area. However, despite promising results, the field still faces challenges that require further research, e.g., allowing flexible upsampling, more effective loss functions, and better evaluation metrics. We review the domain of SR in light of recent advances, and examine state-of-the-art models such as diffusion (DDPM) and transformer-based SR models. We present a critical discussion on contemporary strategies used in SR, and identify promising yet unexplored research directions. We complement previous surveys by incorporating the latest developments in the field such as uncertainty-driven losses, wavelet networks, neural architecture search, novel normalization methods, and the latest evaluation techniques. We also include several visualizations for the models and methods throughout each chapter in order to facilitate a global understanding of the trends in the field. This review is ultimately aimed at helping researchers to push the boundaries of DL applied to SR.
A Survey on Active Simultaneous Localization and Mapping: State of the Art and New Frontiers
Placed, Julio A., Strader, Jared, Carrillo, Henry, Atanasov, Nikolay, Indelman, Vadim, Carlone, Luca, Castellanos, José A.
Active Simultaneous Localization and Mapping (SLAM) is the problem of planning and controlling the motion of a robot to build the most accurate and complete model of the surrounding environment. Since the first foundational work in active perception appeared, more than three decades ago, this field has received increasing attention across different scientific communities. This has brought about many different approaches and formulations, and makes a review of the current trends necessary and extremely valuable for both new and experienced researchers. In this work, we survey the state-of-the-art in active SLAM and take an in-depth look at the open challenges that still require attention to meet the needs of modern applications. After providing a historical perspective, we present a unified problem formulation and review the well-established modular solution scheme, which decouples the problem into three stages that identify, select, and execute potential navigation actions. We then analyze alternative approaches, including belief-space planning and deep reinforcement learning techniques, and review related work on multi-robot coordination. The manuscript concludes with a discussion of new research directions, addressing reproducible research, active spatial perception, and practical applications, among other topics.
Evaluation of Word Embeddings for the Social Sciences
Schiffers, Ricardo, Kern, Dagmar, Hienert, Daniel
Word embeddings are an essential instrument in many NLP tasks. Most available resources are trained on general language from Web corpora or Wikipedia dumps. However, word embeddings for domain-specific language are rare, in particular for the social science domain. Therefore, in this work, we describe the creation and evaluation of word embedding models based on 37,604 open-access social science research papers. In the evaluation, we compare domain-specific and general language models for (i) language coverage, (ii) diversity, and (iii) semantic relationships. We found that the created domain-specific model, even with a relatively small vocabulary size, covers a large part of social science concepts, and that the neighborhoods of these concepts are diverse in comparison to more general models. Across all relation types, we found a more extensive coverage of semantic relationships.
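Language coverage, the first evaluation criterion above, can be illustrated with a small sketch. It assumes coverage is measured as the fraction of domain concept terms that appear in a model's vocabulary, with case-insensitive matching of single tokens; the abstract does not state the exact definition, and the function name and sample terms are hypothetical.

```python
def vocabulary_coverage(vocabulary, concepts):
    """Fraction of concept terms found in the embedding vocabulary (assumed metric)."""
    if not concepts:
        return 0.0
    vocab = {word.lower() for word in vocabulary}
    found = sum(1 for concept in concepts if concept.lower() in vocab)
    return found / len(concepts)

# Hypothetical vocabulary and concept list for illustration only.
model_vocab = {"survey", "panel", "cohort"}
domain_concepts = ["survey", "Cohort", "anomie", "panel"]
share = vocabulary_coverage(model_vocab, domain_concepts)
```

Multi-word concepts would need a tokenization or phrase-detection step before lookup, which this sketch leaves out.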
Review on Efficient Strategies for Coordinated Motion and Tracking in Swarm Robotics
Swarm robotics is an approach to organizing multi-robot systems that consist of many simple robots and is inspired by social insects. The most remarkable attribute of swarm robots is their capacity to work together to accomplish a collective objective. This paper reviews current surveys, problems, and algorithms that have arisen in research on coordinated movement in swarm robotics. Algorithms for swarm movement are compared with respect to how swarm micro-robots accomplish aggregation, creation, and clamouring, based on the computational simulations reported for each algorithm.
Towards Explainable Visual Anomaly Detection
Wang, Yizhou, Guo, Dongliang, Li, Sheng, Fu, Yun
Anomaly detection and localization of visual data, including images and videos, are of great significance in both machine learning academia and applied real-world scenarios. Despite the rapid development of visual anomaly detection techniques in recent years, interpretations of these black-box models and reasonable explanations of why anomalies can be distinguished are scarce. This paper provides the first survey focused on explainable visual anomaly detection methods. We first introduce the basic background of image-level anomaly detection and video-level anomaly detection, followed by the current explainable approaches for visual anomaly detection. Then, as the main content of this survey, a comprehensive literature review of explainable anomaly detection methods for both images and videos is presented. Finally, we discuss several promising future directions and open problems to explore on the explainability of visual anomaly detection.
Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models
Gregucci, Cosimo, Nayyeri, Mojtaba, Hernández, Daniel, Staab, Steffen
Predicting missing links between entities in a knowledge graph is a fundamental task to deal with the incompleteness of data on the Web. Knowledge graph embeddings map nodes into a vector space to predict new links, scoring them according to geometric criteria. Relations in the graph may follow patterns that can be learned, e.g., some relations might be symmetric and others might be hierarchical. However, the learning capability of different embedding models varies for each pattern and, so far, no single model can learn all patterns equally well. In this paper, we combine the query representations from several models in a unified one to incorporate patterns that are independently captured by each model. Our combination uses attention to select the most suitable model to answer each query. The models are also mapped onto a non-Euclidean manifold, the Poincaré ball, to capture structural patterns, such as hierarchies, besides relational patterns, such as symmetry. We prove that our combination provides a higher expressiveness and inference power than each model on its own. As a result, the combined model can learn relational and structural patterns. We conduct extensive experimental analysis with various link prediction benchmarks showing that the combined model outperforms individual models, including state-of-the-art approaches.
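The two ingredients named above, attention over per-model query representations and scoring in the Poincaré ball, can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: attention scores are taken as dot products with a hypothetical learned vector, the combination is a plain weighted average, and the distance is the standard Poincaré ball distance. All function and variable names are invented for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_combine(query_reprs, attention_vector):
    """Weight each model's query representation by an attention score (assumed dot product)."""
    scores = [sum(q * a for q, a in zip(rep, attention_vector)) for rep in query_reprs]
    weights = softmax(scores)
    dim = len(query_reprs[0])
    combined = [sum(w * rep[k] for w, rep in zip(weights, query_reprs)) for k in range(dim)]
    return combined, weights

def poincare_distance(u, v):
    """Distance in the Poincare ball: arcosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2)))."""
    def sq_norm(x):
        return sum(c * c for c in x)
    diff = [a - b for a, b in zip(u, v)]
    return math.acosh(1 + 2 * sq_norm(diff) / ((1 - sq_norm(u)) * (1 - sq_norm(v))))

# Hypothetical query representations from two embedding models, inside the unit ball.
reprs = [[0.2, 0.1], [0.4, -0.3]]
combined, weights = attention_combine(reprs, attention_vector=[1.0, 0.5])
candidate = [0.3, 0.0]
score = -poincare_distance(combined, candidate)  # closer candidates score higher
```

Because the open unit ball is convex, a weighted average of points inside it stays inside, so the distance remains well defined in this simplified setting.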
Evolution of SLAM: Toward the Robust-Perception of Autonomy
Simultaneous localisation and mapping (SLAM) is the problem of an autonomous robot constructing or updating a map of an unknown, unstructured environment while simultaneously estimating its pose within it. The current trend towards self-driving vehicles has influenced the development of robust SLAM techniques over the last 30 years. The problem is addressed by using a standard sensor or a sensor array (ultrasonic sensor, LIDAR, camera, Kinect RGB-D) with sensor fusion techniques to achieve the perception step. The sensing method is chosen by considering the characteristics of the environment from which features are to be extracted. Classical filter-based approaches, the global optimisation approach popular in visual SLAM, and deep learning-based SLAM methods built on convolutional neural networks are then discussed, with attention to how each overcomes localisation and mapping issues. The algorithms are compared with each other in terms of robustness and scalability in long-term autonomy, performance, and other new directions. This paper reviews previously published work from a critical perspective, from sensors to algorithm development, while discussing open challenges and new research frontiers.