Overview
Graph Learning: A Survey
Xia, Feng, Sun, Ke, Yu, Shuo, Aziz, Abdul, Wan, Liangtian, Pan, Shirui, Liu, Huan
Graphs are widely used to represent the network structure of connected data. Graph data can be found in a broad spectrum of application domains such as social systems, ecosystems, biological networks, knowledge graphs, and information systems. With the continuous penetration of artificial intelligence technologies, graph learning (i.e., machine learning on graphs) is gaining attention from both researchers and practitioners. Graph learning proves effective for many tasks, such as classification, link prediction, and matching. Generally, graph learning methods extract relevant features of graphs by taking advantage of machine learning algorithms. In this survey, we present a comprehensive overview of the state of the art in graph learning. Special attention is paid to four categories of existing graph learning methods: graph signal processing, matrix factorization, random walk, and deep learning. Major models and algorithms under these categories are reviewed in turn. We examine graph learning applications in areas such as text, images, science, knowledge graphs, and combinatorial optimization. In addition, we discuss several promising research directions in this field.
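Of the four method families named in this abstract, the random-walk family is the easiest to illustrate in a few lines. The sketch below is not taken from the survey: it assumes an unweighted graph stored as an adjacency dictionary and samples a uniform, fixed-length walk, the kind of node sequence that random-walk embedding methods typically feed to a downstream model. The function name `random_walk` and the toy graph are hypothetical.

```python
import random


def random_walk(adjacency, start, length, seed=None):
    """Sample a uniform random walk with at most `length` nodes.

    `adjacency` maps each node to a list of its neighbors (assumed format).
    The walk stops early if it reaches a node with no neighbors.
    """
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break  # dead end: nothing to step to
        walk.append(rng.choice(neighbors))
    return walk


# Toy undirected graph (hypothetical data): a triangle a-b-c with a tail c-d.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walk = random_walk(adjacency, start="a", length=6, seed=0)
```

Every consecutive pair in `walk` is an edge of the graph, so a collection of such walks encodes local neighborhood structure as plain sequences.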
Bayesian structure learning and sampling of Bayesian networks with the R package BiDAG
Suter, Polina, Kuipers, Jack, Moffa, Giusi, Beerenwinkel, Niko
A Bayesian network is a probabilistic graphical model, which represents conditional independence relationships between a set of random variables by a directed acyclic graph (DAG). The problem of DAG learning from observational data is hard (Chickering 1996), and the number of DAGs grows super-exponentially with the number of nodes. Hence, developing and implementing methods to learn an underlying DAG from observational data in reasonable time continues to be the focus of much research (Bartlett and Cussens 2017; Goudie and Mukherjee 2016; Scanagatta, de Campos, and Corani 2015). Drton and Maathuis (2017) provide an overview of the approaches for structure learning of graphical models including Bayesian networks. The R (R Development Core Team 2008) packages pcalg (Kalisch, Mächler, Colombo, Maathuis, and Bühlmann 2012), bnlearn (Scutari 2010), bnstruct (Franzin, Sambo, and Camillo 2017) and the Java-based toolbox TETRAD (Glymour, Scheines, Spirtes, and Ramsey 2017) implement multiple approaches to structure learning, including both constraint-based and search... (arXiv:2105.00488v1)
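The super-exponential growth in the number of DAGs can be made concrete with a short calculation. The sketch below is not part of the BiDAG package or its paper: it uses Robinson's recurrence for counting labeled DAGs, a(m) = sum over k = 1..m of (-1)^(k+1) * C(m, k) * 2^(k(m-k)) * a(m-k) with a(0) = 1, and the function name `count_dags` is hypothetical.

```python
from math import comb


def count_dags(n: int) -> int:
    """Count labeled DAGs on n nodes using Robinson's recurrence."""
    counts = [1]  # a(0) = 1: the empty graph
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            # k is the number of nodes with no incoming edges;
            # inclusion-exclusion alternates the sign with k.
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]


for n in range(1, 6):
    print(n, count_dags(n))
```

Five nodes already admit 29281 DAGs, which suggests why exhaustive enumeration is impractical for all but very small networks.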
Visually grounded models of spoken language: A survey of datasets, architectures and evaluation techniques
This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. Such models are inspired by the observation that when children pick up a language, they rely on a wide range of indirect and noisy clues, crucially including signals from the visual modality co-occurring with spoken utterances. Several fields have made important contributions to this approach to modeling or mimicking the process of learning language: Machine Learning, Natural Language and Speech Processing, Computer Vision and Cognitive Science. The current paper brings together these contributions in order to provide a useful introduction and overview for practitioners in all these areas. We discuss the central research questions addressed, the timeline of developments, and the datasets which enabled much of this work. We then summarize the main modeling architectures and offer an exhaustive overview of the evaluation metrics and analysis techniques.
Simulation: Cutting the Corner on Machine Learning
As the offshore oil and gas industry becomes more competitive, it actively pursues increased efficiency through innovative approaches while streamlining production, reducing costs, and improving safety. Many companies are looking at digitization to insulate themselves from market shocks, remain profitable at lower oil prices, and generate competitive advantage during recovery. The path forward lies in leveraging machine learning-based technologies that are maturing quickly and are being adopted across the value chain. The use of Machine Learning (ML) models is particularly promising for the resolution of problems involving processes that are not completely understood or where it is not feasible to run mechanistic models at desired resolutions in space and time. With these growing technologies, solutions to complex science and engineering problems require novel methodologies that can integrate physics-based modeling approaches with state-of-the-art ML techniques.
Geometric foundations of Deep Learning
In October 1872, the philosophy faculty of a small university in the Bavarian city of Erlangen appointed a new young professor. As customary, he was requested to deliver an inaugural research programme, which he published under the somewhat long and boring title Vergleichende Betrachtungen über neuere geometrische Forschungen ("A comparative review of recent researches in geometry"). The professor was Felix Klein, only 23 years of age at that time, and his inaugural work has entered the annals of mathematics as the "Erlangen Programme" [1]. The nineteenth century had been remarkably fruitful for geometry. For the first time in nearly two thousand years after Euclid, the construction of projective geometry by Poncelet, hyperbolic geometry by Gauss, Bolyai, and Lobachevsky, and elliptic geometry by Riemann showed that an entire zoo of diverse geometries was possible.
Labeled Bipolar Argumentation Frameworks
Escañuela Gonzalez, Melisa G. (Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) - Universidad Nacional de Santiago del Estero (UNSE)) | Budán, Maximiliano C. D. | Simari, Gerardo I. (Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) - Universidad Nacional del Sur (UNS)) | Simari, Guillermo R. (Universidad Nacional del Sur (UNS))
An essential part of argumentation-based reasoning is to identify arguments in favor and against a statement or query, select the acceptable ones, and then determine whether or not the original statement should be accepted. We present here an abstract framework that considers two independent forms of argument interaction--support and conflict--and is able to represent distinctive information associated with these arguments. This information can enable additional actions such as: (i) a more in-depth analysis of the relations between the arguments; (ii) a representation of the user's posture to help in focusing the argumentative process, optimizing the values of attributes associated with certain arguments; and (iii) an enhancement of the semantics taking advantage of the availability of richer information about argument acceptability. Thus, the classical semantic definitions are enhanced by analyzing a set of postulates they satisfy. Finally, a polynomial-time algorithm to perform the labeling process is introduced, in which the argument interactions are considered.
Ontology-based Feature Selection: A Survey
Sikelis, Konstantinos, Tsekouras, George E, Kotis, Konstantinos I
The Semantic Web emerged as an extension to the traditional Web, towards adding meaning to a distributed Web of structured and linked data. At its core, the concept of ontology provides the means to semantically describe and structure information and data and expose it to software and human agents in a machine and human-readable form. For software agents to be realized, it is crucial to develop powerful artificial intelligence and machine learning techniques, able to extract knowledge from information and data sources and represent it in the underlying ontology. This survey aims to provide insight into key aspects of ontology-based knowledge extraction, from various sources such as text, images, databases and human expertise, with emphasis on the task of feature selection. First, some of the most common classification and feature selection algorithms are briefly presented. Then, selected methodologies, which utilize ontologies to represent features and perform feature selection and classification, are described. The presented examples span diverse application domains, e.g., medicine, tourism, mechanical and civil engineering, and demonstrate the feasibility and applicability of such methods.
Explanation-Based Human Debugging of NLP Models: A Survey
Lertvittayakumjorn, Piyawat, Toni, Francesca
It is gaining more and more attention these days since explanations are necessary in several applications, especially in high-stake domains such as healthcare, law, transportation, and finance (Adadi and Berrada, 2018). Some researchers have explored various merits of explanations to humans, such as supporting human decision makings (Lai and Tan, [...]
[...] (2017) considered bugs as implementation errors, similar to software bugs, while Cadamuro et al. (2016) defined a bug as a particularly damaging or inexplicable test error. In this paper, we follow the definition of (model) bugs from Adebayo et al. (2020) as contamination in the learning and/or prediction pipeline that makes the model produce incorrect predictions or learn error-causing associations.
Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety
Houben, Sebastian, Abrecht, Stephanie, Akila, Maram, Bär, Andreas, Brockherde, Felix, Feifel, Patrick, Fingscheidt, Tim, Gannamaneni, Sujan Sai, Ghobadi, Seyed Eghbal, Hammam, Ahmed, Haselhoff, Anselm, Hauser, Felix, Heinzemann, Christian, Hoffmann, Marco, Kapoor, Nikhil, Kappel, Falk, Klingner, Marvin, Kronenberger, Jan, Küppers, Fabian, Löhdefink, Jonas, Mlynarski, Michael, Mock, Michael, Mualla, Firas, Pavlitskaya, Svetlana, Poretschkin, Maximilian, Pohl, Alexander, Ravi-Kumar, Varun, Rosenzweig, Julia, Rottmann, Matthias, Rüping, Stefan, Sämann, Timo, Schneider, Jan David, Schulz, Elena, Schwalbe, Gesina, Sicking, Joachim, Srivastava, Toshika, Varghese, Serin, Weber, Michael, Wirkert, Sebastian, Wirtz, Tim, Woehrle, Matthias
The use of deep neural networks (DNNs) in safety-critical applications like mobile health and autonomous driving is challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability to problems with malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from safety concerns. In recent years, a zoo of state-of-the-art techniques aiming to address these safety concerns has emerged. This work provides a structured and broad overview of them. We first identify categories of insufficiencies to then describe research activities aiming at their detection, quantification, or mitigation. Our paper addresses both machine learning experts and safety engineers: The former ones might profit from the broad range of machine learning topics covered and discussions on limitations of recent methods. The latter ones might gain insights into the specifics of modern ML methods. We moreover hope that our contribution fuels discussions on desiderata for ML systems and strategies on how to propel existing approaches accordingly.
The Interspeech Zero Resource Speech Challenge 2021: Spoken language modelling
Dunbar, Ewan, Bernard, Mathieu, Hamilakis, Nicolas, Nguyen, Tu Anh, de Seyssel, Maureen, Rozé, Patricia, Rivière, Morgane, Kharitonov, Eugene, Dupoux, Emmanuel
We present the Zero Resource Speech Challenge 2021, which asks participants to learn a language model directly from audio, without any text or labels. The challenge is based on the Libri-light dataset, which provides up to 60k hours of audio from English audio books without any associated text. We provide a pipeline baseline system consisting of an encoder based on contrastive predictive coding (CPC), a quantizer ($k$-means) and a standard language model (BERT or LSTM). The metrics evaluate the learned representations at the acoustic (ABX discrimination), lexical (spot-the-word), syntactic (acceptability judgment) and semantic levels (similarity judgment). We present an overview of the eight submitted systems from four groups and discuss the main results.
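The quantizer stage of the baseline pipeline can be illustrated with a toy example. The sketch below is not the challenge's baseline code: it assumes the encoder has already produced one feature vector per audio frame (two-dimensional made-up values stand in for CPC features), fits a plain Lloyd-style $k$-means, and maps each frame to the index of its nearest centroid. The names `fit_kmeans`, `quantize`, and `frames` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, centroids):
    """Map each frame to the index of its nearest centroid (its discrete unit)."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(f, centroids[j]))
        for f in frames
    ]


def fit_kmeans(frames, k, n_iter=20, seed=0):
    """Plain Lloyd iterations, initialized from k randomly chosen frames."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(n_iter):
        units = quantize(frames, centroids)
        for j in range(k):
            members = [f for f, u in zip(frames, units) if u == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Made-up frame features standing in for encoder output (assumption).
frames = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [0.05, 0.05]]
centroids = fit_kmeans(frames, k=2)
units = quantize(frames, centroids)  # one discrete unit index per frame
```

The resulting `units` sequence plays the role of pseudo-text: in a pipeline of this shape, the language model would be trained on such unit sequences instead of on words or characters.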