Weighted Embeddings for Low-Dimensional Graph Representation
Bläsius, Thomas, von der Heydt, Jean-Pierre, Katzmann, Maximilian, Maas, Nikolai
Learning low-dimensional numerical representations from symbolic data, e.g., embedding the nodes of a graph into a geometric space, is an important concept in machine learning. While embedding into Euclidean space is common, recent observations indicate that hyperbolic geometry is better suited to represent hierarchical information and heterogeneous data (e.g., graphs with a scale-free degree distribution). Despite their potential for more accurate representations, hyperbolic embeddings also have downsides like being more difficult to compute and harder to use in downstream tasks. We propose embedding into a weighted space, which is closely related to hyperbolic geometry but mathematically simpler. We provide the embedding algorithm WEmbed and demonstrate, based on generated as well as over 2000 real-world graphs, that our weighted embeddings heavily outperform state-of-the-art Euclidean embeddings for heterogeneous graphs while using fewer dimensions. The running time of WEmbed and embedding quality for the remaining instances is on par with state-of-the-art Euclidean embedders.
Towards Systematic Monolingual NLP Surveys: GenA of Greek NLP
Bakagianni, Juli, Pouli, Kanella, Gavriilidou, Maria, Pavlopoulos, John
Natural Language Processing (NLP) research has traditionally been predominantly focused on English, driven by the availability of resources, the size of the research community, and market demands. Recently, there has been a noticeable shift towards multilingualism in NLP, recognizing the need for inclusivity and effectiveness across diverse languages and cultures. Monolingual surveys have the potential to complement the broader trend towards multilingualism in NLP by providing foundational insights and resources necessary for effectively addressing the linguistic diversity of global communication. However, monolingual NLP surveys are extremely rare in the literature. This study fills the gap by introducing a method for creating systematic and comprehensive monolingual NLP surveys. Characterized by a structured search protocol, it can be used to select publications and organize them through a taxonomy of NLP tasks. We include a classification of Language Resources (LRs), according to their availability, and datasets, according to their annotation, to highlight publicly-available and machine-actionable LRs. By applying our method, we conducted a systematic literature review of Greek NLP from 2012 to 2022, providing a comprehensive overview of the current state and challenges of Greek NLP research. We discuss the progress of Greek NLP and outline encountered Greek LRs, classified by availability and usability. As we show, our proposed method helps to avoid common pitfalls, such as data leakage and contamination, and to assess language support per NLP task. We consider this systematic literature review of Greek NLP an application of our method that showcases the benefits of a monolingual NLP survey. Similar applications could be considered for the myriad of languages whose progress in NLP lags behind that of well-supported languages.
Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees
Jaber, Edgar, Blot, Vincent, Brunel, Nicolas, Chabridon, Vincent, Remy, Emmanuel, Iooss, Bertrand, Lucor, Didier, Mougeot, Mathilde, Leite, Alessandro
Gaussian processes (GPs) are a Bayesian machine learning approach widely used to construct surrogate models for the uncertainty quantification of computer simulation codes in industrial applications. They provide both a mean predictor and an estimate of the posterior prediction variance, the latter being used to produce Bayesian credibility intervals. Interpreting these intervals relies on the Gaussianity of the simulation model as well as the well-specification of the priors, which are not always appropriate. We propose to address this issue with the help of conformal prediction. In the present work, a method for building adaptive cross-conformal prediction intervals is proposed by weighting the non-conformity score with the posterior standard deviation of the GP. The resulting conformal prediction intervals exhibit a level of adaptivity akin to Bayesian credibility sets and display a significant correlation with the surrogate model local approximation error, while being free from the underlying model assumptions and having frequentist coverage guarantees. These estimators can thus be used for evaluating the quality of a GP surrogate model and can assist a decision-maker in the choice of the best prior for the specific application of the GP. The performance of the method is illustrated through a panel of numerical examples based on various reference databases. Moreover, the potential applicability of the method is demonstrated in the context of surrogate modeling of an expensive-to-evaluate simulator of the clogging phenomenon in steam generators of nuclear reactors.
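The core idea in the abstract above, scaling the non-conformity score by the GP posterior standard deviation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses the simpler split-conformal variant (one held-out calibration set) rather than the cross-conformal scheme the abstract describes, and the function name and parameters are hypothetical. The posterior means and standard deviations are assumed to come from an already fitted GP.

```python
import numpy as np


def normalized_conformal_interval(y_cal, mu_cal, sd_cal, mu_new, sd_new, alpha=0.1):
    """Split-conformal prediction interval with std-normalized scores.

    y_cal, mu_cal, sd_cal: calibration targets, GP posterior means and
        GP posterior standard deviations (all hypothetical inputs).
    mu_new, sd_new: GP posterior mean and standard deviation at new points.
    alpha: target miscoverage level (coverage is at least 1 - alpha
        under exchangeability of calibration and new data).
    """
    y_cal = np.asarray(y_cal, dtype=float)
    mu_cal = np.asarray(mu_cal, dtype=float)
    sd_cal = np.asarray(sd_cal, dtype=float)
    mu_new = np.asarray(mu_new, dtype=float)
    sd_new = np.asarray(sd_new, dtype=float)

    # Non-conformity score: absolute residual weighted by the posterior std.
    scores = np.abs(y_cal - mu_cal) / sd_cal

    # Finite-sample corrected quantile: the k-th smallest score,
    # with k = ceil((n + 1) * (1 - alpha)).
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.inf if k > n else np.sort(scores)[k - 1]

    # The interval width adapts to the local posterior std.
    return mu_new - q * sd_new, mu_new + q * sd_new


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for GP outputs on a calibration set.
    mu_cal = rng.normal(size=200)
    sd_cal = rng.uniform(0.5, 2.0, size=200)
    y_cal = mu_cal + sd_cal * rng.normal(size=200)
    lower, upper = normalized_conformal_interval(
        y_cal, mu_cal, sd_cal, mu_new=[0.0, 1.0], sd_new=[0.5, 2.0], alpha=0.1
    )
    print(lower, upper)
```

Because the calibrated quantile multiplies the posterior standard deviation, points where the GP is less certain receive wider intervals, which is the adaptivity the abstract refers to.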
Reliable Probabilistic Classification with Neural Networks
They have been applied to a great variety of problems and fields with very good results. However, most machine learning techniques do not provide any indication about the uncertainty of each of their predictions, which would have been very beneficial for most applications and especially for risk sensitive settings such as medical diagnosis [1]. An indication of the likelihood of each prediction being correct notifies the user of a system about how much he can rely on each prediction and enables him to take more informed decisions. A solution to this problem was given by a recently developed machine learning theory called Conformal Prediction (CP) [2]. CP can be used for extending traditional machine learning algorithms and developing methods (called Conformal Predictors) whose predictions are guaranteed to satisfy a given level of confidence without assuming anything more than that the data are independently and identically distributed (i.i.d.). More specifically, CPs produce as their predictions a set containing all the possible classifications needed to satisfy the required confidence level. To date many different CPs have been developed, see e.g.
Guaranteed Coverage Prediction Intervals with Gaussian Process Regression
Gaussian Process Regression (GPR) is a popular regression method which, unlike most Machine Learning techniques, provides estimates of uncertainty for its predictions. These uncertainty estimates, however, are based on the assumption that the model is well-specified, an assumption that is violated in most practical applications, since the required knowledge is rarely available. As a result, the produced uncertainty estimates can become very misleading; for example, the prediction intervals (PIs) produced for the 95% confidence level may cover much less than 95% of the true labels. To address this issue, this paper introduces an extension of GPR based on a Machine Learning framework called Conformal Prediction (CP). This extension guarantees the production of PIs with the required coverage even when the model is completely misspecified. The proposed approach combines the advantages of GPR with the valid coverage guarantee of CP, while the experimental results demonstrate its superiority over existing methods.
Diversity-aware social robots meet people: beyond context-aware embodied AI
Recchiuto, Carmine, Sgorbissa, Antonio
Mayra is a 34-year-old woman from Sri Lanka who arrived in Genoa in 2020, just before the COVID-19 pandemic. Mayra spends her days taking care of her three children and doing housework. Due to the lockdown measures, she had few opportunities to develop relationships with Italian people, so her Italian has remained very basic. The situation did not improve until December 2021 because finding a job was challenging due to the remaining COVID restrictions. In January 2022, her husband bought a small robot, and Mayra called it "Dhvija."
A Deep Learning Framework for Wind Turbine Repair Action Prediction Using Alarm Sequences and Long Short Term Memory Algorithms
Walker, Connor, Rothon, Callum, Aslansefat, Koorosh, Papadopoulos, Yiannis, Dethlefs, Nina
With an increasing emphasis on driving down the costs of Operations and Maintenance (O&M) in the Offshore Wind (OSW) sector comes the requirement to explore new methodology and applications of Deep Learning (DL) to the domain. Condition-based monitoring (CBM) has been at the forefront of recent research developing alarm-based systems and data-driven decision making. This paper provides a brief insight into the research being conducted in this area, with a specific focus on alarm sequence modelling and the associated challenges faced in its implementation. The paper proposes a novel idea to predict a set of relevant repair actions from an input sequence of alarm sequences, comparing Long Short-term Memory (LSTM) and Bidirectional LSTM (biLSTM) models. Achieving training accuracy results of up to 80.23%, and test accuracy results of up to 76.01% with biLSTM gives a strong indication of the potential benefits of the proposed approach that can be furthered in future research. The paper introduces a framework that integrates the proposed approach into O&M procedures and discusses the potential benefits, which include the reduction of a confusing plethora of alarms, as well as unnecessary vessel transfers to the turbines for fault diagnosis and correction.
SafeDrones: Real-Time Reliability Evaluation of UAVs using Executable Digital Dependable Identities
Aslansefat, Koorosh, Nikolaou, Panagiota, Walker, Martin, Akram, Mohammed Naveed, Sorokos, Ioannis, Reich, Jan, Kolios, Panayiotis, Michael, Maria K., Theocharides, Theocharis, Ellinas, Georgios, Schneider, Daniel, Papadopoulos, Yiannis
The use of Unmanned Aerial Vehicles (UAVs) offers many advantages across a variety of applications. However, safety assurance is a key barrier to widespread usage, especially given the unpredictable operational and environmental factors experienced by UAVs, which are hard to capture solely at design-time. This paper proposes a new reliability modeling approach called SafeDrones to help address this issue by enabling runtime reliability and risk assessment of UAVs. It is a prototype instantiation of the Executable Digital Dependable Identity (EDDI) concept, which aims to create a model-based solution for real-time, data-driven dependability assurance for multi-robot systems. By providing real-time reliability estimates, SafeDrones allows UAVs to update their missions accordingly in an adaptive manner.
AI Squared raises $6M to help integrate AI into existing apps – TechCrunch
Integration platform AI Squared announced today the closing of a $6 million seed round led by NEA with participation from Ridgeline Partners. Launched in 2021, AI Squared helps companies adopt artificial intelligence by using a low-code platform to integrate it into existing applications in a timely and straightforward manner. Its founder, Benjamin Harvey, was inspired to start the company after a decade at the U.S. National Security Agency, where he saw how it and other organizations struggled to adopt artificial intelligence into existing applications. The struggle came from what's known as the last-mile challenge, which refers to the costly and time-consuming process of implementing an AI model within an application used on a day-to-day basis, like Netflix's program recommender system, he told TechCrunch. AI Squared helps solve the last-mile problem, assisting companies in adopting AI by using a low-code platform to integrate it into existing applications.
Evaluating brain MRI scans with the help of artificial intelligence
Greece is just one example of a population where the share of older people is expanding, and with it the incidence of neurodegenerative diseases. Among these, Alzheimer's disease is the most prevalent, accounting for 70% of neurodegenerative disease cases in Greece. According to estimates published by the Alzheimer Society of Greece, 197,000 people are suffering from the disease at present. This number is expected to rise to 354,000 by 2050. Dr. Andreas Papadopoulos, a physician and scientific coordinator at Iatropolis Medical Group, a leading diagnostic provider near Athens, Greece, explains the key role of early diagnosis: "The likelihood of developing Alzheimer's may be only 1% to 2% at age 65. But then it doubles every five years. Existing drugs cannot reverse the course of the degeneration; they can only slow it down. This is why it's crucial to make the right diagnosis in the preliminary stages--when the first mild cognitive disorder appears--and to filter out Alzheimer's patients."