Goto

Collaborating Authors

 Overview


A Review of SHACL: From Data Validation to Schema Reasoning for RDF Graphs

arXiv.org Artificial Intelligence

We present an introduction and a review of the Shapes Constraint Language (SHACL), the W3C recommendation language for validating RDF data. A SHACL document describes a set of constraints on RDF nodes, and a graph is valid with respect to the document if its nodes satisfy these constraints. We revisit the basic concepts of the language, its constructs and components and their interaction. We review the different formal frameworks used to study this language and the different semantics proposed. We examine a number of related problems, from containment and satisfiability to the interaction of SHACL with inference rules, and exhibit how different modellings of the language are useful for different problems. We also cover practical aspects of SHACL, discussing its implementations and state of adoption, to present a holistic review useful to practitioners and theoreticians alike.


LOGEN: Few-shot Logical Knowledge-Conditioned Text Generation with Self-training

arXiv.org Artificial Intelligence

Natural language generation from structured data mainly focuses on surface-level descriptions, suffering from uncontrollable content selection and low fidelity. Previous works leverage logical forms to facilitate logical knowledge-conditioned text generation. Though achieving remarkable progress, they are data-hungry, which makes the adoption for real-world applications challenging with limited data. To this end, this paper proposes a unified framework for logical knowledge-conditioned text generation in the few-shot setting. With only a few seeds logical forms (e.g., 20/100 shot), our approach leverages self-training and samples pseudo logical forms based on content and structure consistency. Experimental results demonstrate that our approach can obtain better few-shot performance than baselines.


A Brief Overview of Methods to Explain AI (XAI)

#artificialintelligence

I know this topic has been discussed many times. But I recently gave some talks on interpretability (for SCAI and France Innovation) and thought it would be good to include some of my work in this article. The importance of explainability for the decision-making process in machine learning doesn't need to be proved any longer. Users are demanding more explanations, and although there are no uniform and strict definitions of interpretability and explainability, the number of scientific papers explaining artificial intelligence (or XAI) is growing exponentially. As you may know, there are two ways to design an interpretable machine learning process.


A Survey on Scenario-Based Testing for Automated Driving Systems in High-Fidelity Simulation

arXiv.org Artificial Intelligence

Automated Driving Systems (ADSs) have seen rapid progress in recent years. To ensure the safety and reliability of these systems, extensive testings are being conducted before their future mass deployment. Testing the system on the road is the closest to real-world and desirable approach, but it is incredibly costly. Also, it is infeasible to cover rare corner cases using such real-world testing. Thus, a popular alternative is to evaluate an ADS's performance in some well-designed challenging scenarios, a.k.a. scenario-based testing. High-fidelity simulators have been widely used in this setting to maximize flexibility and convenience in testing what-if scenarios. Although many works have been proposed offering diverse frameworks/methods for testing specific systems, the comparisons and connections among these works are still missing. To bridge this gap, in this work, we provide a generic formulation of scenario-based testing in high-fidelity simulation and conduct a literature review on the existing works. We further compare them and present the open challenges as well as potential future research directions.


Unsupervised detection and open-set classification of fast-ramped flexibility activation events

arXiv.org Machine Learning

The continuous electrification of the mobility and heating sectors adds much-needed flexibility to the power system. However, flexibility utilization also introduces new challenges to distribution system operators (DSOs), who need mechanisms to supervise flexibility activations and monitor their effect on distribution network operation. Flexibility activations can be broadly categorized to those originating from electricity markets and those initiated by the DSO to avoid constraint violations. Simultaneous electricity market driven flexibility activations may cause voltage quality or temporary overloading issues, and the failure of flexibility activations initiated by the DSO might leave critical grid states unresolved. This work proposes a novel data processing pipeline for automated real-time identification of fast-ramped flexibility activation events. Its practical value is twofold: i) potentially critical flexibility activations originating from electricity markets can be detected by the DSO at an early stage, and ii) successful activation of DSO-requested flexibility can be verified by the operator. In both cases the increased awareness would allow the DSO to take counteractions to avoid potentially critical grid situations. The proposed pipeline combines techniques from unsupervised detection and open-set classification. For both building blocks feasibility is systematically evaluated and proofed on real load and flexibility activation data.


Outlier Detection using AI: A Survey

arXiv.org Artificial Intelligence

An outlier is an event or observation that is defined as an unusual activity, intrusion, or a suspicious data point that lies at an irregular distance from a population. The definition of an outlier event, however, is subjective and depends on the application and the domain (Energy, Health, Wireless Network, etc.). It is important to detect outlier events as carefully as possible to avoid infrastructure failures because anomalous events can cause minor to severe damage to infrastructure. For instance, an attack on a cyber-physical system such as a microgrid may initiate voltage or frequency instability, thereby damaging a smart inverter which involves very expensive repairing. Unusual activities in microgrids can be mechanical faults, behavior changes in the system, human or instrument errors or a malicious attack. Accordingly, and due to its variability, Outlier Detection (OD) is an ever-growing research field. In this chapter, we discuss the progress of OD methods using AI techniques. For that, the fundamental concepts of each OD model are introduced via multiple categories. Broad range of OD methods are categorized into six major categories: Statistical-based, Distance-based, Density-based, Clustering-based, Learning-based, and Ensemble methods. For every category, we discuss recent state-of-the-art approaches, their application areas, and performances. After that, a brief discussion regarding the advantages, disadvantages, and challenges of each technique is provided with recommendations on future research directions. This survey aims to guide the reader to better understand recent progress of OD methods for the assurance of AI.


A Comprehensive Survey on the Convergence of Vehicular Social Networks and Fog Computing

arXiv.org Artificial Intelligence

In recent years, the number of IoT devices has been growing fast which leads to a challenging task for managing, storing, analyzing, and making decisions about raw data from different IoT devices, especially for delay-sensitive applications. In a vehicular network (VANET) environment, the dynamic nature of vehicles makes the current open research issues even more challenging due to the frequent topology changes that can lead to disconnections between vehicles. To this end, a number of research works have been proposed in the context of cloud and fog computing over the 5G infrastructure. On the other hand, there are a variety of research proposals that aim to extend the connection time between vehicles. Vehicular Social Networks (VSNs) have been defined to decrease the burden of connection time between the vehicles. This survey paper first provides the necessary background information and definitions about fog, cloud and related paradigms such as 5G and SDN. Then, it introduces the reader to Vehicular Social Networks, the different metrics and the main differences between VSNs and Online Social Networks. Finally, this survey investigates the related works in the context of VANETs that have demonstrated different architectures to address the different issues in fog computing. Moreover, it provides a categorization of the different approaches and discusses the required metrics in the context of fog and cloud and compares them to Vehicular social networks. A comparison of the relevant related works is discussed along with new research challenges and trends in the domain of VSNs and fog computing.


US-Rule: Discovering Utility-driven Sequential Rules

arXiv.org Artificial Intelligence

Utility-driven mining is an important task in data science and has many applications in real life. High utility sequential pattern mining (HUSPM) is one kind of utility-driven mining. HUSPM aims to discover all sequential patterns with high utility. However, the existing algorithms of HUSPM can not provide an accurate probability to deal with some scenarios for prediction or recommendation. High-utility sequential rule mining (HUSRM) was proposed to discover all sequential rules with high utility and high confidence. There is only one algorithm proposed for HUSRM, which is not enough efficient. In this paper, we propose a faster algorithm, called US-Rule, to efficiently mine high-utility sequential rules. It utilizes rule estimated utility co-occurrence pruning strategy (REUCP) to avoid meaningless computation. To improve the efficiency on dense and long sequence datasets, four tighter upper bounds (LEEU, REEU, LERSU, RERSU) and their corresponding pruning strategies (LEEUP, REEUP, LERSUP, RERSUP) are proposed. Besides, US-Rule proposes rule estimated utility recomputing pruning strategy (REURP) to deal with sparse datasets. At last, a large number of experiments on different datasets compared to the state-of-the-art algorithm demonstrate that US-Rule can achieve better performance in terms of execution time, memory consumption and scalability.


A Brief Overview of Methods to Explain AI (XAI)

#artificialintelligence

I know this topic has been discussed many times. But I recently gave some talks on interpretability (for SCAI and France Innovation) and thought it would be good to include some of my work in this article. The importance of explainability for the decision-making process in machine learning doesn't need to be proved any longer. Users are demanding more explanations, and although there are no uniform and strict definitions of interpretability and explainability, the number of scientific papers explaining artificial intelligence (or XAI) is growing exponentially. As you may know, there are two ways to design an interpretable machine learning process.


An Introduction to Deep Long-Tailed Learning

#artificialintelligence

This survey by Yifan Zhang, Bingyi Kang, Bryan Hooi, Shuicheng Yan and Jiashi Feng covers the following topic in far grater detail and I highly recommend checking it out for a more thorough discussion the ideas discussed in this article. With the massive success of Deep Learning in the field of image recognition comes the need to apply these techniques to solve real-world problems. An issue that arises here, however, is that in real world applications, training samples typically a long-tailed class distribution, where a small portion of classes have massive sample points but the others are associated with only a few samples. Thus, a model can be easily biased towards the head classes, resulting in a poor performance on the tail classes [1]. Many methods to counter such class imbalances in the data have been studied, mostly grouped within 3 categories.