 dalex


XAI-based Feature Ensemble for Enhanced Anomaly Detection in Autonomous Driving Systems

Nazat, Sazid, Abdallah, Mustafa

arXiv.org Artificial Intelligence

The rapid advancement of autonomous vehicle (AV) technology has introduced significant challenges in ensuring transportation security and reliability. Traditional AI models for anomaly detection in AVs are often opaque, posing difficulties in understanding and trusting their decision-making processes. This paper proposes a novel feature ensemble framework that integrates multiple Explainable AI (XAI) methods: SHAP, LIME, and DALEX with various AI models to enhance both anomaly detection and interpretability. By fusing top features identified by these XAI methods across six diverse AI models (Decision Trees, Random Forests, Deep Neural Networks, K-Nearest Neighbors, Support Vector Machines, and AdaBoost), the framework creates a robust and comprehensive set of features critical for detecting anomalies. These feature sets, produced by our feature ensemble framework, are evaluated using independent classifiers (CatBoost, Logistic Regression, and LightGBM) to ensure unbiased performance. We evaluated our feature ensemble approach on two popular autonomous driving datasets (VeReMi and Sensor). Our feature ensemble technique demonstrates improved accuracy, robustness, and transparency of AI models, contributing to safer and more trustworthy autonomous driving systems.
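The fusion step described above can be sketched as a simple vote over per-explainer top-feature lists. The feature names, the top-k cutoff, and the minimum-vote rule below are illustrative assumptions, not the paper's exact procedure:

```python
from collections import Counter

def ensemble_features(rankings, top_k=3, min_votes=2):
    """Fuse per-(model, XAI-method) feature rankings into one feature set.

    rankings: lists of feature names, each ordered most- to least-important,
    as an explainer (e.g. SHAP, LIME, or DALEX) would rank them for one model.
    A feature is kept if it lands in the top_k of at least min_votes lists.
    """
    votes = Counter()
    for ranked in rankings:
        votes.update(ranked[:top_k])
    return sorted(f for f, v in votes.items() if v >= min_votes)

# Hypothetical rankings from three explainer/model pairs:
rankings = [
    ["speed", "pos_x", "rssi", "heading"],   # e.g. SHAP on a Random Forest
    ["pos_x", "speed", "heading", "rssi"],   # e.g. LIME on a DNN
    ["speed", "rssi", "pos_x", "heading"],   # e.g. DALEX on AdaBoost
]
print(ensemble_features(rankings))  # -> ['pos_x', 'rssi', 'speed']
```

The fused set can then be fed to the independent downstream classifiers for an unbiased evaluation.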


DALex: Lexicase-like Selection via Diverse Aggregation

Ni, Andrew, Ding, Li, Spector, Lee

arXiv.org Artificial Intelligence

Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are randomly sampled. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its "relaxed" variants, such as epsilon or batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
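The core computation — weighted error sums evaluated as one matrix multiplication — can be sketched as follows. The Gaussian-sampled log-weights passed through a softmax, with the standard deviation playing the role of particularity pressure, are an assumption of this sketch rather than the paper's exact sampling scheme:

```python
import numpy as np

def dalex_select(errors, n_selected, particularity_pressure=20.0, rng=None):
    """DALex-style selection sketch.

    errors: (n_individuals, n_cases) matrix of per-case errors (lower is better).
    For each selection event we sample random case weights and pick the
    individual minimizing the weighted error sum; all events run as one matmul.
    """
    rng = np.random.default_rng(rng)
    n_cases = errors.shape[1]
    # Larger particularity_pressure makes a single case dominate each event,
    # approaching standard lexicase behavior; smaller values relax it.
    logw = rng.normal(0.0, particularity_pressure, size=(n_selected, n_cases))
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)            # softmax per selection event
    scores = w @ errors.T                        # (n_selected, n_individuals)
    return scores.argmin(axis=1)                 # winner index per event

errors = np.array([[0, 1, 1],   # individual 0: best on case 0
                   [1, 0, 1],   # individual 1: best on case 1
                   [1, 1, 0]])  # individual 2: best on case 2
print(dalex_select(errors, n_selected=5, rng=0))
```

Because the loop over selection events collapses into a single `w @ errors.T`, optimized BLAS or GPU matmul kernels can be applied directly, which is where the reported speedups come from.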


Performance is not enough: the story told by a Rashomon quartet

Biecek, Przemyslaw, Baniecki, Hubert, Krzyzinski, Mateusz, Cook, Dianne

arXiv.org Machine Learning

Predictive modelling is often reduced to finding the best model that optimizes a selected performance measure. But what if the second-best model describes the data in a completely different way? What about the third-best? Is it possible that equally effective models describe different relationships in the data? Inspired by Anscombe's quartet, this paper introduces a Rashomon quartet: four models built on a synthetic dataset that have practically identical predictive performance. However, their visualization reveals distinct explanations of the relationship between the input variables and the target variable. The illustrative example aims to encourage the use of visualization to compare predictive models beyond their performance.


GitHub - ModelOriented/DALEX: moDel Agnostic Language for Exploration and eXplanation

#artificialintelligence

An unverified black-box model is a path to failure. The DALEX package X-rays any model, helping to explore and explain its behaviour and to understand how complex models work. The main function explain() creates a wrapper around a predictive model. Wrapped models may then be explored and compared with a collection of local and global explainers. The philosophy behind DALEX explanations is described in the Explanatory Model Analysis e-book.


dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python

Baniecki, Hubert, Kretowicz, Wojciech, Piatyszek, Piotr, Wisniewski, Jakub, Biecek, Przemyslaw

arXiv.org Machine Learning

The black-box nature of complex machine learning models leads to an opaqueness debt phenomenon, inflicting increased risks of discrimination, lack of reproducibility, and deflated performance due to data drift. To manage these risks, good MLOps practices call for better validation of model performance and fairness, higher explainability, and continuous monitoring. The necessity of deeper model transparency arises not only from scientific and social domains, but also from emerging laws and regulations on artificial intelligence. To facilitate the development of responsible machine learning models, we showcase dalex, a Python package which implements a model-agnostic interface for interactive model exploration. It adopts the design crafted through the development of various tools for responsible machine learning; thus, it aims at the unification of the existing solutions. This library's source code and documentation are available under an open license at https://python.drwhy.ai/.
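The model-agnostic principle underlying such an interface — the explainer touches the model only through its predictions — can be illustrated with a minimal permutation-importance sketch. The function names and the loss are illustrative, not dalex's actual API:

```python
import numpy as np

def permutation_importance(predict, X, y, loss, rng=None):
    """Model-agnostic variable importance: the only access to the model is
    through predict(X), so any fitted model can be plugged in. The importance
    of a feature is the increase in loss after shuffling that feature."""
    rng = np.random.default_rng(rng)
    base = loss(y, predict(X))
    drops = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])   # break the feature-target link
        drops.append(loss(y, predict(Xp)) - base)
    return np.array(drops)

mse = lambda y, p: np.mean((y - p) ** 2)
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(500, 3))
y = 3 * X[:, 0] + 0.1 * X[:, 1]                # feature 2 is irrelevant
model = lambda X: 3 * X[:, 0] + 0.1 * X[:, 1]  # stand-in "fitted" black box
imp = permutation_importance(model, X, y, mse, rng=0)
print(imp)  # feature 0 dominates; feature 2 contributes nothing
```

Because nothing here depends on the model's internals, the same routine works for a tree ensemble, a neural network, or a plain function, which is what makes a uniform exploration interface possible.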


tensorflow + dalex = :) , or how to explain a TensorFlow model - KDnuggets

#artificialintelligence

I will showcase how straightforward and convenient it is to explain a tensorflow predictive model using the dalex Python package. An introduction to this topic can be found in Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models. For this example, we will use the data from the World Happiness Report and predict the happiness score according to economic production, social support, etc., for any given country. Let's first train a basic tensorflow model incorporating the experimental normalization layer for a better fit. The next step is to create a dalex Explainer object, which takes the model and data as input.
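The wrapper pattern behind that Explainer object can be sketched without installing tensorflow or dalex. This is a hypothetical stand-in only — the real dalex.Explainer has a much richer interface — but it shows what the wrapper holds: the model, validation data, and a predict function adapter (Keras models return shape (n, 1), so predictions are flattened for downstream explainers):

```python
class Explainer:
    """Minimal stand-in for the wrapper pattern dalex uses: it stores the
    model together with data and a predict function, so every downstream
    explainer can treat the model as a black box. (Sketch only.)"""
    def __init__(self, model, data, y, predict_function=None, label="model"):
        self.model = model
        self.data = data
        self.y = y
        # Default adapter: flatten (n, 1) model output to a flat list of n.
        self.predict_function = predict_function or (
            lambda m, X: [row[0] for row in m(X)])
        self.label = label

    def predict(self, X):
        return self.predict_function(self.model, X)

# Hypothetical stand-in for a trained tensorflow regressor:
fake_model = lambda X: [[sum(row) / len(row)] for row in X]
exp = Explainer(fake_model, data=[[1, 2], [3, 4]], y=[1.5, 3.5])
print(exp.predict([[2, 4]]))  # -> [3.0]
```

Once wrapped, local and global explainers only ever call `exp.predict`, which is why the same workflow applies to a tensorflow model as to any other.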


How to easily check if your ML model is fair?

#artificialintelligence

We live in a world that is getting more divided each day. In some parts of the world, the differences and inequalities between races, ethnicities, and sometimes sexes are aggravating. The data we use for modeling is, for the most part, a reflection of the world it derives from. And the world can be biased, so the data, and therefore the model, will likely reflect that. We propose a way in which ML engineers can easily check whether their model is biased.


"Please, explain." Interpretability of machine learning models

#artificialintelligence

In February 2019, the Polish government added an amendment to a banking law that gives a customer the right to receive an explanation in case of a negative credit decision. This means that a bank needs to be able to explain why a loan wasn't granted if the decision process was automatic. In October 2018, world headlines reported on an Amazon AI recruiting tool that favored men. Amazon's model was trained on biased data that were skewed towards male candidates. It had built rules that penalized résumés that included the word "women's".


Interpretable Machine Learning Algorithms with Dalex and H2O

#artificialintelligence

As advanced machine learning algorithms are gaining acceptance across many organizations and domains, machine learning interpretability is growing in importance to help extract insight and clarity regarding how these algorithms are performing and why one prediction is made over another. There are many methodologies to interpret machine learning results. However, some recent R packages that focus purely on ML interpretability, agnostic to any specific ML algorithm, are gaining popularity. One such package is DALEX, and this post covers what this package does (and does not do) so that you can determine if it should become part of your preferred machine learning toolbox. We implement machine learning models using H2O, a high-performance ML toolkit. Let's see how DALEX and H2O work together to get the best of both worlds with high performance and feature explainability!


DALEX: explainers for complex predictive models

Biecek, Przemyslaw

arXiv.org Artificial Intelligence

Predictive modeling is invaded by elastic, yet complex methods such as neural networks or ensembles (model stacking, boosting, or bagging). Such methods are usually described by a large number of parameters or hyperparameters - a price that one needs to pay for elasticity. The very number of parameters makes models hard to understand. This paper describes a consistent collection of explainers for predictive models, a.k.a. black boxes. Each explainer is a technique for exploration of a black-box model. The presented approaches are model-agnostic, which means that they extract useful information from any predictive method regardless of its internal structure. Each explainer is linked with a specific aspect of a model. Some are useful in decomposing predictions, some serve better in understanding performance, while others are useful in understanding importance and conditional responses of a particular variable. Every explainer presented in this paper works for a single model or for a collection of models. In the latter case, models can be compared against each other. Such comparison helps to find strengths and weaknesses of different approaches and gives additional possibilities for model validation. The presented explainers are implemented in the DALEX package for R. They are based on a uniform, standardized grammar of model exploration which may be easily extended. The current implementation supports the most popular frameworks for classification and regression.
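The "conditional response of a particular variable" mentioned above is what DALEX calls a ceteris paribus profile: vary one feature over a grid for a single observation while holding the others fixed. A minimal sketch of that idea (not the package's implementation, and with a stand-in model):

```python
import numpy as np

def ceteris_paribus(predict, obs, feature, grid):
    """Conditional-response profile for one observation: replicate the
    observation along a grid of values for one feature, keep everything
    else fixed, and record the model's predictions."""
    X = np.tile(obs, (len(grid), 1))
    X[:, feature] = grid
    return predict(X)

model = lambda X: X[:, 0] ** 2 + X[:, 1]      # stand-in for a black box
obs = np.array([1.0, 5.0])                     # the observation to explain
grid = np.linspace(-2, 2, 5)                   # [-2, -1, 0, 1, 2]
profile = ceteris_paribus(model, obs, feature=0, grid=grid)
print(profile)  # -> [9. 6. 5. 6. 9.]: a quadratic response to feature 0
```

Plotting such profiles for several models side by side is exactly the kind of model comparison the paper describes: identical performance can hide very different conditional responses.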