Towards a More Reliable Interpretation of Machine Learning Outputs for Safety-Critical Systems using Feature Importance Fusion Machine Learning

When machine learning supports decision-making in safety-critical systems, it is important to verify and understand the reasons why a particular output is produced. Although feature importance calculation approaches assist in interpretation, there is a lack of consensus regarding how features' importance is quantified, which makes the explanations offered for the outcomes mostly unreliable. A possible solution to address the lack of agreement is to combine the results from multiple feature importance quantifiers to reduce the variance of estimates. Our hypothesis is that this will lead to more robust and trustworthy interpretations of the contribution of each feature to machine learning predictions. To help test this hypothesis, we propose an extensible Framework divided into four main parts: (i) traditional data pre-processing and preparation for predictive machine learning models; (ii) predictive machine learning; (iii) feature importance quantification; and (iv) feature importance decision fusion using an ensemble strategy. We also introduce a novel fusion metric and compare it to the state-of-the-art. Our approach is tested on synthetic data, where the ground truth is known. We compare different fusion approaches and their results for both training and test sets. We also investigate how different characteristics within the datasets affect the feature importance ensembles studied. Results show that our feature importance ensemble Framework overall produces 15% less feature importance error compared to existing methods. Additionally, results reveal that different levels of noise in the datasets do not affect the feature importance ensembles' ability to accurately quantify feature importance, whereas the feature importance quantification error increases with the number of features and number of orthogonal informative features.
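The idea of combining several feature importance quantifiers can be illustrated with a minimal sketch. It assumes each quantifier returns one score per feature, rescales every score vector to sum to 1, and fuses them feature by feature with the mean or the median. The quantifier names and scores are hypothetical, and the mean and median are generic stand-ins, not the novel fusion metric the abstract mentions.

```python
from statistics import mean, median


def normalise(scores):
    """Scale absolute importance scores so that they sum to 1."""
    magnitudes = [abs(s) for s in scores]
    total = sum(magnitudes)
    if total == 0:
        # No quantifier signal at all: report zero importance everywhere.
        return [0.0] * len(scores)
    return [m / total for m in magnitudes]


def fuse(importances, combine=mean):
    """Fuse per-quantifier score vectors into one vector, feature by feature.

    `importances` maps a quantifier name to a list of per-feature scores.
    `combine` is any function reducing a sequence of numbers to one number.
    """
    normalised = [normalise(scores) for scores in importances.values()]
    return [combine(column) for column in zip(*normalised)]


# Hypothetical scores for three features from three hypothetical quantifiers.
scores = {
    "quantifier_a": [0.6, 0.3, 0.1],
    "quantifier_b": [5.0, 4.0, 1.0],
    "quantifier_c": [0.2, 0.7, 0.1],
}

fused_mean = fuse(scores, combine=mean)
fused_median = fuse(scores, combine=median)
print(fused_mean)
print(fused_median)
```

The median is less sensitive than the mean to a single quantifier that disagrees with the others, which is one reason to compare several fusion rules.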

Better Model Selection with a new Definition of Feature Importance Machine Learning

Feature importance aims at measuring how crucial each input feature is for model prediction. It is widely used in feature engineering, model selection and explainable artificial intelligence (XAI). In this paper, we propose a new tree-model explanation approach for model selection. Our novel concept leverages the Coefficient of Variation of a feature weight (measured in terms of the contribution of the feature to the prediction) to capture the dispersion of importance over samples. Extensive experimental results show that our novel feature explanation outperforms the general cross-validation method in model selection in terms of both time efficiency and accuracy.
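The Coefficient of Variation is the standard deviation divided by the mean. The sketch below computes it over per-sample contributions of one feature, assuming those contributions are already available as a list of numbers. The function name, the use of the population standard deviation, and the example values are assumptions for illustration; the abstract does not give the paper's exact formula or say how the value enters model selection.

```python
from statistics import mean, pstdev


def coefficient_of_variation(contributions):
    """Return std / |mean| of one feature's per-sample contributions.

    Uses the population standard deviation (an assumption of this sketch).
    A zero mean makes the ratio undefined, so infinity is returned.
    """
    mu = mean(contributions)
    if mu == 0:
        return float("inf")
    return pstdev(contributions) / abs(mu)


# Hypothetical per-sample contributions of two features.
stable_feature = [0.50, 0.52, 0.48, 0.50]   # similar weight on every sample
erratic_feature = [0.05, 1.20, 0.10, 0.65]  # weight varies strongly

print(coefficient_of_variation(stable_feature))
print(coefficient_of_variation(erratic_feature))
```

A low value means the feature contributes about equally on every sample; a high value means its importance is concentrated on a subset of samples.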

Interpretable Machine Learning with an Ensemble of Gradient Boosting Machines Machine Learning

A method for the local and global interpretation of a black-box model on the basis of the well-known generalized additive models is proposed. It can be viewed as an extension or a modification of the algorithm using the neural additive model. The method is based on using an ensemble of gradient boosting machines (GBMs) such that each GBM is learned on a single feature and produces a shape function of the feature. The ensemble is composed as a weighted sum of separate GBMs, resulting in a weighted sum of shape functions which form the generalized additive model. GBMs are built in parallel using randomized decision trees of depth 1, which provide a very simple architecture. Weights of GBMs as well as features are computed in each iteration of boosting by using the Lasso method and then updated by means of a specific smoothing procedure. In contrast to the neural additive model, the method provides weights of features in explicit form, and it is simple to train. Numerous numerical experiments with an algorithm implementing the proposed method on synthetic and real datasets demonstrate its efficiency and properties for local and global interpretation.
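The link between per-feature boosting and shape functions can be shown with a simplified sketch: an additive model in which each feature has its own sequence of depth-1 trees (stumps) fitted to the current residuals under squared loss. This is not the method of the abstract. It cycles through features instead of building GBMs in parallel, uses deterministic rather than randomized stumps, and omits the Lasso weights and the smoothing step. All function names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best depth-1 split of one feature under squared error.

    Assumes the feature takes at least two distinct values.
    Returns (threshold, left_value, right_value).
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]


def fit_additive(X, y, rounds=50, lr=0.3):
    """Fit intercept + sum of per-feature stump sequences by boosting."""
    n_features = len(X[0])
    intercept = sum(y) / len(y)
    stumps = [[] for _ in range(n_features)]
    pred = [intercept] * len(y)
    for _ in range(rounds):
        for j in range(n_features):
            col = [row[j] for row in X]
            resid = [yi - pi for yi, pi in zip(y, pred)]
            t, lv, rv = fit_stump(col, resid)
            stumps[j].append((t, lr * lv, lr * rv))  # shrink each step by lr
            pred = [p + (lr * lv if x <= t else lr * rv) for p, x in zip(pred, col)]
    return intercept, stumps


def shape(feature_stumps, x):
    """Shape function of one feature: the sum of its stump outputs at x."""
    return sum(lv if x <= t else rv for t, lv, rv in feature_stumps)


def predict(model, row):
    intercept, stumps = model
    return intercept + sum(shape(s, x) for s, x in zip(stumps, row))


# Toy data: the target is a step in feature 0; feature 1 is irrelevant.
X = [[i, (i * 7) % 5] for i in range(10)]
y = [1.0 if i >= 5 else 0.0 for i in range(10)]
model = fit_additive(X, y)
print(predict(model, [7, 0]))
print(shape(model[1][0], 2), shape(model[1][0], 7))
```

Because the prediction is a sum of one-dimensional shape functions, each feature's effect can be read off by evaluating its shape function alone, which is what makes a generalized additive model interpretable.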

Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics Machine Learning

Interpretability is becoming an active research topic as machine learning (ML) models are more widely used to make critical decisions. Tabular data is one of the most commonly used modes of data in diverse applications such as healthcare and finance. Many of the existing interpretability methods used for tabular data only report feature-importance scores -- either locally (per example) or globally (per model) -- but they do not provide interpretation or visualization of how the features interact. We address this limitation by introducing Feature Vectors, a new global interpretability method designed for tabular datasets. In addition to providing feature importance, Feature Vectors discovers the inherent semantic relationship among features via an intuitive feature visualization technique. Our systematic experiments demonstrate the empirical utility of this new method by applying it to several real-world datasets. We further provide an easy-to-use Python package for Feature Vectors.

An Explainable-AI approach for Diagnosis of COVID-19 using MALDI-ToF Mass Spectrometry Artificial Intelligence

The novel severe acute respiratory syndrome coronavirus type-2 (SARS-CoV-2) caused a global pandemic that has taken more than 4.5 million lives and severely affected the global economy. To curb the spread of the virus, accurate, cost-effective, and quick testing for large populations is exceedingly important in order to identify, isolate, and treat infected people. Current testing methods commonly use PCR (Polymerase Chain Reaction) based equipment that has limitations on throughput, cost-effectiveness, and simplicity of procedure, which creates a compelling need for developing additional coronavirus disease-2019 (COVID-19) testing mechanisms that are highly sensitive, rapid, trustworthy, and convenient to use by the public. We propose a COVID-19 testing method using artificial intelligence (AI) techniques on MALDI-ToF (matrix-assisted laser desorption/ionization time-of-flight) data extracted from 152 human gargle samples (60 COVID-19 positive tests and 92 COVID-19 negative tests). Our AI-based approach leverages explainable-AI (X-AI) methods to explain the decision rules behind the predictive algorithm both on a local (per-sample) and global (all-samples) basis to make the AI model more trustworthy. Finally, we evaluated our proposed method using a 70%-30% train-test-split strategy and achieved a training accuracy of 86.79% and a testing accuracy of 91.30%.
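The reported accuracies can be sanity-checked against the sample counts. The sketch below assumes the 70%-30% split of the 152 samples yields 106 training and 46 test samples (the abstract does not state the exact counts) and searches for the number of correct predictions that reproduces each reported percentage. The helper name is hypothetical.

```python
def counts_matching(n_samples, reported_pct):
    """Numbers of correct predictions whose accuracy rounds to reported_pct."""
    return [
        k for k in range(n_samples + 1)
        if round(100 * k / n_samples, 2) == reported_pct
    ]


n_total = 152                      # 60 positive + 92 negative samples
n_test = round(0.30 * n_total)     # assumed rounding of 45.6 to 46
n_train = n_total - n_test         # 106

print(n_train, n_test)                   # 106 46
print(counts_matching(n_train, 86.79))   # [92]
print(counts_matching(n_test, 91.30))    # [42]
```

Under this assumption, the figures correspond to 92 of 106 training samples and 42 of 46 test samples classified correctly, so one additional test error would move the test accuracy by roughly 2.2 percentage points.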