Goto

Collaborating Authors

 Decision Tree Learning


POETREE: Interpretable Policy Learning with Adaptive Decision Trees

arXiv.org Artificial Intelligence

Building models of human decision-making from observed behaviour is critical to better understand, diagnose and support real-world policies such as clinical care. As established policy learning approaches remain focused on imitation performance, they fall short of explaining the demonstrated decision-making process. Policy Extraction through decision Trees (POETREE) is a novel framework for interpretable policy learning, compatible with fully-offline and partially-observable clinical decision environments -- and builds probabilistic tree policies determining physician actions based on patients' observations and medical history. Fully-differentiable tree architectures are grown incrementally during optimization to adapt their complexity to the modelling task, and learn a representation of patient history through recurrence, resulting in decision tree policies that adapt over time with patient information. This policy learning method outperforms the state-of-the-art on real and synthetic medical datasets, both in terms of understanding, quantifying and evaluating observed behaviour as well as in accurately replicating it -- with potential to improve future decision support systems.


On Tackling Explanation Redundancy in Decision Trees

Journal of Artificial Intelligence Research

Decision trees (DTs) epitomize the ideal of interpretability of machine learning (ML) models. The interpretability of decision trees motivates explainability approaches by so-called intrinsic interpretability, and it is at the core of recent proposals for applying interpretable ML models in high-risk applications. The belief in DT interpretability is justified by the fact that explanations for DT predictions are generally expected to be succinct. Indeed, in the case of DTs, explanations correspond to DT paths. Since decision trees are ideally shallow, and so paths contain far fewer features than the total number of features, explanations in DTs are expected to be succinct, and hence interpretable. This paper offers both theoretical and experimental arguments demonstrating that, as long as interpretability of decision trees equates with succinctness of explanations, then decision trees ought not be deemed interpretable. The paper introduces logically rigorous path explanations and path explanation redundancy, and proves that there exist functions for which decision trees must exhibit paths with explanation redundancy that is arbitrarily larger than the actual path explanation. The paper also proves that only a very restricted class of functions can be represented with DTs that exhibit no explanation redundancy. In addition, the paper includes experimental results substantiating that path explanation redundancy is observed ubiquitously in decision trees, including those obtained using different tree learning algorithms, but also in a wide range of publicly available decision trees. The paper also proposes polynomial-time algorithms for eliminating path explanation redundancy, which in practice require negligible time to compute. Thus, these algorithms serve to indirectly attain irreducible, and so succinct, explanations for decision trees. Furthermore, the paper includes novel results related with duality and enumeration of explanations, based on using SAT solvers as witness-producing NP-oracles.


Towards Explaining Autonomy with Verbalised Decision Tree States

arXiv.org Artificial Intelligence

The development of new AUV technology increased the range of tasks that AUVs can tackle and the length of their operations. As a result, AUVs are capable of handling highly complex operations. However, these missions do not fit easily into the traditional method of defining a mission as a series of pre-planned waypoints because it is not possible to know, in advance, everything that might occur during the mission. This results in a gap between the operator's expectations and actual operational performance. Consequently, this can create a diminished level of trust between the operators and AUVs, resulting in unnecessary mission interruptions. To bridge this gap between in-mission robotic behaviours and operators' expectations, this work aims to provide a framework to explain decisions and actions taken by an autonomous vehicle during the mission, in an easy-to-understand manner. Additionally, the objective is to have an autonomy-agnostic system that can be added as an additional layer on top of any autonomy architecture. To make the approach applicable across different autonomous systems equipped with different autonomies, this work decouples the inner workings of the autonomy from the decision points and the resulting executed actions applying Knowledge Distillation. Finally, to present the explanations to the operators in a more natural way, the output of the distilled decision tree is combined with natural language explanations and reported to the operators as sentences. For this reason, an additional step known as Concept2Text Generation is added at the end of the explanation pipeline.


CART (Classification And Regression Tree) in Machine Learning - GeeksforGeeks

#artificialintelligence

CART( Classification And Regression Tree) is a variation of the decision tree algorithm. It can handle both classification and regression tasks. Scikit-Learn uses the Classification And Regression Tree (CART) algorithm to train Decision Trees (also called "growing" trees). CART was first produced by Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone in 1984. CART is a predictive algorithm used in Machine learning and it explains how the target variable's values can be predicted based on other matters.


WeightedSHAP: analyzing and improving Shapley based feature attributions

arXiv.org Artificial Intelligence

Shapley value is a popular approach for measuring the influence of individual features. While Shapley feature attribution is built upon desiderata from game theory, some of its constraints may be less natural in certain machine learning settings, leading to unintuitive model interpretation. In particular, the Shapley value uses the same weight for all marginal contributions -- i.e. it gives the same importance when a large number of other features are given versus when a small number of other features are given. This property can be problematic if larger feature sets are more or less informative than smaller feature sets. Our work performs a rigorous analysis of the potential limitations of Shapley feature attribution. We identify simple settings where the Shapley value is mathematically suboptimal by assigning larger attributions for less influential features. Motivated by this observation, we propose WeightedSHAP, which generalizes the Shapley value and learns which marginal contributions to focus directly from data. On several real-world datasets, we demonstrate that the influential features identified by WeightedSHAP are better able to recapitulate the model's predictions compared to the features identified by the Shapley value.


What is Out Of Bag Error ?

#artificialintelligence

Out Of Bag Error is the number of wrongly classifying OOB sample. Random Forest is extension of decision tree bagging algorithm. Bagging algorithm use multiple algorithm to build model. Constructing many decision trees in random forest algorithm helps model to generalize the and learn data pattern. To select training sets for multiple tress bootstrapping helps.


Automatic Identification and Classification of Share Buybacks and their Effect on Short-, Mid- and Long-Term Returns

arXiv.org Artificial Intelligence

This thesis investigates share buybacks, specifically share buyback announcements. It addresses how to recognize such announcements, the excess return of share buybacks, and the prediction of returns after a share buyback announcement. We illustrate two NLP approaches for the automated detection of share buyback announcements. Even with very small amounts of training data, we can achieve an accuracy of up to 90%. This thesis utilizes these NLP methods to generate a large dataset consisting of 57,155 share buyback announcements. By analyzing this dataset, this thesis aims to show that most companies, which have a share buyback announced are underperforming the MSCI World. A minority of companies, however, significantly outperform the MSCI World. This significant overperformance leads to a net gain when looking at the averages of all companies. If the benchmark index is adjusted for the respective size of the companies, the average overperformance disappears, and the majority underperforms even greater. However, it was found that companies that announce a share buyback with a volume of at least 1% of their market cap, deliver, on average, a significant overperformance, even when using an adjusted benchmark. It was also found that companies that announce share buybacks in times of crisis emerge better than the overall market. Additionally, the generated dataset was used to train 72 machine learning models. Through this, it was able to find many strategies that could achieve an accuracy of up to 77% and generate great excess returns. A variety of performance indicators could be improved across six different time frames and a significant overperformance was identified. This was achieved by training several models for different tasks and time frames as well as combining these different models, generating significant improvement by fusing weak learners, in order to create one strong learner.


Quantifying the role of interest rates, the Dollar and Covid in oil prices

#artificialintelligence

Which are the key determinants of oil prices, and what role do financial factors play in Brent price formation? This paper sheds a new light on these fundamental questions relying on a widely used machine learning technique (random forests, based on 1,000 regression trees). As the article shows, the use of this technique leads to very large gains in oil price forecasting performance. Besides strong forecasting performance, this powerful data-driven method also uncovers how economic and financial variables relate to oil prices. The benchmark model relies on 11 explanatory variables, which are firmly grounded on economic theory and measured on a daily frequency.


Toward Intention Discovery for Early Malice Detection in Bitcoin

arXiv.org Artificial Intelligence

Bitcoin has been subject to illicit activities more often than probably any other financial assets, due to the pseudo-anonymous nature of its transacting entities. An ideal detection model is expected to achieve all the three properties of (I) early detection, (II) good interpretability, and (III) versatility for various illicit activities. However, existing solutions cannot meet all these requirements, as most of them heavily rely on deep learning without satisfying interpretability and are only available for retrospective analysis of a specific illicit type. First, we present asset transfer paths, which aim to describe addresses' early characteristics. Next, with a decision tree based strategy for feature selection and segmentation, we split the entire observation period into different segments and encode each as a segment vector. After clustering all these segment vectors, we get the global status vectors, essentially the basic unit to describe the whole intention. Finally, a hierarchical self-attention predictor predicts the label for the given address in real time. A survival module tells the predictor when to stop and proposes the status sequence, namely intention. % With the type-dependent selection strategy and global status vectors, our model can be applied to detect various illicit activities with strong interpretability. The well-designed predictor and particular loss functions strengthen the model's prediction speed and interpretability one step further. Extensive experiments on three real-world datasets show that our proposed algorithm outperforms state-of-the-art methods. Besides, additional case studies justify our model can not only explain existing illicit patterns but can also find new suspicious characters.


Leak Detection in Natural Gas Pipeline Using Machine Learning Models

arXiv.org Artificial Intelligence

Leak detection in gas pipelines is an important and persistent problem in the Oil and Gas industry. This is particularly important as pipelines are the most common way of transporting natural gas. This research aims to study the ability of data-driven intelligent models to detect small leaks for a natural gas pipeline using basic operational parameters and then compare the intelligent models among themselves using existing performance metrics. This project applies the observer design technique to detect leaks in natural gas pipelines using a regressoclassification hierarchical model where an intelligent model acts as a regressor and a modified logistic regression model acts as a classifier. Five intelligent models (gradient boosting, decision trees, random forest, support vector machine and artificial neural network) are studied in this project using a pipeline data stream of four weeks. The results shows that while support vector machine and artificial neural networks are better regressors than the others, they do not provide the best results in leak detection due to their internal complexities and the volume of data used. The random forest and decision tree models are the most sensitive as they can detect a leak of 0.1% of nominal flow in about 2 hours. All the intelligent models had high reliability with zero false alarm rate in testing phase. The average time to leak detection for all the intelligent models was compared to a real time transient model in literature. The results show that intelligent models perform relatively well in the problem of leak detection. This result suggests that intelligent models could be used alongside a real time transient model to significantly improve leak detection results.