FOLD-SE: An Efficient Rule-based Machine Learning Algorithm with Scalable Explainability
–arXiv.org Artificial Intelligence
We present FOLD-SE, an efficient, explainable machine learning algorithm for classification tasks given tabular data containing numerical and categorical values. FOLD-SE generates a set of default rules-essentially a stratified normal logic program-as an (explainable) trained model. Explainability provided by FOLD-SE is scalable, meaning that regardless of the size of the dataset, the number of learned rules and learned literals stay quite small while good accuracy in classification is maintained. A model with smaller number of rules and literals is easier to understand for human beings. FOLD-SE is competitive with state-of-the-art machine learning algorithms such as XGBoost and Multi-Layer Perceptrons (MLP) wrt accuracy of prediction. However, unlike XGBoost and MLP, the FOLD-SE algorithm is explainable. The FOLD-SE algorithm builds upon our earlier work on developing the explainable FOLD-R++ machine learning algorithm for binary classification and inherits all of its positive features. Thus, pre-processing of the dataset, using techniques such as one-hot encoding, is not needed. Like FOLD-R++, FOLD-SE uses prefix sum to speed up computations resulting in FOLD-SE being an order of magnitude faster than XGBoost and MLP in execution speed. The FOLD-SE algorithm outperforms FOLD-R++ as well as other rule-learning algorithms such as RIPPER in efficiency, performance and scalability, especially for large datasets. A major reason for scalable explainability of FOLD-SE is the use of a literal selection heuristics based on Gini Impurity, as opposed to Information Gain used in FOLD-R++. A multi-category classification version of FOLD-SE is also presented.
arXiv.org Artificial Intelligence
Jan-10-2023
- Country:
- Asia > Japan
- Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Europe > United Kingdom
- England
- Cambridgeshire > Cambridge (0.04)
- Greater London > London (0.04)
- England
- North America > United States
- California > San Francisco County
- San Francisco (0.14)
- Colorado > Denver County
- Denver (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Texas (0.04)
- California > San Francisco County
- Oceania > Australia (0.04)
- Asia > Japan
- Genre:
- Research Report (1.00)
- Technology: