Feature construction using explanations of individual predictions
Vouk, Boštjan, Guid, Matej, Robnik-Šikonja, Marko
arXiv.org Artificial Intelligence
Feature construction can contribute to comprehensibility and performance of machine learning models. Unfortunately, it usually requires exhaustive search in the attribute space or time-consuming human involvement to generate meaningful features. We propose a novel heuristic approach for reducing the search space based on aggregation of instance-based explanations of predictive models. The proposed Explainable Feature Construction (EFC) methodology identifies groups of co-occurring attributes exposed by popular explanation methods, such as IME and SHAP. We empirically show that reducing the search to these groups significantly reduces the time of feature construction using logical, relational, Cartesian, numerical, and threshold num-of-N and X-of-N constructive operators. An analysis on 10 transparent synthetic datasets shows that EFC effectively identifies informative groups of attributes and constructs relevant features. Using 30 real-world classification datasets, we show significant improvements in classification accuracy for several classifiers and demonstrate the feasibility of the proposed feature construction even for large datasets. Finally, EFC generated interpretable features on a real-world problem from the financial industry, which were confirmed by a domain expert.
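The core idea in the abstract, aggregating per-instance explanations into groups of co-occurring attributes and building features only from those groups, can be illustrated with a minimal sketch. This is not the authors' EFC implementation. It assumes the attribution values (for example from SHAP or IME) are already computed and passed in as plain lists. The function names, the `threshold` and `min_support` parameters, and the toy data are all hypothetical. Only a logical AND operator on binary attributes is shown.

```python
from collections import Counter
from itertools import combinations


def cooccurring_groups(attributions, threshold=0.1, min_support=2):
    """Count attribute groups that are jointly important in the same instance.

    attributions: one row per instance, one attribution value per attribute.
    threshold and min_support are illustrative assumptions, not values
    taken from the paper.
    """
    counts = Counter()
    for row in attributions:
        # Attributes "exposed" by the explanation of this instance.
        exposed = tuple(i for i, a in enumerate(row) if abs(a) >= threshold)
        if len(exposed) >= 2:
            counts[exposed] += 1
    # Keep only groups that recur across enough instances.
    return [(g, c) for g, c in counts.most_common() if c >= min_support]


def construct_conjunctions(X, group):
    """Build a logical AND feature for each attribute pair inside one group."""
    return {
        (i, j): [int(bool(row[i]) and bool(row[j])) for row in X]
        for i, j in combinations(group, 2)
    }


# Made-up attribution values for four instances and four attributes.
attributions = [
    [0.40, 0.35, 0.01, 0.00],
    [0.30, 0.45, 0.02, 0.01],
    [0.00, 0.02, 0.50, 0.03],
    [0.38, 0.41, 0.00, 0.02],
]
# Made-up binary attribute values for the same four instances.
X = [
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
]

groups = cooccurring_groups(attributions)
features = construct_conjunctions(X, groups[0][0])
print(groups)    # [((0, 1), 3)]
print(features)  # {(0, 1): [1, 0, 0, 1]}
```

In this toy example, the search for constructed features is restricted to the single pair (0, 1) instead of all six attribute pairs, which is the kind of search-space reduction the abstract describes.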
Jan-23-2023
- Country:
- Asia > Vietnam (0.04)
- North America > United States
- Massachusetts > Suffolk County > Boston (0.04)
- Europe
- United Kingdom > England (0.04)
- Slovenia > Central Slovenia
- Municipality of Ljubljana > Ljubljana (0.04)
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (0.67)
- Industry:
- Education (0.92)
- Banking & Finance > Credit (0.68)
- Transportation > Air (0.67)
- Leisure & Entertainment > Games (0.46)
- Health & Medicine > Therapeutic Area
- Oncology (0.68)
- Technology:
- Information Technology
- Data Science > Data Mining (1.00)
- Artificial Intelligence
- Cognitive Science > Problem Solving (0.69)
- Representation & Reasoning
- Search (0.87)
- Expert Systems (0.67)
- Machine Learning
- Decision Tree Learning (0.68)
- Neural Networks (0.67)
- Performance Analysis > Accuracy (0.48)
- Statistical Learning > Support Vector Machines (0.46)