Transforming Feature Space to Interpret Machine Learning Models
Interpreting complex nonlinear machine-learning models is an inherently difficult task. A common approach is the post-hoc analysis of black-box models for dataset-level interpretation (Murdoch et al. 2019) using model-agnostic techniques such as permutation-based variable importance and graphical displays such as partial dependence plots, which visualize main effects while integrating over the remaining dimensions (Molnar, Casalicchio, and Bischl 2020). These tools are so far limited to displaying the relationship between the response and one (or sometimes two) predictors while attempting to control for the influence of the other predictors. This can be rather unsatisfactory when dealing with a large number of highly correlated predictors, which are often semantically grouped. While the literature on explainable machine learning has often focused on dependencies affecting individual features, e.g. by introducing conditional diagnostics (Strobl et al. 2008; Molnar, König, Bischl, et al. 2020), no practical solutions are yet available for model interpretation in high-dimensional feature spaces with strongly dependent features (Molnar, Casalicchio, and Bischl 2020; Molnar, König, Herbinger, et al. 2020). Such situations routinely occur in environmental remote sensing and in other geographical and ecological analyses (Landgrebe 2002; Zortea, Haertel, and Clarke 2007), which motivated the present proposal to enhance existing model interpretation tools by offering a new, transformed perspective. For example, vegetation 'greenness' as a measure of photosynthetic activity is often used to classify land cover or land use from satellite imagery acquired at multiple time points throughout the growing season (Peña and Brenning 2015; Peña, Liao, and Brenning 2017). Spectral reflectances of equivalent spectral bands (the features) are usually strongly correlated within the same phenological stage because vegetation characteristics vary gradually.
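To make the two model-agnostic tools mentioned above concrete, the following is a minimal Python sketch of permutation-based variable importance and of a one-dimensional partial dependence curve. It is an illustration under stated assumptions, not the implementation used in this work: the function `model`, the synthetic data from `make_data`, and all other names are hypothetical, and the "black box" is a simple closed-form function standing in for any fitted model. Two of the three features are deliberately generated from a shared latent variable so that they are strongly correlated, mimicking the situation described above.

```python
import random

def model(row):
    # Hypothetical "black box": stands in for any fitted model's predict function.
    x1, x2, x3 = row
    return 2.0 * x1 + 2.0 * x2 + 0.5 * x3 ** 2

def make_data(n=500, seed=0):
    # Synthetic data (assumption): x1 and x2 share a latent variable z,
    # so they are strongly correlated; x3 is independent of both.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x1 = z + 0.1 * rng.gauss(0, 1)
        x2 = z + 0.1 * rng.gauss(0, 1)
        x3 = rng.gauss(0, 1)
        rows.append((x1, x2, x3))
    y = [model(r) + 0.1 * rng.gauss(0, 1) for r in rows]
    return rows, y

def mse(rows, y):
    # Mean squared error of the model on the given data.
    return sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

def permutation_importance(rows, y, j, seed=1):
    # Increase in MSE after shuffling feature j across observations.
    # Shuffling breaks the dependence between feature j and the other
    # features, so for correlated features the model is evaluated on
    # combinations that may not occur in the data.
    rng = random.Random(seed)
    col = [r[j] for r in rows]
    rng.shuffle(col)
    permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, col)]
    return mse(permuted, y) - mse(rows, y)

def partial_dependence(rows, j, grid):
    # For each grid value, fix feature j at that value for every
    # observation and average the predictions over the remaining features.
    curve = []
    for g in grid:
        preds = [model(r[:j] + (g,) + r[j + 1:]) for r in rows]
        curve.append(sum(preds) / len(preds))
    return curve

rows, y = make_data()
importances = [permutation_importance(rows, y, j) for j in range(3)]
grid = [-1.0, 0.0, 1.0]
pd_x1 = partial_dependence(rows, 0, grid)
print("permutation importance per feature:", importances)
print("partial dependence of x1 on grid:", list(zip(grid, pd_x1)))
```

Both diagnostics treat one feature at a time. In this sketch, the importance and the partial dependence curve for `x1` are computed as if `x1` could vary independently of `x2`, which is exactly the limitation that becomes problematic with many strongly correlated, semantically grouped predictors.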
Apr-9-2021