Learning to Explain: An Information-Theoretic Perspective on Model Interpretation

Chen, Jianbo, Song, Le, Wainwright, Martin J., Jordan, Michael I.

arXiv.org Artificial Intelligence 

Interpretability is an extremely important criterion when a machine learning model is applied in areas such as medicine, financial markets, and criminal justice (see, e.g., the discussion paper by Lipton [18], as well as references therein). Many complex models, such as random forests, kernel methods, and deep neural networks, have been developed and employed to optimize prediction accuracy, which can compromise their ease of interpretation. In this paper, we focus on instancewise feature selection as a specific approach to model interpretation. Given a machine learning model, instancewise feature selection asks for an importance score for each feature on the prediction of a given instance, and the relative importance of each feature is allowed to vary across instances. Thus, the importance scores can act as an explanation for the specific instance, indicating which features are key for the model's prediction on that instance.
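To make the notion of instancewise importance concrete, here is a minimal sketch in which per-feature scores genuinely differ across instances. It uses a toy model with a feature interaction and scores features by gradient magnitude; note this scoring rule is only an illustrative stand-in and is not the mutual-information-based method proposed in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy model with a feature interaction: f(x) = sigmoid(x0 * x1 + x2).
# Because of the x0 * x1 term, the sensitivity to x0 depends on x1
# (and vice versa), so importance naturally varies across instances.
def model(x):
    return sigmoid(x[0] * x[1] + x[2])

def importance(x):
    """Instancewise importance scores: |df/dx_i| evaluated at x."""
    p = model(x)
    # Gradient of sigmoid(x0*x1 + x2) w.r.t. (x0, x1, x2).
    grad = np.array([x[1], x[0], 1.0]) * p * (1.0 - p)
    return np.abs(grad)

# Two instances with the feature values of x0 and x1 swapped.
x_a = np.array([2.0, 0.1, 0.5])
x_b = np.array([0.1, 2.0, 0.5])

# The most important feature differs per instance: feature 1 for x_a,
# feature 0 for x_b.
print(np.argmax(importance(x_a)))  # -> 1
print(np.argmax(importance(x_b)))  # -> 0
```

The same global model thus yields different feature rankings on different inputs, which is exactly the kind of instance-specific explanation that instancewise feature selection targets.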
