A Learning Theoretic Perspective on Local Explainability
Jeffrey Li, Vaishnavh Nagarajan, Gregory Plumb, Ameet Talwalkar
In this paper, we explore connections between interpretable machine learning and learning theory through the lens of local approximation explanations. First, we tackle the traditional problem of performance generalization and bound the test-time accuracy of a model using a notion of how locally explainable it is. Second, we explore the novel problem of explanation generalization, which is an important concern for a growing class of finite-sample-based local approximation explanations. Finally, we validate our theoretical results empirically and show that they reflect what can be seen in practice.

There has been growing interest in interpretable machine learning, which seeks to help people understand their models. While interpretable machine learning encompasses a wide range of problems, it is a fairly uncontroversial hypothesis that there exists a tradeoff between a model's complexity and general notions of interpretability. This hypothesis suggests a seemingly natural connection to the field of learning theory, which has thoroughly explored relationships between a function class's complexity and generalization. However, formal connections between interpretability and learning theory remain relatively unstudied.
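For concreteness, a finite-sample local approximation explanation of the kind referred to above can be sketched as fitting a simple surrogate to the model's predictions in a small neighborhood of a query point, and measuring how well it fits. The sketch below is a minimal LIME-style illustration, not the paper's exact construction: the Gaussian sampling scheme, the neighborhood width `sigma`, the sample count `n_samples`, and the black-box callable `model` are all illustrative assumptions.

```python
import numpy as np

def local_linear_explanation(model, x0, sigma=0.1, n_samples=500, seed=None):
    """Fit an affine surrogate g(x) = w.x + b to `model` near `x0`.

    Illustrative sketch only: the neighborhood distribution and the
    fidelity measure here are assumptions, not the paper's definitions.
    """
    rng = np.random.default_rng(seed)
    d = x0.shape[0]

    # Sample points in a Gaussian neighborhood of x0 and query the black box.
    X = x0 + sigma * rng.standard_normal((n_samples, d))
    y = model(X)

    # Least-squares fit of the affine surrogate on the sampled neighborhood.
    A = np.hstack([X, np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    w, b = coef[:-1], coef[-1]

    # Finite-sample local fidelity: mean squared gap between the model and
    # its surrogate, one way to quantify how "locally explainable" f is at x0.
    mse = float(np.mean((y - (X @ w + b)) ** 2))
    return w, b, mse
```

The quantities of interest in the paper (how well such explanations generalize beyond the sampled neighborhood, and how local explainability relates to test-time accuracy) concern exactly this kind of finite-sample surrogate and its fidelity.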
Nov-2-2020