The Price of Interpretability
Bertsimas, Dimitris; Delarue, Arthur; Jaillet, Patrick; Martin, Sebastien
When quantitative models are used to support decision-making on complex and important topics, understanding a model's "reasoning" can increase trust in its predictions, expose hidden biases, or reduce vulnerability to adversarial attacks. However, the concept of interpretability remains loosely defined and application-specific. In this paper, we introduce a mathematical framework in which machine learning models are constructed in a sequence of interpretable steps. We show that for a variety of models, a natural choice of interpretable steps recovers standard interpretability proxies (e.g., sparsity in linear models). We then generalize these proxies to yield a parametrized family of consistent measures of model interpretability. This formal definition allows us to quantify the "price" of interpretability, i.e., the tradeoff with predictive accuracy. We demonstrate practical algorithms to apply our framework on real and synthetic datasets.
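As a rough illustration of the tradeoff the abstract describes, the sketch below uses the standard sparsity proxy for linear models: for each number of allowed features k, it finds the best k-feature least-squares fit on a small synthetic dataset and records the training error. This is not the authors' framework or algorithm; the data, the absence of an intercept, and the helper names (`solve`, `mse_for_support`, `best_subset_path`) are all assumptions made for the example.

```python
import itertools
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def mse_for_support(X, y, support):
    """Fit least squares (no intercept) on the columns in `support`; return training MSE."""
    cols = list(support)
    if not cols:
        # The empty model predicts 0 for every sample.
        return sum(v * v for v in y) / len(y)
    # Normal equations restricted to the chosen columns: (Xs^T Xs) w = Xs^T y
    gram = [[sum(row[p] * row[q] for row in X) for q in cols] for p in cols]
    rhs = [sum(row[p] * target for row, target in zip(X, y)) for p in cols]
    w = solve(gram, rhs)
    residuals = [
        target - sum(wj * row[c] for wj, c in zip(w, cols))
        for row, target in zip(X, y)
    ]
    return sum(r * r for r in residuals) / len(y)


def best_subset_path(X, y):
    """For k = 0..d, return the lowest training MSE achievable with exactly k features."""
    d = len(X[0])
    return [
        min(mse_for_support(X, y, s) for s in itertools.combinations(range(d), k))
        for k in range(d + 1)
    ]


# Synthetic data (an assumption for this sketch): 5 features, only 3 of them informative.
random.seed(0)
true_w = [3.0, -2.0, 0.5, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in true_w] for _ in range(60)]
y = [sum(w * v for w, v in zip(true_w, row)) + random.gauss(0, 0.1) for row in X]

path = best_subset_path(X, y)
for k, err in enumerate(path):
    print(f"features allowed: {k}  best training MSE: {err:.4f}")
```

Reading the printed errors from k = 0 upward gives one crude view of a "price": how much training error is given up by restricting the model to fewer features. Exhaustive subset search is only practical for very small feature counts, and training error is used here purely for simplicity.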
Jul-8-2019
- Genre:
- Research Report (0.50)
- Industry:
- Education (0.68)
- Government (0.87)
- Health & Medicine > Therapeutic Area (0.48)
- Information Technology > Security & Privacy (0.66)
- Technology: