A Benchmark for Interpretability Methods in Deep Neural Networks