Causal Inference Tools for a Better Evaluation of Machine Learning