Improving uplift model evaluation on RCT data