Double Trouble: How to not explain a text classifier's decisions using counterfactuals synthesized by masked language models?