Addressing Both Statistical and Causal Gender Fairness in NLP Models