You can remove GPT-2's LayerNorm by fine-tuning
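To make the claim concrete, here is a minimal sketch of what "removing" LayerNorm could mean. It is not taken from the text above: the function `fixed_scale_norm` and the constant `fixed_std` are hypothetical names, and the sketch assumes that the input-dependent division by the standard deviation is swapped for division by a constant, with fine-tuning then used to recover the lost performance. The fine-tuning step itself is not shown.

```python
import math


def layer_norm(x, gamma, beta, eps=1e-5):
    """Standard LayerNorm over a single activation vector (a list of floats)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    std = math.sqrt(var + eps)  # depends on x, which makes the op nonlinear
    return [g * (v - mean) / std + b for v, g, b in zip(x, gamma, beta)]


def fixed_scale_norm(x, gamma, beta, fixed_std):
    """Hypothetical replacement: center, then divide by a constant.

    Because fixed_std does not depend on x, this is an affine map of x.
    """
    mean = sum(x) / len(x)
    return [g * (v - mean) / fixed_std + b for v, g, b in zip(x, gamma, beta)]


x = [1.0, 2.0, 3.0, 4.0]
gamma = [1.0] * len(x)
beta = [0.0] * len(x)
doubled = [2.0 * v for v in x]

# LayerNorm is nearly scale-invariant: doubling the input barely moves the output.
ln_gap = max(
    abs(a - b)
    for a, b in zip(layer_norm(x, gamma, beta), layer_norm(doubled, gamma, beta))
)

# With beta = 0 the fixed-scale version is linear: doubling the input doubles the output.
fixed_std = 1.0  # assumed constant; in practice it would be estimated from data
lin_gap = max(
    abs(a - 2.0 * b)
    for a, b in zip(
        fixed_scale_norm(doubled, gamma, beta, fixed_std),
        fixed_scale_norm(x, gamma, beta, fixed_std),
    )
)

print("LayerNorm change under 2x input:", ln_gap)
print("Fixed-scale deviation from linearity:", lin_gap)
```

The contrast is the point: the standard version rescales every input to the same norm, while the fixed-scale version behaves like an ordinary linear layer that can be folded into neighbouring weights.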