Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization