Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling