Unraveling the Gradient Descent Dynamics of Transformers