How Transformers Implement Induction Heads: Approximation and Optimization Analysis