Memorization Capacity of Multi-Head Attention in Transformers