Transformers as Measure-Theoretic Associative Memory: A Statistical Perspective and Minimax Optimality