Transformer Learns Optimal Variable Selection in Group-Sparse Classification