Max-Margin Token Selection in Attention Mechanism