Accelerating Transformer Inference and Training with 2:4 Activation Sparsity