Sparse Iso-FLOP Transformations for Maximizing Training Efficiency