S³: Increasing GPU Utilization during Generative Inference for Higher Throughput