Towards Pareto Optimal Throughput in Small Language Model Serving