Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems