Understanding and Optimizing Multi-Stage AI Inference Pipelines