Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts