Inference-Friendly Models With MixAttention