Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs