AI Accelerators for Large Language Model Inference: Architecture Analysis and Scaling Strategies