Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping