PanGu-$\pi$: Enhancing Language Model Architectures via Nonlinearity Compensation