A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation