SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM