SRMIR: Shadow Reward Models Based on Introspective Reasoning for LLM Alignment