Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models