Beyond Scalar Reward Model: Learning Generative Judge from Preference Data