Learning Guarantee of Reward Modeling Using Deep Neural Networks