Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models